Introduction

Blended learning is a pedagogical approach that combines face-to-face teaching with information technology [1,2,3]. Over the last decade, blended learning programs (BLPs) have been widely implemented in institutions of higher education worldwide, and particularly in the field of health professions education (HPE) [4,5,6,7]. In light of the COVID-19 global pandemic, institutions are ramping up their use of BLPs or transitioning to blended and remote learning formats if they have not already done so [8,9,10]. Institutions of higher education are currently discussing the potential of BLPs and the need to keep and refine them even after this pandemic dissipates [11, 12].

Several studies indicate that BLPs are highly effective in providing opportunities for meaningful learning as they enable learners to tailor their educational experiences according to their needs and objectives [1, 5, 13,14,15,16,17,18,19]. BLPs enable learners to control the content, sequence, pace, and time of their learning [5, 18]. Concurrently, BLPs empower educators to effectively guide and monitor learner progress through a learning management system (LMS). These systems allow educators to accurately identify where learners are in relation to the course content and identify potential issues learners may have while progressing through the course [2, 6, 7, 13, 14, 20, 21]. Furthermore, BLPs provide a cost-saving potential for educational institutions in the long run [22]. These advantages of BLPs make them well suited for adult learning [2, 18, 23], especially amidst the current global state in which learners are unable to attend traditional classrooms [11, 12].

However, difficulties arise in adopting BLPs into educational systems [24, 25]. Implementation of BLPs requires a large initial investment in faculty training, time, and money [18]. It appears necessary to rigorously evaluate both the technological platforms and face-to-face educational aspects of BLPs prior to their widespread implementation [1, 18]. Here, usability appears to be one of the most important dimensions of a BLP that needs to be considered and evaluated [26,27,28].

Usability is a multidimensional concept defined by the International Organization for Standardization (ISO) as the “extent to which a system, product or service can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use” [29]. Usability is often understood simply as “the ease-of-use” of a technology or technological system; however, the ISO clearly stresses that this idea of usability does not reflect the comprehensive nature of this multidimensional concept.

The ISO and other scholars indicate that usability has three major components: effectiveness, efficiency, and satisfaction [27,28,29,30,31,32,33,34]. By measuring these components, creators of educational programs can understand whether their program is well designed and well received by users [15, 26], whether it facilitates learning [22, 26], and how it can be improved in the future [20]. Also, the ISO framework indicates that usability can be applied to both the technological (i.e., e-learning platform) and service (i.e., face-to-face component and overall content) aspects of a system, thus making it ideal for the evaluation of BLPs [29,30,31].

Although usability is a highly researched and heavily defined concept in the field of integrated technology [35, 36], its adoption in relation to BLPs appears unclear [26, 34, 37,38,39]. To date, no study addressing the conceptualization and evaluation of usability in BLPs within HPE has been conducted. As such, the purpose of this study was to map current knowledge about and develop a foundational understanding of how usability has been conceptualized and evaluated in the context of BLPs within HPE.

Method

Scoping Review

A scoping review guided by Arksey and O’Malley’s five-stage framework was conducted iteratively over a 2-year period to identify all relevant studies published between August 6, 1991, and August 4, 2020 [40]. The PRISMA-ScR checklist was also consulted [41]. This methodology was used because it is suited to answering broad and exploratory research questions that focus on examining the extent and nature of a specified body of research and on identifying existing gaps in the literature [40,41,42,43,44]. Note that this review methodology does not seek to evaluate the quality of evidence, which is a task more akin to the systematic review methodology [40].

Academic Librarian Involvement

The review question, search strategy, and eligibility criteria pertaining to this review were developed in collaboration with three academic librarians at McGill University: one is an expert in usability, a second is an expert in conducting literature searches on the concepts of medical and health professions education, and a third is an expert in conducting literature searches on the concepts of family medicine and primary care. A fourth librarian at McGill University assisted in guiding the scoping review process.

Step 1: Identify the Research Question

How has the concept of usability been defined and evaluated in BLPs within HPE?

Step 2: Identifying Relevant Studies

Scopus and ERIC were searched [45, 46]. Under the guidance of the librarians, it was decided that searching these two databases with a broad strategy would be sufficient for answering the research question. This is because Scopus is one of the largest global interdisciplinary databases that index studies conducted in the field of health professions education and technology [46], and ERIC is one of the largest global education databases [45]. See Supplementary Appendices 1 and 2 for the full search strategies.

Step 3: Study Selection

All articles were imported into EndNote X8. The first author screened all titles, abstracts, and full-text articles using a questionnaire guide (Supplementary Appendix 3). The fourth and fifth authors functioned as second reviewers, each completing a portion of the title, abstract, and full-text screening using the questionnaire guide. The third author validated 36% of the included studies to ensure that they matched the eligibility criteria.

Inclusion Criteria

Studies had to empirically evaluate both the online and in-person aspects of a BLP within HPE. Learners must have been the primary evaluators for the BLP.

This study adopts the definition of blended learning established by Watson, where BLPs must utilize asynchronous online learning methods to deliver approximately 30 to 79% of the educational content [21]. The technological component must have been accessible outside of the typical teaching environment (i.e., at home). Studies must describe the synchronous and asynchronous components of their BLP or provide some indication about the number of hours of learning that each component of the BLP took. Studies must indicate that learner use of online material was tracked (i.e., an LMS was utilized).

For the definition of HPE, we refer to any undergraduate or graduate education provided to students in health professional programs, or continuing education and faculty development training provided to practicing health professionals [47]. This study adopts the term “learner” to amalgamate both trainees and trained professionals taking part in a BLP within the field of HPE.

Studies had to be written in English, come from a peer-reviewed journal article, and be published and indexed between August 6, 1991 (as this is when the world wide web went live, making the creation of BLPs a possibility) and August 4, 2020 (as this is when the last search was performed).
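
As a rough illustration of how the inclusion criteria above could be applied consistently during screening, the following Python sketch encodes them as a single check. The `StudyRecord` class, its field names, and the treatment of "approximately 30 to 79%" as a hard inclusive range are assumptions made for illustration only; the review itself relied on a questionnaire guide (Supplementary Appendix 3), not on code.

```python
from dataclasses import dataclass
from datetime import date

# Publication window stated in the review.
WINDOW_START = date(1991, 8, 6)
WINDOW_END = date(2020, 8, 4)


@dataclass
class StudyRecord:
    """Hypothetical screening record; field names are illustrative only."""
    online_fraction: float     # share of content delivered asynchronously online (0 to 1)
    published: date
    in_english: bool
    peer_reviewed: bool
    usage_tracked: bool        # e.g., learner use tracked through an LMS
    accessible_off_site: bool  # technology usable outside the teaching environment
    learners_evaluated: bool   # learners were the primary evaluators


def meets_inclusion_criteria(s: StudyRecord) -> bool:
    # Assumption: "approximately 30 to 79%" is simplified to an inclusive range.
    blended = 0.30 <= s.online_fraction <= 0.79
    in_window = WINDOW_START <= s.published <= WINDOW_END
    return all([
        blended,
        in_window,
        s.in_english,
        s.peer_reviewed,
        s.usage_tracked,
        s.accessible_off_site,
        s.learners_evaluated,
    ])
```

In practice, borderline cases (for example, a program reported as "about one-third online") would still call for reviewer judgment, which a fixed threshold like this cannot capture.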

Exclusion Criteria

Studies using CD-ROMs, DVDs, and other downloadable software as their primary mode of asynchronous delivery were excluded, as this is not online learning. Studies that utilized simulation centers or computer labs as their primary technological component were excluded, as these may not be accessible to students outside the typical teaching environment. Studies pertaining to learners who do not directly provide care to human patients, such as veterinarians, were excluded [48]. Studies that evaluated the BLPs of undergraduate courses that are not predominantly delivered to HPE learners, such as general biology or psychology courses provided to all students in a Faculty of Science, were excluded. Studies in which evaluations were conducted solely by BLP instructors were excluded.

Step 4: Charting the Data

The first author entered extracted data into a form developed in Microsoft Excel, Version 16.0. Appropriateness of the charting form was discussed through consultation with co-authors and academic librarians. The fourth author validated the charted data.

Step 5: Collating, Summarizing, and Reporting the Results

Descriptive Quantitative Analysis

Charted data were synthesized through tabulation. Descriptive statistics were estimated to depict the nature and distributions of the trends found through tabulation.

Qualitative Content Analysis

To understand which components of usability were evaluated in each study, directed content analysis was utilized as described by Hsieh and Shannon, 2005 [49]. ISO definitions of each component were used as a guiding framework [29]. Summarized ISO definitions of each usability component and our associated framework for identifying relevant codes can be seen in Table 1. Findings were validated by the fourth author.

Table 1 ISO definitions of usability components and examples of coding

Qualitative Thematic Analysis

Inductive semantic thematic analysis was conducted as discussed by Braun and Clarke, 2006 [50]. All retained materials for this review were imported into QSR’s NVivo 12 and coded independently by the first and fourth authors. Reviewers met three times during the coding process; at the 10%, 36%, and 100% marks. Potential themes were discussed in the final meeting. Themes were then further developed and consolidated through meetings with the entire research team. A deductive analysis using the ISO framework for usability was then conducted to assist in further analyzing and bringing structure to the data.

Results

Eligible Studies

A total of 9632 titles were identified after the removal of duplicates. Title and abstract screening left 696 studies to be full-text reviewed. Ultimately, 80 studies were retained. Figure 1 provides a summary of the included and excluded studies. Table 2 includes a detailed summary of extracted data from all retained studies.

Fig. 1 PRISMA flow diagram of included/excluded studies

Table 2 Synthesized table of extracted data for studies that met eligibility criteria

Quantitative Findings

BLPs were implemented globally and for various HPE populations. One study explicitly evaluated the overall concept of usability. Most studies did not follow or refer to any formal evaluation framework. Only four studies (5.0%) referenced the 4-level Kirkpatrick model of evaluation. Fifty-five out of 80 studies (68.8%) explicitly evaluated to see if a change in learner attitudes, knowledge, skills, and/or overall learning was achieved. These evaluations were often done through a pre-post test design.

Sixty-nine out of 80 studies (86.3%) utilized some sort of questionnaire, survey, course evaluation, or feedback tool (henceforth instrument). Only 30 of the 69 studies (43.5%) discussed if their instrument was reliability tested, standardized, or validated. Among these 30 studies that discussed validation, reliability testing, or standardization, instruments utilized in these studies were not identified as being specifically developed to evaluate aspects of BLPs. Rather, these instruments were developed to measure concepts such as “communication” or “learning” in general, and not specifically within the context of a BLP.

Seven out of 80 studies (8.8%) used focus groups to evaluate their BLP. One out of 80 studies (1.3%) used semi-structured group interviews. Twenty-two out of 80 studies (27.5%) used qualitative techniques to analyze data obtained from open-ended questionnaires, learner feedback, or learner reflections. Figures that further illustrate the nature and extent of BLP evaluation across HPE can be found under Supplementary Appendices 4, 5, 6, 7 and 8.
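
The percentages reported above follow directly from the stated counts. As a minimal sketch for readers who wish to check them, the following Python snippet recomputes each figure; the labels are shorthand introduced here, and the use of half-up rounding to one decimal place is an assumption about how the figures were rounded.

```python
from decimal import Decimal, ROUND_HALF_UP


def percent(part: int, whole: int) -> Decimal:
    """Return part/whole as a percentage, rounded half-up to one decimal place."""
    value = Decimal(100 * part) / Decimal(whole)
    return value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# (shorthand label, count, denominator) as reported in the text above.
reported = [
    ("Referenced the Kirkpatrick model", 4, 80),
    ("Evaluated change in learning", 55, 80),
    ("Used an instrument", 69, 80),
    ("Discussed instrument validation", 30, 69),
    ("Used focus groups", 7, 80),
    ("Used semi-structured group interviews", 1, 80),
    ("Used qualitative analysis techniques", 22, 80),
]

for label, count, total in reported:
    print(f"{label}: {count}/{total} = {percent(count, total)}%")
```

Half-up rounding with `Decimal` is used here because Python's built-in `round` resolves exact ties such as 86.25 toward the even digit, which would not match the reported 86.3%.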

Directed Content Analysis

All studies were found to evaluate one or more of the usability components (effectiveness, efficiency, or satisfaction) as defined by the ISO. However, scholars did not always explicitly label the component on which their evaluation focused. For example, scholars would often state that the purpose of their study was to evaluate the effectiveness of a blended learning program; however, their methods often relied on questionnaires with items focused on learner satisfaction.

Inductive Thematic Analysis

Theme 1: Avoiding the “Usability” Label and Using Undefined Related Terms Instead

Although only one study employed the term “usability” when reporting their BLP evaluation, scholars across the data set adopted several terms and concepts which were identified as related to usability and its major components, for instance, “appropriate” [51,52,53,54], “beneficial” [55], “clear” or “clarity” [52, 56,57,58,59,60,61,62,63,64,65,66], “easy” or “ease-of-use” [57, 62, 65,66,67,68], “efficacy” [59, 60, 69, 70], “favourable” [67], “flexibility” [51, 54, 56, 57, 62, 64, 67,68,69, 71,72,73,74,75,76,77,78,79,80,81,82,83], “help,” “helpful,” or “helpfulness” [56, 59, 66, 83,84,85], “informative” [74], “useful” or “usefulness” [60, 62, 67, 70, 71, 78, 86,87,88,89,90,91,92], “utility” [61], and “worthwhile” [61].

What is more, interpretations of these words were found to be quite ambiguous and dissimilar across studies. For example, refer to the following two excerpts:

“The students preferred learning the course online since the crowded setting decreased the laboratory’s effectiveness and access to the resources out of the class was useful for their learning. ‘It was like a course for us. It is like we are taking a class for 1 or 2 more hours at home. From this point, it was very useful’” [62] [italics are mine].

And:

“After completion of their ICA in geriatric medicine, 88% of the students agreed that WebCT was a useful tool for this rotation. When the students were asked about their perceptions of the use of a paper-based portfolio, 68% agreed that they felt comfortable using it whereas 16% somewhat disagreed with this statement” [71] [italics are mine].

In the first excerpt, the authors associate the number of students in a room with effectiveness and discuss access as related to usefulness. A student’s response in this excerpt relates usefulness to time and location. However, in the second excerpt, usefulness is broader. It seems to be related to the perceived importance or effect, as well as the comfort that students have in relation to different aspects of the BLP. This example highlights that, without an explicit definition or a framework guiding the use of terminology, scholars interpret the same terms and concepts in ambiguous and divergent ways.

Theme 2: Confusing Conceptualization of Usability’s Primary Components

The conceptualization of the terms effectiveness, efficiency, and satisfaction differed significantly across studies. This theme captures and highlights the ambiguity surrounding the way in which these terms were discussed across retained studies.

Effectiveness

Effectiveness was explicitly used in most studies. Although the term was widely used, its interpretation was seldom consistent. No study provided a framework to define this term; instead, studies often associated it with ideas or concepts specific to each study. For instance, refer to the following two excerpts:

“Our results are in agreement with those of other studies on the effectiveness of e-learning as part of blended learning, which showed that students’ engagement was increased, and their perception of the educational environment was improved…Thus, although the use of technology in teaching is effective and is perceived as such, it requires a cultural change in learning practice that might not be easy for everyone” [79] [italics are mine].

And:

“When the findings on the effective learning of [the BLP] were analyzed, students stated that the images made learning long-lasting; made learning easy for the students; and helped the students get prepared before the class…The students preferred learning the course online since the crowded setting decreased the laboratory’s effectiveness…” [62] [italics are mine].

In the first excerpt, effectiveness is related to the concepts of learners’ engagement and perception, whereas in the second excerpt, effectiveness is related to several different concepts including permanence of learning, ease-of-learning, and assistance with pre-class preparation. The second excerpt goes on to relate effectiveness to the number of learners taking part in an activity.

Efficiency

Although the qualitative content analysis in Table 2 indicates that studies often discussed the concept of efficiency, studies seldom used the label efficiency. However, even when scholars explicitly used the term efficiency, the connotation they applied to it differed. For example, refer to the following two excerpts:

“Although a well-crafted and captivating lecture presentation seems like an efficient way for an instructor to cover course content, converging evidence implies that listening to a classroom lecture is not an effective way to promote deep and lasting student learning” [77] [italics are mine].

And:

“Students frequently claim that they prefer podcasts to real-time instruction because they can both speed up the podcast…as well as review portions of podcasts that they need to see again. They view this as more efficient” [93] [italics are mine].

In the first excerpt, efficiency is related to the idea of being well crafted and captivating, whereas the second study discusses efficiency as related to the concept of time and the review function of online material.

Satisfaction

Satisfaction was discussed across most retained studies, though not always explicitly. Often, the words “positive” and “negative” were used instead. The majority of the ambiguity in relation to this concept arises through two specific issues. Firstly, each study applied different connotations to satisfaction. For example, some studies discuss satisfaction as a concept that is used to measure effectiveness [51, 60], whereas others discuss satisfaction as related to the concepts of learner attitudes and experiences [80]. Secondly, studies differ regarding what their focus of evaluation was (i.e., either on the satisfaction of specific components of the BLP or the entire program in general).

In sum, a great deal of ambiguity in the connotations of effectiveness, efficiency, and satisfaction can be seen across all retained studies.

Theme 3: Lack of Consensual Approach to Usability Evaluation

The conceptual and definitional ambiguity concerning usability and its major components highlighted in the first two themes was accompanied by an absence of a consensual approach to its evaluation. Each study adopted a unique set of methods and instruments to evaluate a unique set of concepts. Some studies attempted to complement subjective measures (i.e., learner perceptions) with objective measures (i.e., changes in grades), while others focused evaluations on only one type of measure. Some authors included open-ended responses in their instruments. In these cases, studies differed in their analysis, where some used thematic analysis, others used content analysis, and some did not specify their method of analysis. Each study also had a different focus of evaluation, where some studies separated the evaluations for asynchronous and synchronous components, while others evaluated the BLP as a whole.

These issues might be present because most studies do not utilize established frameworks to guide their evaluations. In fact, only 4 studies referenced the 4-level Kirkpatrick evaluation model [51, 63, 78, 94], which is considered to be a widely used model for educational program evaluation [95, 96]. However, of these studies, one completed only the first level of evaluation, another completed the first and second level of evaluation, a third completed the first to third levels of evaluation, and the fourth discussed “…an attempt to assess all 4 levels of Kirkpatrick’s evaluation framework…” though the depth to which this was done is questionable.

Deductive Thematic Analysis

The ISO framework for usability allowed for easy replacement of ambiguous terms with the labels: usability, effectiveness, efficiency, and satisfaction. Alongside these labels, this analysis also revealed that scholars consistently utilized the terms accessibility and user experience, two concepts which the ISO describes as being critical and closely related to usability [29]. Refer to the following excerpt:

“The feature of the course that participants liked most was the eLearning modules. They found them very interactive, creative, easy to understand, and useful in addressing multiple learning styles. Participants appreciated the accessibility and self-paced nature of the eLearning modules. The participants also valued the peer-reviewed journal readings and reported that these readings complemented the material presented in the modules and reinforced current practice issues and evidence-based practice. Participants reported that the discussion forums, which were another interactive part of the course, allowed nurses an opportunity to share opinions, knowledge, and practice experiences” [89].

In this excerpt, effectiveness can be interpreted from “easy to understand”; satisfaction can be interpreted from statements such as “participants liked/appreciated/valued…”; efficiency can be interpreted from statements regarding timing and resources (i.e., “self-paced nature”); accessibility is explicitly discussed in this excerpt; and learner experiences are also explicitly discussed (i.e., “participants reported that the discussion forums…”).
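
To make this kind of deductive coding concrete, the following Python sketch tags a passage with usability-related labels based on cue phrases. The cue phrases are taken from the interpretation of the excerpt above; the `CUES` mapping and the `code_excerpt` function are hypothetical illustrations and do not reproduce the coding framework in Table 1 or the NVivo-based analysis actually used in this review.

```python
# Cue phrases drawn from the interpretation of the excerpt above.
# The mapping itself is a hypothetical illustration, not the review's codebook.
CUES = {
    "effectiveness": ["easy to understand"],
    "satisfaction": ["liked", "appreciated", "valued"],
    "efficiency": ["self-paced"],
    "accessibility": ["accessibility"],
    "user experience": ["participants reported"],
}


def code_excerpt(text: str) -> dict[str, list[str]]:
    """Return each label whose cue phrases appear in the text, with the matched cues."""
    lowered = text.lower()
    hits: dict[str, list[str]] = {}
    for label, cues in CUES.items():
        found = [cue for cue in cues if cue in lowered]
        if found:
            hits[label] = found
    return hits


sentence = ("Participants appreciated the accessibility and self-paced "
            "nature of the eLearning modules.")
print(code_excerpt(sentence))
```

Simple phrase matching of this kind can only flag candidate passages; assigning a passage to a usability component still depends on the reviewer's reading of its context.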

Moreover, this deductive analysis identified 22 concepts associated with usability that were applied consistently across all 80 retained studies. These concepts include change in knowledge, skills, and perceptions, to name a few. Many of these concepts were interpreted through the items present in the evaluation instruments of retained studies. These concepts were amalgamated into a conceptual map (Fig. 2).

Fig. 2 Concept map developed from the deductive findings of the thematic analysis

Discussion

In this scoping review, we aimed to map current knowledge about how usability has been conceived and employed in the evaluation of BLPs within HPE. Only one study was found to explicitly apply the term usability [97]. In this study, usability seems to have been evaluated as one item in an instrument. As such, the complexity and depth of this concept seem to have been neglected. Though usability has been identified by scholars in HPE as being an important aspect of educational programs that involve technology [26, 27], its explicit and appropriate implementation has yet to be fully achieved in this field of inquiry. The lack of uptake may be due to confusion, ambiguity, or lack of knowledge associated with usability and its various definitions [26,27,28,29,30,31,32,33,34]. Usability is often understood simply as “the ease-of-use” of a technology or technological system. However, the ISO indicates that this idea of usability does not reflect the comprehensive nature of this multidimensional concept [29]. The ISO framework for usability serves well to facilitate the clarification and implementation of this concept into the field of HPE by elucidating its comprehensive nature and its ability to evaluate both the technological and educational components of BLPs.

Through applying the ISO framework to guide the content analysis conducted in this scoping review, it was noted that not all studies explicitly discuss evaluations of effectiveness, efficiency, and satisfaction, and that the connotations ascribed to these terms possibly varied across studies. In the inductive phase of the thematic analysis, the depth of the disparity in the labels and conceptualization of terms used across studies was captured. Across studies, scholars seemed to apply different terms to describe the same concepts. For instance, to evaluate their BLPs, some scholars employed the term helpful, whereas others employed the term useful, and both of these are potentially attempting to measure the concept of effectiveness. Additionally, when scholars adopted the same terms, they applied different connotations to them. For instance, in one study, the term useful was related to the idea of access, whereas in another study, the term useful was related to the idea of perceived importance and effect. The lack of a common language to facilitate evaluations is a potential threat to the comparability and generalizability of BLP evaluative studies in the field of HPE. Moreover, this lack of commonality in language, in addition to a lack of uptake of an evaluative framework, may serve as a reason for why so many different methods are used to evaluate BLPs.

In the deductive phase of the thematic analysis, the framework for usability established by the ISO, a global organization that develops standards through the collaboration and consensus of international experts [29, 31], served to overcome the disparity in the language employed by scholars. Adoption of this framework demonstrated that although scholars do not explicitly make use of the term usability, they are in fact evaluating and describing this concept, albeit implicitly.

We also identified that, in the current context, the concept of usability extends beyond the three major components of effectiveness, efficiency, and satisfaction. Specifically, accessibility and user experience were identified as concepts closely associated with usability. A concept map that consolidates and clarifies the relationships between the major and associated components of usability, as well as the 22 most common concepts that were evaluated within the retained studies, was generated through this review (Fig. 2). This figure depicts that usability is often implicitly evaluated through a focus on one or more of its major components. Bi-directional arrows can be seen between usability and both accessibility and user experience. This indicates that the connotations that scholars often gave to the labels “user experience” and “accessibility” were essentially interchangeable with the definition of usability. This conceptual map elucidates a practical application of usability for BLPs in HPE literature. Through application of this map, the potential to begin comparing and contrasting different BLPs emerges.

Moving forward, we suggest that scholars conducting evaluations of BLPs within HPE must adopt a common lexicon and set of concepts to be evaluated to begin establishing the comparability and generalizability of these studies. In this regard, usability can serve well as a multifaceted concept that not only clarifies what is meant by terms such as effectiveness, efficiency, satisfaction, user experience, and accessibility, but also begins to consolidate how evaluations for each of these terms may be conducted (i.e., what sub-concepts relate to these domains of usability).

The explicit adoption of usability in HPE may be facilitated with an instrument to measure this concept in its entirety. Notably, many usability evaluation instruments exist in relation to e-learning programs or platforms such as LMSs [39, 98,99,100,101,102]; however, no instrument developed specifically for usability evaluation of BLPs in HPE was identified. Thus, our research team will use the findings of this review to work on developing and initially validating a comprehensive instrument to measure usability in this context. This instrument will assist in establishing a systematic evaluation procedure, which may lead to increased similarity in the terms, connotations, and concepts that are being measured across BLPs in HPE. This in turn may assist in increasing the comparability and rigor of evaluative studies in this context, and thereby, the systematic improvement of BLPs in HPE. As the application of BLPs continues to rise, evaluating the usability of these initiatives will help ensure that they are well designed, are well received by learners, facilitate learning, and can be systematically improved.

Limitations and Strengths

This review was the first to evaluate the application of usability within HPE. Only two literature databases were searched. The use of additional databases specific to various health professional disciplines may have assisted in identifying relevant studies. However, the broad search strategy was validated by several academic librarians, implemented iteratively over a 2-year period, and was used to cover an extensive range of HPE initiatives around the world. A major strength of this review is the use of rigorous methods to analyze the data, particularly a deductive analysis that brought clarity to the large discrepancy identified in the inductive analysis.

Conclusion

BLPs are being implemented in the field of HPE globally. The introduction of these programs has further increased due to the COVID-19 pandemic. To ensure that learners can benefit from BLPs, these programs must be evaluated and systematically improved over time. Our findings indicate that the comparability and generalizability of BLP evaluative research in HPE appear compromised. Critical concepts such as effectiveness, efficiency, and satisfaction are often poorly labeled or conceptualized across HPE literature examining BLPs. This is coupled with a lack of uptake of, or reference to, established evaluative frameworks to guide scholars. Ultimately, the concepts and methods used to evaluate BLPs in the field of HPE are disparate. However, adoption of the ISO framework for usability addresses these issues by establishing clear definitions for scholars to consider with respect to various evaluative concepts. Also, scholars already seem to be discussing and evaluating usability, though implicitly. Explicit acceptance of this framework may facilitate the adoption of a common language among scholars conducting BLP evaluations in HPE. A conceptual map that clarifies the consideration of usability evaluation in the current context provides a foundation for the future development of instruments to evaluate usability in BLPs within HPE. This may facilitate the adoption of a common set of methods or frameworks, which ultimately may allow for increased rigor and systematization of BLP evaluations in HPE, factors that may determine the overall utility, impact, and value of these programs amidst and beyond a global pandemic.