Introduction

Electronic response systems (“clickers”) are a common polling tool for facilitating active learning, especially in large lecture-based courses, and their adoption has been much higher in STEM and medical fields than in other disciplines (Gardner et al., 2018). A large body of research has shown the benefits of clickers, such as increased learning (Knight and Wood, 2005; Smith et al., 2011; Armbruster et al., 2009) and engagement (Lane and Harris, 2015). While flash cards and clickers can be used in similar ways (Lasry, 2008), electronic systems are often preferred because they record students’ answers, allowing instructors to check attendance and assign points based on correctness and/or participation.

Over the past decade, mobile or web-based apps have become available as a low-cost alternative to dedicated handheld devices, and each technology has advantages and disadvantages (Koenig, 2020). The choice of technology, and whether an instructor adopts any polling technology at all, depends mainly on its perceived usefulness in their instructional context, but also on other factors such as ease of use or instructional support (Aljaloud et al., 2019). Standalone clickers are appreciated for their ease of use, but students usually have to purchase their own devices, and there is also some cost for the institution to maintain hardware and provide training. By contrast, web-based polling apps do not require institutional support, but rely on students having access to mobile devices. As it cannot be assumed that all students are able to bring a smartphone, tablet, or laptop to class, instructors need a few devices that can be loaned to students who need them. Another important concern regarding the use of mobile or web-based technology is its reliance on internet access, which presents a significant potential for distraction (Flanigan and Kim, 2022; McCoy, 2016, 2020).

Previous studies show that students regularly use their devices to engage in activities that are not related to what is happening in the classroom (McCoy, 2016, 2020) and that such digital distraction can have a negative impact on performance (Duncan et al., 2012; Berry and Westfall, 2015; Kuznekoff et al., 2015; Amez and Baert, 2020; Rosen et al., 2013; Junco, 2012; Kraushaar and Novak, 2010; Flanigan and Titsworth, 2020). In this context, one may wonder if requiring students to use their mobile devices in class and having them always available on their desks increases the level of distraction.

A study by Ma et al. (2020) suggests that this might be the case. They found that, following an instructional episode that required the use of mobile devices, many students continued to use their devices for off-task activities afterward. To probe this question further, we conducted an observational exploratory study in eleven sections of three first-year undergraduate physics courses that used active-learning pedagogies. Our primary research questions are as follows:

  1. RQ1: Comparing courses using similar pedagogies facilitated by mobile devices or standalone clickers, does the use of mobile or web-based apps lead to a higher level of distraction in first-year undergraduate science courses?

  2. RQ2: What are the differences in levels of distraction when students are engaged in active-learning activities (e.g., completing a worksheet task or discussing a conceptual question) versus passive-learning activities (e.g., listening to the instructor lecture or watching a demonstration)?

To our knowledge, this is the first study to measure how requiring students to use mobile devices for classroom polling, rather than standalone clickers, affects the level of digital distraction. Our study contributes to a growing body of literature that examines how mobile devices influence students’ learning in the classroom and how negative effects of digital distraction can be mitigated. In particular, we provide evidence that using active-learning pedagogy in our undergraduate physics courses does indeed lead to lower levels of digital distraction, as is often suggested in the literature (Flanigan and Kim, 2022).

In the following sections, we will discuss some of the relevant literature, describe our methods, and then present and discuss our results.

Relevant Literature

A recent book by Flanigan and Kim (2022) provides a comprehensive overview of the current understanding of digital distraction, which the authors define as the “use [of] mobile devices during class for non-class related purposes.” The interested reader will find a wealth of useful information and numerous references on important topics such as why students engage in digital distraction, how often they do so, and what factors could mitigate the amount of distraction. Here we focus on two issues that are directly related to our study: (a) what fraction of class time students use their devices for non-class-related activities and (b) which external factors, in particular the format of instruction, influence the level of distraction.

Regarding the question of how frequently students are distracted by their devices, we narrow our discussion to the use of mobile phones, since the students in our study rarely used laptops in class when we conducted our observations. Kim et al. (2019) used a tracking app to measure students’ smartphone use in class and reported that students are typically distracted every 3–4 min for over a minute at a time (28.1% of class time). The authors also administered a survey and found that students underestimated how often they are distracted in a typical class period (approximately 8 times when self-reported vs 12 times when measured), but overestimated the time they spend on each distraction (approximately 400 seconds when self-reported vs 120 seconds when measured). Similarly, Duncan et al. (2012) observed an average phone use of almost seven times per class, as opposed to the self-reported average of three times from their student survey. Using self-reported data from a student survey, McCoy (2020) determined that students spent 19.4% of class time on digital distractions. For the time period when our data were collected (2014/2015), McCoy (2016) conducted a similar survey in which undergraduate students estimated spending on average 20.9% of class time off-task on their digital devices. Based on these studies and the uncertainties associated with self-reported data, Flanigan and Kim (2022) estimated the typical amount of distraction by mobile phones to be in the range of 25–30% of class time.

Among the external factors that influence the level of digital distraction, the format of instruction and the classroom activities are of direct interest for our study. Passive formats characterized by long stretches of lecturing have been linked to boredom, which, along with the desire to stay connected, is one of the main reasons students give when asked why they engage in digital distractions (Flanigan and Kim, 2022; McCoy, 2016, 2020; Varol and Yildirim, 2019; Parry and Le Roux, 2018). A direct connection between instructional format and the amount of engagement was made by Lane and Harris (2015), who used a newly developed engagement observation protocol to measure the fraction of students engaged during different classroom activities. They found that almost all students were engaged during clicker questions, but fewer were engaged when the instructor lectured or summarized learning goals. Berry and Westfall (2015) asked undergraduate students which factors made it more likely for them to use their phones for distractions and found that the major factors are class size (more likely in larger classes), duration (more likely in long classes), timing (more likely at the start and the end of class), type of course (more likely in courses outside the chosen major), and classroom activity (less likely during discussions and group activities). In agreement with these findings, Ragan et al. (2014) reported high levels of distraction in a long weekly evening lecture (2 h and 45 min) delivered in a traditional lecture format. Accordingly, one of the recommended strategies to curb digital distractions is the increased use of active-learning pedagogies and the integration of mobile devices as polling devices (Flanigan and Kim, 2022; Berry and Westfall, 2015; Flanigan and Babchuk, 2020).

Our study provides additional perspectives on the amount of digital distraction and on how instructional formats can influence it. While we are not investigating the impact of distractions on learning, there seems to be an overall consensus that digital distractions have a negative impact. Support for this assertion comes from two types of studies: (a) correlations between digital distractions and grade point average (GPA) (Duncan et al., 2012; Berry and Westfall, 2015; Amez and Baert, 2020; Rosen et al., 2013; Junco, 2012; Kraushaar and Novak, 2010; Flanigan and Titsworth, 2020) and (b) experimental studies that measure the impact directly (Sana et al., 2013; Wood et al., 2012; Kuznekoff et al., 2015; Gupta and Irwin, 2016; Aaron and Lipton, 2018). Readers interested in this topic will also find a summary with further references in Flanigan and Kim (2022). In the following sections, we present observational data that establish the level at which digital and other distractions are present in our active-learning courses. We compare course sections that use mobile technology for polling to similar sections that use standalone clickers and discuss implications for instructors. We also present data that compare active-learning modes to passive modes.

Methods

Experimental Design

In this observational exploratory study, classroom data were collected during 2014 and 2015 across multiple courses and instructors at a large upper-tier Canadian research university to compare student distraction levels between courses using dedicated iClicker polling remotes (“Clickers”) and courses using the Learning Catalytics electronic polling system on students’ own electronic devices (“Mobile Polling”). Three large multi-section introductory-level physics courses were observed: an algebra-based introductory physics course with a diverse student population and a significant fraction of non-science majors (“Algebra-based,” three lecture sections), a calculus-based introductory physics course taken mainly by life-science students (“Calculus-based 1,” five lecture sections), and a calculus-based introductory physics course in electricity and magnetism taken mainly by physical science students, including approximately 15% future physics majors (“Calculus-based 2,” three lecture sections). The lecture sections had 150–275 students and were taught by different instructors. All courses can be characterized as using a similar, highly interactive approach, with 38% of the overall observations in our data representing time spent on worksheet problems and conceptual “clicker” questions.

To measure the level of student distraction, we used a modified version of the BERI protocol (Lane and Harris, 2015). In the BERI protocol, 10 students sitting in close proximity to each other are observed every two minutes or whenever there is a major change in classroom activity. During an observation snapshot, the observer records the number of students that fall into each of six engaged (listening, engaged student interaction, etc.) or six disengaged (off-task, disengaged computer use, etc.) categories. The observer also records what is happening in class at the time each observation snapshot is taken (e.g., “instructor explains answer to clicker question”). In our implementation, the time between observation snapshots was 2 to 5 min, with observers using the shorter intervals as they gained more experience. Additionally, we simplified the student engagement categories:

  • Front engagement: listening to instructor, taking notes, observing demo, etc.

  • Local engagement: discussing clicker questions with neighbors or working on a worksheet.

  • Digital distraction: distracted by non-class-related use of a cell phone, tablet, or computer.

  • Other distraction: distracted by friends, doing paper-based homework for other courses, reading a newspaper, etc.

We removed the “uncertain” category used in Lane and Harris, and students were marked as engaged in hard-to-distinguish cases (see Footnote 1). By removing the uncertain category from our protocol, we eliminated one of the main sources of disagreement in the inter-rater checks of the Lane and Harris study. We opted not to use a co-observer training session model because of the very high inter-rater reliability (96.5%) reported by Lane and Harris. The simplified observer training consisted of a detailed oral overview of the observation protocol, provided by one of the authors (GR or FR). We emphasized to all observers that uncertain cases were to be marked as engaged. Observers reported back that our protocol was easy to use and that they felt confident in their observations.

Most instructors were observed multiple times throughout the term, and the groups of observed students were chosen from different locations within the lecture hall to obtain a representative sample.

Data Structure and Data Filtering

The primary response variables were the fraction of observed students in a given observation snapshot recorded as unambiguously distracted by a digital device (DistractionDigital) or by any means (DistractionAll). The term observation session (Session) refers to the set of observation snapshots collected by an observer during one lecture.
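As a concrete illustration of how these response variables can be computed from the per-snapshot category counts, consider the minimal R sketch below; the column names and values are illustrative placeholders, not our actual data set.

```r
# Minimal sketch (R): computing the per-snapshot response variables from
# hypothetical category counts; all column names here are illustrative only.
snapshots <- data.frame(
  n_front   = c(6, 4, 7),   # engaged with the instructor / front of room
  n_local   = c(2, 1, 3),   # engaged with neighbors or a worksheet
  n_digital = c(1, 3, 0),   # unambiguously distracted by a digital device
  n_other   = c(1, 2, 0)    # unambiguously distracted by any other means
)

n_observed <- rowSums(snapshots)   # typically 10 students per snapshot
snapshots$DistractionDigital <- snapshots$n_digital / n_observed
snapshots$DistractionAll     <- (snapshots$n_digital + snapshots$n_other) / n_observed
```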

Data were collected from eleven lecture sections (Section) spanning three large multi-section introductory-level physics courses. Nine of these sections were taught by individual instructors, and two of them were taught using a collaborative paired teaching model (Strubbe et al., 2019), where two instructors share teaching responsibilities for that section and both are always present during lectures.

Of the initial 50 observation sessions, the following three were removed from the data set for the reasons given below:

  • Omitted Observation Session 1: A mobile polling session where the Wi-Fi in the classroom was not working that day, so the instructor was unable to use the tool as intended.

  • Omitted Observation Session 2: A clicker session observed by a one-time observer who recorded an activity code for only one of their ten observations during the session.

  • Omitted Observation Session 3: A clicker session observed by one of the authors (FR). When using the average overall distraction level of each session to screen for outliers, this session was flagged by R’s boxplot.stats function (a minimal sketch of this check follows below). The session had an average overall distraction level of 53.9%, making it our only observation session with an average above 50%; this value lies 4.3 standard deviations above the mean of the per-session average distraction rates and 3.3 standard deviations above the mean for all other sessions in that section. Additionally, this observation session covered a lecture during the final week of classes, which anecdotally is considered to be a time when students tend to be exhausted and thus quite distracted. However, to ensure the robustness of the results and conclusions, we repeated all analyses with data from this observation session included and compared the results with those from the filtered data.
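The sketch below illustrates this outlier check in R; the data frame and its columns are illustrative placeholders, not our actual data.

```r
# Flag outlying observation sessions using R's boxplot.stats (Tukey's 1.5*IQR rule).
sessions <- data.frame(
  session          = 1:6,
  mean_distraction = c(0.21, 0.18, 0.26, 0.24, 0.19, 0.539)  # per-session averages
)

out_values <- boxplot.stats(sessions$mean_distraction)$out   # values beyond the whiskers
flagged    <- subset(sessions, mean_distraction %in% out_values)

# Distance of the flagged session(s) above the mean of the per-session averages, in SDs:
(flagged$mean_distraction - mean(sessions$mean_distraction)) /
  sd(sessions$mean_distraction)
```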

After filtering out the above observation sessions, our data set consisted of 47 observation sessions across 9 sections; our use of observer and section as random-effect variables within our regression models allows us to control for variance due to observer and section effects. This filtered data set was collected by 13 observers, 9 of whom observed more than one session and 7 of whom observed more than one section.

Based on the notes recorded by the observers, our observation snapshots were categorized into active (students answering or discussing polling questions, students working on worksheet questions), passive (instructor lecturing or presenting a worked example, instructor showing a video or demonstration, instructor following up on a polling or worksheet question), large-group (students asking or answering questions in front of the entire class), other (communicating administrative details, quizzes), and missing. Because the active and passive codes made up 95.1% of the data set and provide a meaningful comparison of instructional modes, we performed the analysis using only observation snapshots corresponding to those two codes.
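A hedged sketch of this recoding and filtering step is shown below; the note strings, the mapping, and the column names are illustrative, not our actual coding scheme.

```r
# Collapse observer activity notes into ActivityType codes and keep only the
# active and passive snapshots; the mapping and names below are illustrative.
snapshots <- data.frame(
  activity_note = c("students discuss polling question", "instructor lectures",
                    "students work on worksheet", "instructor shows demo", "quiz")
)

active_notes  <- c("students discuss polling question", "students work on worksheet")
passive_notes <- c("instructor lectures", "instructor shows demo",
                   "instructor follows up on polling question")

snapshots$ActivityType <- ifelse(snapshots$activity_note %in% active_notes,  "active",
                          ifelse(snapshots$activity_note %in% passive_notes, "passive",
                                 "other"))

analysis_data <- subset(snapshots, ActivityType %in% c("active", "passive"))
```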

After selecting only the active and passive codes, our final filtered data set consisted of 387 clicker observation snapshots across two courses and five sections, and 208 mobile polling observation snapshots across three courses and four sections. This breadth of contexts gives us confidence that these results can be viewed as characteristic of a typical first-year physics course at our institution.

Results

Data Descriptives

To gain a sense of the overall distraction levels, we contrast the aggregate distraction data for the clicker (387 observation snapshots) and mobile polling (208 observation snapshots) tools using the filtered data set (13 observers across 9 sections, keeping only active and passive activity codes), as shown in Fig. 1. A Welch two-sample t-test reveals no difference in digital distraction levels between the clicker and mobile polling tools \((t(408.3) = -1.28, p = .20)\), and no difference in overall (digital + other) distraction levels between the tools \((t(444.2) = 0.35, p = .73)\). When combining the data across the tools, the average fraction of students recorded as being unambiguously distracted by digital devices (DistractionDigital) is 0.194 (standard error = 0.006) and the fraction of students recorded as being unambiguously distracted by any means (DistractionAll) is 0.256 (standard error = 0.008).
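A minimal sketch of this comparison in R is shown below, assuming per-snapshot vectors of digital distraction fractions for the two tools; the vectors are simulated placeholders, not our data.

```r
# Welch two-sample t-test comparing per-snapshot digital distraction fractions
# between mobile polling and clickers; t.test() applies the Welch correction by default.
set.seed(1)
digital_clicker <- runif(387, 0, 0.5)   # stand-in for the 387 clicker snapshots
digital_mobile  <- runif(208, 0, 0.5)   # stand-in for the 208 mobile polling snapshots

t.test(digital_mobile, digital_clicker)
```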

Fig. 1

A comparison of the distraction levels during each observation snapshot between courses using mobile polling and clickers. The fraction of students in the filtered data set (see Data section) recorded as being unambiguously distracted by digital devices (left panel) or unambiguously distracted by any means (right panel) for each observation snapshot in classrooms using clickers or mobile polling. The error bars represent standard errors of the mean values

Table 1 Summary of fixed-effect and random-effect variables considered for the regression model

Using Linear Regression for Further Comparisons of Distraction Levels Across Tool and Activity Types

We use a multimodel inference approach in our linear regression analyses, as described in Burnham and Anderson (2002). This approach uses the second-order Akaike information criterion (AICc), a relative goodness-of-fit measure that accounts for model complexity, to rank a set of candidate models and to assign these models weights based on their relative plausibility. The approach is beneficial in situations where no candidate model is clearly superior to the others in the set: it allows us to account for our uncertainty about which model is best by using model averaging, in which the estimates of the fitting parameters are combined using the relative plausibility weights that arise from the comparison of AICc values.
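For reference, the relative plausibility weights referred to here are the standard Akaike weights of Burnham and Anderson (2002): writing \(\Delta _i\) for the difference between the AICc of candidate model \(i\) and the lowest AICc in the set,

$$\begin{aligned} w_i = \frac{\exp (-\Delta _i/2)}{\sum _{j=1}^{R}\exp (-\Delta _j/2)}, \end{aligned}$$

where \(R\) is the number of candidate models. The model-averaged estimate of a fitting parameter is then the \(w_i\)-weighted combination of its estimates from the individual models.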

Our analyses use mixed-effects linear regression models, which include both fixed-effect variables (those whose fitted coefficients are of interest) and random-effect categorical variables (those whose variance needs to be accounted for in the model, but where comparison between their categorical levels is not necessary). The random-effect variables are used to control for nesting in the data and the presence of repeated measures. Table 1 summarizes the fixed-effect and random-effect variables considered when building the models. We describe the process of determining the model-averaged estimates of the fitting parameters for the DistractionDigital outcome variable (the fraction of students unambiguously distracted by digital devices) in full detail and then summarize the results for the DistractionAll outcome variable (the fraction of students unambiguously distracted by any means).

Because Tool is foundational to our research questions, this fixed-effect variable must remain in all candidate models. Similarly, ActivityType is a critical descriptor of what type of activity the students are engaged in during a given observation snapshot and, as a result, must also remain in all candidate models. Thus, the candidate models being compared vary only in their random-effect variables, which are the variables that control for the structure of the data. With each candidate model including at least one of the four possible random-effect variables, there are \(2^4 - 1 = 15\) candidate models.
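The sketch below illustrates, in R with the lme4 and MuMIn packages, how such a candidate set can be built, ranked by AICc, and model-averaged. It is a hedged reconstruction under stated assumptions (a data frame dat with the variables of Table 1, simulated here as placeholders, and maximum-likelihood fits so that AICc values are comparable), not necessarily the exact code used in our analysis.

```r
# Minimal sketch (R, lme4 + MuMIn): candidate mixed-effects models that all keep
# Tool and ActivityType as fixed effects and differ only in which random
# intercepts they include; models are then ranked by AICc and model-averaged.
library(lme4)
library(MuMIn)

# Simulated placeholder data standing in for one row per observation snapshot.
set.seed(1)
dat <- data.frame(
  DistractionDigital = runif(500),
  Tool         = sample(c("Clickers", "MobilePolling"), 500, replace = TRUE),
  ActivityType = sample(c("Active", "Passive"), 500, replace = TRUE),
  Section      = factor(sample(1:9, 500, replace = TRUE)),
  Observer     = factor(sample(1:13, 500, replace = TRUE)),
  Session      = factor(sample(1:47, 500, replace = TRUE)),
  Location     = factor(sample(1:5, 500, replace = TRUE))
)

rand_terms <- c("(1 | Section)", "(1 | Observer)", "(1 | Session)", "(1 | Location)")

# All 2^4 - 1 = 15 non-empty subsets of the four random-effect terms.
subsets <- unlist(lapply(seq_along(rand_terms), function(k)
  combn(rand_terms, k, FUN = paste, collapse = " + ")))

# Fit each candidate model with maximum likelihood so that AICc values are comparable.
models <- lapply(subsets, function(re) {
  f <- as.formula(paste("DistractionDigital ~ Tool + ActivityType +", re))
  lmer(f, data = dat, REML = FALSE)
})

ranking  <- model.sel(models, rank = "AICc")          # ranks models, adds Akaike weights
top_set  <- get.models(ranking, subset = delta <= 10) # keep models within 10 AICc units
averaged <- model.avg(top_set)                        # model-averaged parameter estimates
summary(averaged)
```

Here model.sel attaches the Akaike weights to each candidate, and model.avg combines the per-model estimates of the Tool and ActivityType coefficients using those weights.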

Based on the criterion of lowest AICc, the most plausible model, and thus the model that will be given the greatest weight in the model-averaging processes, is

$$\begin{aligned} DistractionDigital \sim Tool \!+\! ActivityType \!+\! Section \!+\! Observer \!+\! Session. \end{aligned}$$

Although the Location random-effect variable does not appear in this model, it does appear in two of the five most plausible models, as described below. Defining \(\Delta \)AICc as the difference between a candidate model’s AICc and that of this most plausible model, we focus our reporting on candidate models with \(\Delta \)AICc of at most 10, because models beyond this threshold are considered to have essentially no empirical support in their plausibility as compared to the most plausible model (Burnham and Anderson, 2002). This threshold can also be justified by the fact that, from inspection, the relative model-averaging weights are negligible for candidate models beyond it. Table 2 summarizes the five models under this \(\Delta \)AICc threshold, their relative model-averaging weights, the fixed-effect variable estimates for each model, and the overall results from the model averaging.

Table 2 Summary of the most plausible candidate models for predicting DistractionDigital across 595 observation snapshots

These results show that Tool provides no statistically significant predictive power for DistractionDigital, the fraction of students recorded as being unambiguously distracted by digital devices. Specifically, the overall result based on the model-averaged parameters is that classes using mobile polling have a fraction of students recorded as distracted that is 0.001 (standard error = 0.042) higher than classes using clickers, a result consistent with no observed difference. In contrast, we see a statistically significant difference in digital distraction associated with activity type: active activities result in the fraction of distracted students being 0.067 (standard error = 0.011) lower, as an absolute difference, than passive activities. To provide more context for this value, the intercept of 0.233 from this model is the fraction of students recorded as distracted during passive activities in classes using clickers, so the fraction of students recorded as distracted during active activities in these classes is 0.166 (0.233\(-\)0.067).

We also determine that there is a small-to-medium effect size, \(f^2 = 0.075\) (Cohen, 2013), for the ActivityType parameter. When using mixed-effects models, the most appropriate effect size measure for a fixed-effect parameter is Cohen’s local \(f^2\), which uses the conditional coefficient of determination, \(R^2_c\), as the goodness-of-fit measure (Nakagawa and Schielzeth, 2013). It does so by comparing \(R^2_c\) for the mixed-effects model to \(R^2_c\) for the same model after removing the parameter of interest, ActivityType.
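Written out explicitly (in our notation; a standard formulation of Cohen’s local \(f^2\) consistent with the description above), this comparison is

$$\begin{aligned} f^2_{ActivityType} = \frac{R^2_{c,\,full} - R^2_{c,\,reduced}}{1 - R^2_{c,\,full}}, \end{aligned}$$

where \(R^2_{c,\,full}\) is the conditional coefficient of determination of the full mixed-effects model and \(R^2_{c,\,reduced}\) is that of the same model with ActivityType removed; both can be obtained, for example, with MuMIn’s r.squaredGLMM function.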

This same analysis is repeated for the DistractionAll outcome, the fraction of students recorded as being unambiguously distracted by any means, and the model given the greatest weight in the model-averaging process is

$$\begin{aligned} DistractionAll \sim Tool + ActivityType + Section + Session. \end{aligned}$$

Similar to the model averaging for the DistractionDigital outcome, there are five models for the DistractionAll outcome that meet the goodness-of-fit criterion of \(\Delta \)AICc \(\le \) 10 relative to the model with the lowest AICc. The overall result again shows a lower fraction of students distracted (0.074, standard error = 0.013) during active activity types as compared to passive ones. Again, Tool had no statistically significant predictive power, with classrooms using mobile polling having a fractional distraction level that is only 0.014 (standard error = 0.041) lower than classrooms using clickers. Whether looking at distraction from only digital sources or from all sources, there was no measurable difference in distraction levels between classes using mobile polling and clickers.

As discussed in the Data section, we performed a robustness cross-check by re-running the analysis with the inclusion of Omitted Observation Session 3, which had initially been omitted after being identified as a significant outlier. Specifically, this was a clicker session in which an unusually high level of distraction was observed relative to the typical levels of distraction in that section or in any section. A re-analysis that included this session provided results consistent with the original results, with model-averaged parameters showing a statistically significant difference of \(-0.065\) (standard error = 0.011) in DistractionDigital for active activities as compared to passive ones, and no statistically significant difference for Tool. The results for the DistractionAll outcome are similarly consistent when Omitted Observation Session 3 is included.

Fig. 2

A section-by-section comparison of average distraction levels per observation session during passive and active learning activities. Each data point represents a single observation session from the filtered data set (see Data section) and is the average level of distraction by digital devices in that session during only activities coded as passive (left panel) or active (right panel). Only the six sections that were observed at least three times are shown. The error bars represent standard errors of the data points shown

Discussion

Our observational study contributes noteworthy results to the current understanding of digital distraction in the classroom: (a) using mobile devices instead of standalone clickers for polling had no significant impact on the amount of digital or other distraction in our courses. While digital distractions remain a concern, the use of mobile devices neither increased nor decreased their amount. Instructors in active-learning courses similar to ours can therefore choose their polling device based on technological features and pedagogical needs without having to worry about increasing the amount of digital distraction.

We have also established that (b) the typical level of distraction in our courses is approximately 25% of class time, with the level of distraction from only digital sources being approximately 20%. While this seems high, it is generally at the lower end of the results reported in the literature. We found that (c) distractions (digital and other) are particularly low while students are engaged in active learning, such as solving worksheet problems in small groups or discussing concept questions with peers. The difference between active and passive learning activities is significant and in general agreement with Lane and Harris (2015). These results should further encourage instructors to use active-engagement pedagogies as much as possible. The literature discussed in this paper suggests that many instructors already see this as the best way to respond to digital distractions.

Our qualitative observations also found that (d) students generally choose opportune times to engage in off-task behavior: we frequently observed students quickly checking their cell phones or writing text messages when they had just finished an activity. While this behavior causes minimal disruption to the rest of the class, it is still a problem because texting students are not available for peer discussions. It is also likely that such short bursts of digital distraction prevent students from engaging in further reflection and disrupt their ability to make connections between classroom activities and lecture topics. As briefly mentioned above, the literature suggests that such distractions generally have a negative impact on learning. We therefore recognize the need to educate our students in the responsible use of mobile technology, as suggested by Hodges (2019) and Delello et al. (2020).

Finally, we observe (e) a fairly large variability in average distraction levels per observation session (lecture period) within each section: the variability from one lecture period to the next within our most observed sections is at least as large as the variability in overall digital distraction levels between sections (see Fig. 2). This variability can often be attributed to external factors; for example, distraction levels were particularly high during midterm exam weeks and during lectures near the end of the term. The range of acceptable distraction levels from an instructor’s point of view is relatively small: anecdotally, our own classes felt overall “engaged” when the average distraction levels were below 20% and “distracted” or “unengaged” when distraction levels climbed above 30%.

Limitations

Our study was performed in the specific context of large first-year physics courses for undergraduate science students at a large Canadian research university. These courses can be characterized as ‘service courses’ that employ active-learning pedagogies using small-group problem solving with worksheets as well as concept questions and peer discussions facilitated by classroom polling. It is not clear how representative our results are of other courses, course topics, or institutions.

As digital technology changes rapidly, it is important to point out that our data were collected during 2014 and 2015; our results are thus a snapshot of that time period. McCoy’s survey data showed that the level of distraction and the main modes of distraction (texting, social media) were remarkably similar from 2013 to 2020, while smartphone ownership was steadily increasing. Taken together, this suggests that advancements in technology did not have a significant impact on the level and type of distractions. However, it is likely that the COVID-19 pandemic has changed the general use of technology in class. Our own institution has switched from an almost exclusive use of physical clickers (iClicker) before the pandemic to an almost exclusive use of web-polling software during and after the pandemic (iClicker Cloud). Furthermore, only a small fraction of students used laptops or tablets in class for note-taking or for answering polling questions at the time of our observations; we now see a significant fraction of students using tablets for note-taking and a second device for answering polling questions. It would therefore be beneficial to repeat the study. Lastly, it is important to emphasize that we had no official mobile-device policy implemented in our courses. While this is not a limitation of our study, it is important to mention because the literature suggests that such course details can influence the level of distraction.

Conclusion

In summary, we have performed an observational study of distraction by digital devices in three large first-year physics courses. We found no difference in distraction from digital devices, or in overall distraction, between classrooms using mobile devices for polling and those using standalone clickers. Instructors can therefore choose the polling technology that best fits their needs without having to worry about increasing distraction.

Our study also provides important data related to the often-expressed recommendation of using active learning as a strategy to decrease distraction in class. We measured significantly lower levels of distraction when students were actively engaged compared to when they were in a passive mode, a result consistent with the findings of Lane and Harris (2015). Our data therefore support this recommendation. Compared to other studies, the levels of distraction in our courses are on the lower side of the reported range, due to the frequent use of active learning. Still, students spend a quarter of class time being distracted, most of it by their digital devices, and a further reduction would be desirable. The literature provides some insight with regard to the effectiveness and practicality of course policies and points toward using an educational approach rather than a punitive one.