LECTOR introduces a non-invasive multimodal solution that exploits ambient intelligence technologies to observe student actions (SENSE), provides a framework for employing activity recognition techniques to identify whether these actions signify inattentive behavior (THINK), and intervenes, when necessary, by suggesting appropriate methods for recapturing attention (ACT). According to cognitive psychology, the sense-think-act cycle stems from the processing nature of human beings, who receive input from the environment (perception), process that information (thinking), and act upon the decision reached (behavior). This pattern became the basis for many design principles regarding autonomous agents and traditional AI.
To achieve this, the proposed system makes informed decisions using both volatile information and reliable knowledge regarding the syllabus covered so far, the nature of the current activity, the “expected” behavior of the involved individuals towards it, the behavior of the peers, etc. These pieces of information can be classified under the broader term Context of Use, defined as follows: “Any information that can be used to characterize the situation of entities (i.e., whether a person, place, or object) that are considered relevant to the interaction between a user and an application, including the user and the application themselves. Context is typically the location, identity, and state of people, groups, and computational and physical objects”. Based on the above, the SENSE-THINK-ACT model of LECTOR relies on an extensible modeling component to collect and expose such classroom-specific information.
This work extends the SENSE-THINK-ACT model by introducing the notion of LEARN (Fig. 1a). Because the system continuously observes student activities, it lays the foundation for a mechanism that provides updated knowledge to the decision-making components. More specifically, the LEARN mechanism is able to (i) assess decisions that resulted in negative outcomes in the past (e.g., inattention levels remain high or deteriorate after introducing a mini-quiz intervention during a math course) and (ii) incorporate knowledge provided by the teacher (e.g., disambiguation of student behavior, rejection of a suggested intervention during a specific course, etc.).
4.1 Motivating Scenarios
Monitoring the Attention Levels of an Entire Classroom.
On Monday morning the history teacher, Mr. James, enters the classroom and announces that the topic of the day will be the “Battle of Gaugamela”. During the first 15 min the students pay attention to the teacher, who narrates the story; soon enough, however, they start losing interest and demonstrate signs of inattentive behavior. More specifically, John is browsing through the pages of a different book, Mary and Helen are whispering to each other, Peter stares out the window and Mike struggles to keep his eyes open. Having identified that the entire classroom demonstrates signs of inattention, the system recommends that the lecture be paused and a mini-quiz game be started. The teacher finishes his sentence and decides to accept this intervention. After his confirmation, a set of questions relevant to the current topic is displayed on the classroom board; their difficulty depends on both the students’ prior knowledge and the material studied so far. During the quiz, the system identifies the topics with the lowest scores and notifies the teacher to explain them more thoroughly. As soon as the intervention ends, Mr. James resumes the lecture. At this point, the students’ attention is reset and they begin to pay attention to the historical facts. As a result, the quiz not only restored their interest, but also resulted in deeper learning.
Monitoring the Attention Levels of an Individual Student.
During the geography class Kate is distracted by a couple of students standing outside the window. The system recognizes that behavior and takes immediate action to draw her attention back to the lecture. To do so, it displays pictures relevant to the current topic on her personal workstation while a discreet nudge attracts her attention. A picture of a strangely colored dolphin swimming in the waters of the Amazon makes her wonder how a dolphin can survive in a river; she patiently waits for the teacher to complete his narration so that she can ask questions about that strange creature. That way, Kate becomes motivated and starts paying attention to the presentation of America’s rivers. At the same time, Nick is drawing random pictures in his notebook and does not seem to pay attention to the lecture; however, the system already knows that he concentrates more easily when doodling, and decides not to interpret that behavior as inattention.
4.2 Context of Use
LECTOR’s decision mechanisms depend heavily on contextual information to (i) identify the actual conditions (student status, lecture progress, task at hand, etc.) that prevail in a smart classroom at any given time and (ii) act accordingly. The term context has been used broadly with a variety of meanings for context-aware applications in pervasive computing. One line of work refers to context as any information that can be detected through low-level sensor readings; for instance, in a home environment those readings include the room that the inhabitant is in, the objects that the inhabitant interacts with, whether the inhabitant is currently mobile, the time of day when an activity is being performed, etc.
However, in a smart classroom contextual awareness goes beyond data collected from sensors. Although sensor readings are important for recognizing student activities, they are inadequate to signify inattention without information regarding the nature of the current course, the task at hand, the characteristics of the learner, etc. This work employs the term Physical Context (PC) to indicate data collected from sensors, while the term Virtual Learning Context (VLC) is used for any static or dynamic information regarding the learning process (e.g., student profile, course-related information, etc.).
The exploitation of such contextual information can improve the performance of the THINK component, which employs activity recognition strategies in order to identify student activities and classify them as inattentive or not. Although activity recognition mainly relies on sensor readings to detect student activities, the Virtual Learning Context (VLC) is critical for interpreting inattention indicators correctly. For example, excess noise generally indicates that students are talking to each other instead of listening to the teacher; during the music class, however, this indicator is irrelevant.
Furthermore, VLC is essential for the ACT component; when the system decides to intervene in order to reset students’ attention, the selection of the appropriate intervention type depends heavily on the context of use (syllabus covered so far, remaining time, etc.). As an example, if an intervention occurs during the first ten minutes of a lecture, where the main topic has not been thoroughly analyzed by the teacher yet, the system starts a short preview that briefly introduces the lecture’s main points using entertaining communication channels (e.g., multimedia content).
4.3 Sensorial Data
LECTOR is deployed in a “smart classroom” that incorporates infrastructure able to monitor the learners’ actions and provide the necessary input to the decision-making components for estimating their attention levels. To ensure scalability, this work is not bound to certain technological solutions; it embraces the fundamental concept of Ambient Intelligence that expects environments to be dynamically formed as devices constantly change their availability. As a consequence, a key requirement is to ensure that new sensors and applications can be seamlessly integrated (i.e., extensibility). In order to do so, LECTOR relies on the AmI-Solertis framework, which provides the necessary functionality for the intercommunication and interoperability of heterogeneous services hosted in the smart classroom.
The supported input sources range from simple converters (or even chains of converters) that measure physical quantities and convert them to signals, which can be read by electronic instruments, to software components (e.g., a single module, an application, a suite of applications, etc.) that monitor human-computer interaction and data exchange. However, a closer look at the sensorial data reveals that it is not the actual value that matters, but rather the meaning of that value. For instance, the attention recognition mechanism does not need to know that a student has turned his head 23° towards south, but rather that he is staring out of the window.
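The difference between a raw value and its meaning can be illustrated with a minimal sketch. The names used here (`Region`, `REGIONS`, `interpret_head_yaw`), the angular ranges and the convention that a yaw of 0° means facing the class board are assumptions made purely for illustration; they are not part of LECTOR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Region:
    """A named classroom area described by a range of head-yaw angles."""
    name: str
    min_yaw_deg: float  # inclusive lower bound
    max_yaw_deg: float  # exclusive upper bound


# Hypothetical layout for a single seat; a yaw of 0 means facing the class board.
REGIONS = [
    Region("class_board", -15.0, 15.0),
    Region("window", 15.0, 60.0),
    Region("door", -60.0, -15.0),
]


def interpret_head_yaw(yaw_deg: float) -> Optional[str]:
    """Translate a raw yaw angle into the name of the region being looked at."""
    for region in REGIONS:
        if region.min_yaw_deg <= yaw_deg < region.max_yaw_deg:
            return region.name
    return None  # the angle falls outside every known region


# The recognition logic only ever sees the high-level label, not the angle.
print(interpret_head_yaw(23.0))  # prints "window"
```

Under these assumptions, the downstream components reason about labels such as "window" and remain unaffected if the sensor or the seat layout changes.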
Consequently, LECTOR equips developers with an authoring tool that enables them to provide the algorithms that translate the raw data into meaningful high-level objects. More specifically, through an intuitive wizard (Fig. 2) the developers (i) define the contextual properties (e.g., Speech, Feelings, Posture, etc.) that will be monitored by the system, (ii) specify the attributes of those properties (e.g., level, rate, duration, etc.) and (iii) develop the code that translates the actual values coming directly from the sensors/applications to those attributes. The in-vitro environment where LECTOR is deployed employs the following ambient facilities:
Eye-trackers to observe students’ fixations while studying on a personal computer (e.g., reading a passage, solving an exercise) to determine the attention level (e.g., stares at an insignificant area of the screen), the weaknesses (e.g., the student keeps reading the same sentence over and over again), the interests (e.g., fascinated with wildlife) and the learning styles (e.g., attempts the easier assignments first) of each student. The same information can also be provided by custom educational software (i.e., CognitOS).
Sophisticated cameras (e.g., an RGB-D camera such as Microsoft Kinect) that track the head pose of the learner, which is used as a surrogate for gaze. The combination of eye-tracking and head pose tracking algorithms offers an accurate overview of what the students are looking at on the computer screen and on whom or what they are focused (e.g., teacher, class board, etc.). Moreover, cameras are well suited to tracking the body posture and the direction of an individual student, especially when taking into consideration that students constantly move even while seated. Besides learners’ orientation, camera input also enables the identification of specific gestures that indicate whether a student is paying attention to the lecture or not (e.g., a student raising his hand). Finally, cameras can be used to analyze whether the students’ capabilities are compromised due to feelings of fatigue (i.e., Drowsiness, Falling Asleep).
Microphones are placed on the teacher’s and students’ desks to identify who is speaking at any time and the overall noise levels of the classroom, which can reliably indicate inattentive behavior on behalf of the students.
Pressure-sensitive sensors on each learner’s chair to identify whether the student is seated or not. This information, when combined with data received from strategically placed distance and motion sensors (e.g., near the class board, near the teacher’s desk), enables a primitive localization technique that can be used to estimate the location and the purpose of a “missing” individual (e.g., a student is away from the desk but near the board, and is therefore solving an exercise).
Wearable sensors that can be used to monitor the students’ physiological signals (e.g., heart rate, EDA, etc.).
LECTOR currently uses the aforementioned ambient facilities to monitor some physical characteristics of the students and teachers and translates them, in a context-dependent manner, into specific activities classified under the following categories: Focus, Speech, Location, Posture and Feelings, which are considered appropriate cues that might signify inattention [2, 11, 19, 25].
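The primitive localization technique mentioned for the chair sensors can be sketched as follows. This is only an illustration under stated assumptions: the function names, the zone names and the `PURPOSE` table are hypothetical, the sensor readings are assumed to be already attributed to a single student, and only the "near the board, thus solving an exercise" case comes from the description above.

```python
from typing import Dict


def estimate_location(seated: bool, proximity: Dict[str, bool]) -> str:
    """Combine a chair pressure reading with proximity-sensor readings.

    `proximity` maps a zone name (e.g. "class_board") to whether the
    student is currently detected near that zone.
    """
    if seated:
        return "at_desk"
    for zone, detected in proximity.items():
        if detected:
            return f"near_{zone}"
    return "unknown"


# Hypothetical mapping from an estimated location to a presumed purpose.
PURPOSE = {
    "at_desk": "attending",
    "near_class_board": "solving an exercise",
}


def presumed_purpose(location: str) -> str:
    """Return the presumed purpose, or a fallback when none is known."""
    return PURPOSE.get(location, "unaccounted for")


location = estimate_location(False, {"class_board": True, "teacher_desk": False})
print(location, "->", presumed_purpose(location))
```

A student who is neither seated nor detected near any zone ends up "unaccounted for", which is the kind of case the attention rules would have to interpret further.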
4.4 Inattention Alarms
LECTOR’s THINK component (Fig. 3) is responsible for identifying the students who show signs of inattention. To that end, it constantly monitors their actions in order to detect (sub-)activities that imply distraction and loss of attention. The decision logic that dictates which behaviors signify inattention is expressed via high-level rules in the “Attention rule set”, which combines various contextual parameters to define the conditions under which a student is considered distracted. There are two types of rules in the “Attention rule set”: (i) rules that denote human activities or sub-activities (e.g., talking, walking, sitting, etc.), which provide input to (ii) rules that signify inattentive behaviors (e.g., disturb, chat, cheat, etc.). Through an educator-friendly authoring tool, namely LECTORstudio, teachers can create or modify the latter, whereas they can only fine-tune the rules that denote human (sub-)activities because of their complexity.
Whenever a stimulus is detected by the SENSE component, the THINK component initiates an exploratory process to determine whether the incoming event indicates that the student(s) has lost interest in the learning process or not. In order to do so, it employs the appropriate attention recognition strategies based on the “Attention rule set”. Finally, at the end of the exploratory process, if the result points to inattentive behavior, THINK informs the ACT component, which undertakes to restore student engagement by selecting an appropriate intervention.
Figure 4 presents the graphical representation of a rule describing the activity “SHOUTING”, as created in LECTORstudio. Specifically, the purpose of this rule is to create an exception for the Music course, where students sing, thus raising the noise levels of the classroom higher than usual; in that case, the activity “SHOUTING” should be identified when the sound volume captured through the class microphone exceeds the value of 82 dB.
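The two tiers of the “Attention rule set” and the Music exception can be illustrated with a minimal sketch. Only the 82 dB threshold for the Music course and the activity name “SHOUTING” come from the rule described above; the default threshold, the `Context` structure, the function names and the mapping from “SHOUTING” to the inattentive behavior “DISTURB” are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class Context:
    course: str       # Virtual Learning Context: the current course
    volume_db: float  # Physical Context: level captured by the class microphone


MUSIC_THRESHOLD_DB = 82.0    # value stated for the Music course exception
DEFAULT_THRESHOLD_DB = 70.0  # assumed value for every other course


def detect_activities(ctx: Context) -> Set[str]:
    """Tier (i): rules that denote human (sub-)activities."""
    threshold = MUSIC_THRESHOLD_DB if ctx.course == "Music" else DEFAULT_THRESHOLD_DB
    activities: Set[str] = set()
    if ctx.volume_db > threshold:
        activities.add("SHOUTING")
    return activities


def detect_inattention(activities: Set[str]) -> Set[str]:
    """Tier (ii): rules that signify inattentive behaviors (hypothetical mapping)."""
    behaviors: Set[str] = set()
    if "SHOUTING" in activities:
        behaviors.add("DISTURB")
    return behaviors


# The same sound level is interpreted differently depending on the course.
for course in ("History", "Music"):
    ctx = Context(course=course, volume_db=78.0)
    print(course, detect_inattention(detect_activities(ctx)))
```

Keeping the tiers separate mirrors the division of responsibilities described above: a teacher could change the second function's logic freely while only adjusting thresholds in the first.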
4.5 Intervention Rules
As soon as inattentive behavior is detected, the ACT component (Fig. 5) initiates an exploratory process to identify the most appropriate course of action. Selecting a suitable intervention and its proper presentation (appropriate for the device where it will be delivered) is not a straightforward process, as it requires in-depth analysis of both the learners’ profile and the contextual information regarding the current course. The first step is to consult the “Intervention rule set”, which, similarly to the “Attention rule set”, consists of high-level rules describing the conditions under which each intervention should be selected (e.g., if all students are distracted during the math course, recommend an interactive task like a mini-quiz) as well as the appropriate means of presentation (e.g., if a mini-quiz is selected and the intervention is intended for all students, display it on the classroom interactive board).
Each intervention rule, upon evaluation, points to a certain intervention strategy in the “Interventions’ Pool” (IP). The IP includes high-level descriptions of the available strategies, along with their low-level implementation descriptions. Furthermore, since inattention can originate either from a single student or from the entire classroom, the ACT component should be able to evaluate and select strategies targeting either an individual student or a group of students (even the entire class). To this end, the “Interventions’ Pool” should contain interventions of both types, and the decision logic should be able to select the most appropriate one. After selecting the appropriate intervention, the system personalizes its content to the targeted student and converts it to a form suitable for the intended presentation device.
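The relationship between intervention rules and the “Interventions’ Pool” can be sketched as a list of conditions that point to pool entries. This is a simplified illustration rather than LECTOR's implementation: all class names, field names and pool keys are hypothetical, and the two rules merely encode the mini-quiz example above and the individual-student scenario of Sect. 4.1.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class ClassState:
    course: str
    distracted: int  # number of students currently flagged as inattentive
    total: int       # number of students present


@dataclass(frozen=True)
class Intervention:
    name: str
    target: str  # "individual" or "group"
    device: str  # where the intervention is presented


@dataclass(frozen=True)
class InterventionRule:
    condition: Callable[[ClassState], bool]
    pool_key: str  # points to an entry in the pool


INTERVENTIONS_POOL: Dict[str, Intervention] = {
    "mini_quiz": Intervention("mini-quiz", "group", "classroom interactive board"),
    "topic_pictures": Intervention("topic-related pictures", "individual", "personal workstation"),
}

RULES: List[InterventionRule] = [
    # All students distracted during the math course: recommend a mini-quiz.
    InterventionRule(
        lambda s: s.course == "Math" and s.total > 0 and s.distracted == s.total,
        "mini_quiz",
    ),
    # Exactly one distracted student: show pictures on the personal workstation.
    InterventionRule(lambda s: s.distracted == 1, "topic_pictures"),
]


def select_intervention(state: ClassState) -> Optional[Intervention]:
    """Return the intervention of the first rule whose condition holds."""
    for rule in RULES:
        if rule.condition(state):
            return INTERVENTIONS_POOL[rule.pool_key]
    return None  # no rule applies, so no intervention is suggested


choice = select_intervention(ClassState("Math", distracted=20, total=20))
print(choice.name, "on", choice.device)
```

Taking the first matching rule is one possible decision policy; a ranking over several matching rules would fit the same structure.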
LECTORstudio also permits teachers to tailor the intervention mechanism to the needs of their course by modifying the “Intervention rule set”. More specifically, a teacher can create custom interventions, customize the content of existing ones, change the conditions under which an intervention is initiated (e.g., the percentage of distracted students), etc.
4.6 Intervention Assessment
Both the THINK and ACT components are able to “learn” from previous poor decisions and refine their logic, while they are open to expert suggestions that can override their defaults. In order to introduce the notion of LEARN, LECTOR provides mechanisms that modify the decision-making processes by correlating knowledge gathered through attention monitoring with student performance and expert input.
To this end, the LEARN component is able to correlate the students’ attention lapses, as recorded by the respective student profile component, with a formerly applied intervention in order to identify whether it had positive results or failed to reset attention. More specifically, if the system estimates that a particular intervention will reset attention in the context of a specific course and applies it, then after a reasonable amount of time it re-calculates the current attention levels; if it still detects that the students are not committed to the learning process, the selected intervention is marked as ineffective in that context. The ACT component is then informed so that it can modify its decision logic accordingly and, from that point forward, select different interventions for that particular course instead of the one that proved unsuccessful.
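The assessment step can be sketched as a small bookkeeping component. The sketch assumes that attention is summarized as a score between 0 and 1 and that an intervention counts as effective when the score improves by more than a configurable margin; both the score and the criterion, as well as the class and method names, are assumptions and not LECTOR's actual logic.

```python
from collections import defaultdict
from typing import Dict, List, Set


class InterventionAssessor:
    """Records which interventions failed to reset attention, per course."""

    def __init__(self, min_improvement: float = 0.0) -> None:
        # Assumed criterion: attention must rise by more than this margin.
        self.min_improvement = min_improvement
        self._ineffective: Dict[str, Set[str]] = defaultdict(set)

    def assess(self, course: str, intervention: str,
               attention_before: float, attention_after: float) -> bool:
        """Compare attention before and after; mark failures for this course."""
        effective = (attention_after - attention_before) > self.min_improvement
        if not effective:
            self._ineffective[course].add(intervention)
        return effective

    def candidates(self, course: str, available: List[str]) -> List[str]:
        """Interventions the ACT component may still select for this course."""
        blocked = self._ineffective.get(course, set())
        return [name for name in available if name not in blocked]


assessor = InterventionAssessor()
# Attention dropped after a mini-quiz during the math course.
assessor.assess("Math", "mini_quiz", attention_before=0.40, attention_after=0.35)
print(assessor.candidates("Math", ["mini_quiz", "multimedia_presentation"]))
# prints ['multimedia_presentation']
```

Because failures are stored per course, the same intervention remains available in other courses where it has not been shown to fail.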
On top of the automatic application of active learning interventions, the system also permits additions, modifications, cancellations and ranking of the selected interventions. This allows the teacher to have the final say regarding the lecture format. To this end, the LEARN component takes the teacher’s input into consideration and informs the ACT component so that it can refine the intervention rule set and offer more effective alternatives when necessary. More specifically, the teacher should be able to: (i) replace the recommended intervention with a more appropriate one (e.g., quiz, multimedia presentation, discussion, etc.), (ii) rank the recommendation and (iii) abort the intervention in case it disrupts the flow of the course.