1 Introduction

Keeping students engaged in task-related behavior is a challenge shared by Adaptive Instructional Systems (AISs) and assessment systems. Learner models are representations of students’ cognitive, metacognitive, affective, personality, social, and perceptual skills that AISs maintain and use to adapt the system’s interactions (e.g., type and amount of feedback, content presented, interaction type, question difficulty, and access to additional content and related materials) to the characteristics of the student [1]. A variety of techniques have been used to implement learner models in AISs (e.g., probabilistic models, cognitive models, machine learning models, constraint-based models, agent-based models, stereotype models, and overlay models) [1,2,3].

We describe issues related to learner modeling in the context of caring assessments that take into account a broad view of the learner as well as information about the learning context to create engaging assessment situations that can be used to collect valid assessment information about what the student knows or can do [4]. Work in this area includes exploring student interaction with technology-rich, conversation-based assessments that make use of traditional item types and dynamic conversations with artificial agents to assess skills such as science inquiry, collaboration, and argumentation [5]; identifying and dealing with unexpected responses [6]; exploring emotions [7]; evaluating the impact of source credibility and question format on the quality of student responses [8]; examining student-level individual differences relevant to assessment performance that could be incorporated into an expanded conception of a learner model [9]; and opportunities for adaptive interactions [10].

Topics discussed in this paper include: (a) what types of background variables, individual differences, and cognitive and affective variables have the potential to be included in the learner model; (b) what types of adaptations can be implemented to support student engagement while keeping in mind considerations of fairness, validity, and reliability of the assessment system; (c) how cognitive and affective information can be used to improve adaptive processes (e.g., recommendations for next activities and feedback); (d) considerations for reporting learner model information to particular stakeholders (e.g., open learner models [11]); (e) what types of supports should be available to ensure that students, especially those from underserved groups, can access the assessment environment and consider it an appropriate way to demonstrate their knowledge and skills [12, 13]; and (f) data privacy and data security challenges regarding the procurement and handling of the learner model information required to implement this approach. Implications for future research in the area of learner modeling for caring assessments, and for AISs in general, are discussed.

2 Learner Modeling in Caring Assessments

Assessment methodologies such as Evidence-Centered Design (ECD) [14] are used to design assessments that rely on assessment arguments showing how tasks are designed to provide the evidence needed to measure the intended construct (i.e., the knowledge, skills, and abilities being measured). The assessment design structure (usually represented as the student, evidence, and task models) can be thought of as a set of components of a learner model [15].

Caring assessments broaden the scope of the learner model to include aspects such as students’ motivation, metacognition, and affect; consider information about the learning context; and support different levels of caring (see Fig. 1).

Fig. 1. Caring assessments and two example use cases. © Educational Testing Service 2020. All Rights Reserved.
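
To make this expanded scope concrete, the sketch below shows one possible in-memory representation of such a learner model. It is a minimal illustration only; the class and field names are assumptions made for this example and do not correspond to any existing caring-assessment implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class LearnerModel:
    """Illustrative learner model for a caring assessment (all names are assumptions)."""
    # Cognitive layer: proficiency estimates for the measured constructs,
    # e.g., {"science_inquiry": 0.62}
    proficiencies: Dict[str, float] = field(default_factory=dict)
    # Metacognitive / motivational layer
    motivation: Optional[float] = None      # e.g., a 0..1 estimate inferred from behavior
    persistence: Optional[float] = None
    growth_mindset: Optional[bool] = None   # e.g., from a self-report instrument
    # Momentary affect detected during the session
    current_emotion: Optional[str] = None   # e.g., "frustrated", "bored"
    emotion_intensity: Optional[float] = None
    # Learning-context information (opportunity to learn, accommodations, etc.)
    context: Dict[str, str] = field(default_factory=dict)
```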

Two possible use cases include (a) providing an unmotivated student with information aimed at making the task relevant (e.g., how the data will be used) [16], offering alternative ways to demonstrate their knowledge [17], or considering the student’s motivational state when making inferences about their knowledge, skills, or ability levels [18]; and (b) offering just-in-time instructional materials (e.g., online content or AISs, if available) to students who have not previously had the opportunity to learn about a particular topic, or taking the student’s knowledge and exposure levels into account when making inferences. In general, caring assessments can keep track of information about the learner and the learning context to improve the student’s assessment experience without negatively affecting the technical properties of the assessment.
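
A minimal sketch of how these two use cases might be expressed as adaptation rules, assuming the hypothetical LearnerModel above and simple threshold-based triggers (the threshold values and action names are assumptions for illustration):

```python
def select_adaptation(model: LearnerModel) -> str:
    """Return an illustrative adaptation for the two use cases above."""
    # Use case (a): an unmotivated student receives a relevance message or an
    # alternative way to demonstrate knowledge.
    if model.motivation is not None and model.motivation < 0.3:
        return "show_relevance_message"      # e.g., explain how the data will be used
    # Use case (b): a student without prior opportunity to learn the topic
    # is offered just-in-time instructional material before the assessment task.
    if model.context.get("opportunity_to_learn") == "low":
        return "offer_instructional_material"
    return "continue_standard_assessment"
```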

Several challenges to the realization of this vision include gathering information about the learner and the learning context; improving communication channels among teachers, parents and students and assessment providers; and improving processes for collecting, maintaining, sharing and protecting learner model information (e.g., who controls what type of information?) [4].

An approach to identifying possible assessment design and student motivation issues, as well as the types of variables that should be monitored and kept as part of the learner model, involves exploring the presence of unexpected responses in assessments that require students to explain or defend their ideas via open-ended response items. We have explored unexpected responses in the context of conversation-based assessments [5]. In this approach, we reviewed existing log file data [6, 19] to identify unexpected responses, created a list of possible categories, designed possible solutions or ways to deal with these situations, and gathered feedback from expert teachers. The categories explored include the following:

  • Confused: The response makes little sense or lacks an explanation.

  • Frustrated: The student seems annoyed by his or her interaction with the characters (e.g., “I already said that”).

  • Repetitive: The student repeats the same answers to slightly different questions across multiple turns of the conversation.

  • Unmotivated: The student does not care to answer or provides short answers due to lack of motivation (e.g., “no,” “idk,” “sure,” “yep”).

  • Irrelevant: The student does not answer the right question, or the answer is off topic.

  • Asking for help: The student asks characters for additional information (e.g., please repeat the question or show additional materials).

  • Gaming the system: The student tries to test the capabilities of the system (e.g., by entering profanity).

  • Attempting to communicate: The student does not answer the question (e.g., “I answered a similar question before”).

  • Using different languages: The student answers the question in a language other than English.

  • Giving up: The student expresses a wish to quit.

Results of this work suggest that detectors can be developed to identify many of these response types. Approaches for handling them can inform the development of caring assessment systems that take in a variety of inputs from the user and respond appropriately. How the system reacts depends on several aspects, including the purpose of the assessment, information about the learning context, the resources available, and learner characteristics.
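
As an illustration of what such a detector might look like, the following sketch flags a few of the categories above using simple keyword and repetition heuristics; a deployed detector would more likely rely on trained natural language processing classifiers, and all patterns shown here are assumptions rather than the rules used in the study.

```python
import re

# Illustrative keyword rules for a few unexpected-response categories (assumed patterns).
RULES = {
    "unmotivated": re.compile(r"^(no|idk|i don'?t know|sure|yep|ok)\.?$", re.I),
    "asking_for_help": re.compile(r"(repeat the question|show .*materials|help)", re.I),
    "giving_up": re.compile(r"(i (want to )?quit|i give up)", re.I),
}

def classify_response(text: str, previous: list[str]) -> str:
    """Return a coarse category for a student turn, or 'expected' if no rule matches."""
    cleaned = text.strip().lower()
    for label, pattern in RULES.items():
        if pattern.search(cleaned):
            return label
    # Repetitive: essentially the same answer as an earlier turn in the conversation
    if cleaned in (p.strip().lower() for p in previous):
        return "repetitive"
    return "expected"
```

For example, under these assumed rules, classify_response("idk", []) would return "unmotivated", while a verbatim repeat of an earlier turn would return "repetitive".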

The proposed unexpected response categories suggest that students’ emotional states should be incorporated as part of the learner model, which has been done in AISs (see [20] for a review). The following section presents work on identifying the types of emotions students experience as part of an assessment.

3 Exploring Emotions in Caring Assessments

The impact of a variety of emotions on learning outcomes in AISs has been investigated for over 20 years (see [21] for a review), and during the last ten years this work has focused on how student emotions can be augmented during learning to improve outcomes [20]. However, research on the role of emotions during assessments has been more limited. Originally, the focus of this area of research was limited to the impact of test anxiety [22], but a wider range of emotions has since been identified as impacting assessment outcomes [23]. Research on how to augment student emotions during assessments has been very limited (see [16] for an exception).

Recent research has established that efforts to provide emotion-sensitive support during assessments should move beyond targeting only test anxiety [7, 23], but exactly which states should be targeted is still an open question. For example, research on achievement emotions [24] and on learning-centered emotions [21] promotes the identification and tracking of individual emotions, and both have been explored in assessments, but the two sets of emotions differ. Recent research has also suggested that emotion intensity should be considered along with the nature of the individual emotion detected [7]. It is also possible that identifying individual emotions is not necessary for adaptation and that, to provide effective interventions, assessment systems simply need to track states that deviate from an engaged learning experience [16].

Despite the difficulties of determining what states to detect and the technological difficulties of detecting those states, there is evidence that emotion-sensitive support could be helpful during the completion of assessments [7, 16, 17]. However, this leads to the next open question of how best to provide support during assessments. Currently, the only emotion-sensitive support that is adaptively deployed during assessments occurs in effort-monitoring computer-based tests [16]. This support is intended to increase test-taking motivation by providing on-screen messages that highlight the importance of the low-stakes test to either the student or their institution. These on-screen messages have proven effective in terms of reducing unmotivated responses and increasing test reliability. Additional research is needed to understand how these messages can be designed to further facilitate emotion regulation.
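
The sketch below illustrates the general idea behind such effort-monitoring triggers, assuming a fixed rapid-response threshold over the most recent items; the threshold value and the message wording are assumptions for illustration, not the rules used in the cited work.

```python
RAPID_GUESS_THRESHOLD_SECONDS = 3.0   # assumed cutoff; operational systems calibrate per item

def effort_monitor(response_times: list[float], window: int = 3) -> str | None:
    """Return an on-screen motivational message if the last `window` responses look rushed."""
    recent = response_times[-window:]
    if len(recent) == window and all(t < RAPID_GUESS_THRESHOLD_SECONDS for t in recent):
        return ("Your answers matter: results from this test help your school "
                "improve its courses. Please take your time on each question.")
    return None   # no intervention when responses show adequate effort
```

In practice, the cutoff would typically be derived from item-level response-time distributions rather than a single fixed constant.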

Emotion-sensitive support during assessments could potentially go beyond on-screen messages. Alterations to the assessment task (e.g., framing, format) are another potential method to facilitate emotion regulation. Task framing modifications could be used to increase engagement by selecting the task context based on student interests (e.g., basketball game vs. painting) [25]. Tasks could also be modified by adaptively presenting them in different formats (e.g., game-based assessment, multiple-choice items) to provide each student with the best opportunity to show their knowledge, skills, and abilities [17].

Research on emotion-sensitive support during assessments is still in its infancy, and thus there are many potential issues that must be addressed before this type of adaptivity can be incorporated into assessment systems. As mentioned previously, reliable emotion detection is still an ongoing area of research [26]. It is critical that assessment systems be confident in the emotion or state that has been detected before deploying support; otherwise, the support deployed could be distracting or negatively impact engagement with the assessment. Another issue related to the effectiveness of emotional interventions is the degree to which they should be tailored to the individual student [27]. To deploy the most effective interventions, it may be necessary to also consider student characteristics, as described in the next section. Lastly, while the deployment of emotion-sensitive support during assessments has the goal of creating a more equitable opportunity for all students and a more valid assessment outcome, research is needed to ensure there are no unintended negative side effects on the fairness and validity of the assessment. Despite the need for further research, the incorporation of student emotions into learner models for assessments holds great promise for allowing assessments to care.
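
One simple way to guard against acting on uncertain detections is to require a minimum classifier confidence before any support is deployed, as in the hypothetical gating function below (the threshold and action names are assumptions).

```python
def maybe_deploy_support(emotion: str, confidence: float, threshold: float = 0.8):
    """Deploy emotion-sensitive support only when the detector is sufficiently confident."""
    if confidence < threshold:
        return None   # do nothing rather than risk a distracting, mis-targeted intervention
    return {"frustrated": "offer_hint_or_break",
            "bored": "increase_task_relevance"}.get(emotion)
```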

4 Leveraging Individual Differences in Caring Assessments

In addition to their attention to momentary affective states, caring assessments have the potential to consider a broader set of student-level characteristics as learner model variables. Metacognitive and motivational characteristics have been considered part of the space of student-level variables that are routinely tracked and used for decision-making and adaptation in intelligent tutoring systems [28, 29]. These include variables such as learner effort or degree of persistence, self-reported confidence, and help seeking, which may be mediated by the domain or the task design, as well as relatively stable characteristics such as the Big Five personality constructs of neuroticism and openness to experience. A system that tracks these learner characteristics has the potential to deliver more fine-grained, precise adaptations to tasks and situations that take these characteristics into account [29].

Attention to student-level characteristics during assessments has often been considered from a summative perspective; consider research examining student-level demographic or contextual variables that meaningfully relate to assessment performance, exemplified by reports using data from the National Assessment of Educational Progress (NAEP; [30, 31]). Based on these analyses, we know that student demographics, home environment, and exposure to high-quality instructional practices are key factors that contribute to student performance on achievement tests [31]. Thus, for example, a caring assessment could take into account whether students have been exposed to relevant instructional strategies (e.g., doing hands-on activities, working in small groups) and provide additional instructional materials or alternative resources for the student to engage with before they tackle the expected challenge of the assessment (see Fig. 1).

Beyond these variables for which there is empirical evidence demonstrating links to achievement test scores, there has been increasing attention to other student-level characteristics that can impact academic outcomes, but which have been little studied in the realm of assessment – characteristics such as grit [32], growth mindset [33], and self-efficacy beliefs [34, 35], to name a few. Research with middle school students has found that these student-level individual differences (in addition to cognitive flexibility) as measured by student self-report significantly predicted performance on an interactive conversation-based assessment of science inquiry skills [9]. Thus, a caring assessment could incorporate such “non-cognitive” characteristics as elements within a learner model that could be used to adapt an assessment task. For example, for students identified as having a fixed mindset, messages could be implemented at the beginning of the assessment to encourage a growth mindset prior to the task (see [36] for a review of potential interventions).

In addition to these individualized interventions based on levels of particular variables, multiple measures could be meaningfully combined into user profiles, which could streamline interventions for subgroups of students. Using such student-level variables as inputs to a hierarchical cluster analysis, we were able to identify four distinct subgroups of students with similar profiles [37]; these subgroup designations could provide a means to target certain student subgroups with a particular intervention, without the need to develop fully individualized interventions tuned to all the known characteristics of a specific individual. We are currently examining whether these same profiles or patterns are apparent in the context of an interactive conversation-based assessment of mathematical argumentation skills. As with the affective interventions described previously, additional research is required to investigate the impact of proposed interventions, whether targeted toward specific characteristics or combinations thereof, in order to determine whether the desired impact on performance is obtained, and whether unintended consequences can be minimized. Ultimately, the aim is to detect (combinations of) student characteristics, and to deliver to each student a tailored assessment version that maximizes engagement and opportunity to perform to the best of their ability [4, 10, 27].
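
For readers interested in the clustering step, the sketch below shows a generic hierarchical cluster analysis over standardized self-report measures using scikit-learn; the four-cluster solution mirrors the number of subgroups reported above, but the data values, measure set, and linkage choice are assumptions for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import AgglomerativeClustering

# Hypothetical matrix: one row per student, one column per self-report measure
# (e.g., grit, growth mindset, self-efficacy, cognitive flexibility).
X = np.array([
    [3.2, 4.1, 3.8, 2.9],
    [2.1, 2.5, 2.2, 3.0],
    [4.5, 4.2, 4.6, 4.1],
    [1.8, 2.0, 2.4, 2.2],
    [3.9, 3.5, 3.1, 3.8],
])

# Standardize so each measure contributes comparably to the distance metric.
X_std = StandardScaler().fit_transform(X)

# Hierarchical clustering into four profiles; Ward linkage is an assumed choice here.
profiles = AgglomerativeClustering(n_clusters=4, linkage="ward").fit_predict(X_std)
print(profiles)   # cluster label (0-3) assigned to each student
```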

5 Discussion

In this section we discuss how our work on caring assessments informs learner modeling research and future work.

  • Learner model variables. Cognitive, social, and emotional aspects of the learner can influence levels of engagement and performance on assessments. Regarding the types of variables that should form the learner model, research shows that malleable variables that, when supported, have the potential to improve student learning (e.g., cognitive abilities, metacognitive skills, affective states) may be good candidates for inclusion [1, 9]. However, additional information about learners and the learning context can prove valuable for dealing with the cold-start problem and making appropriate instructional recommendations [38].

  • Types of adaptations. The types of adaptations that are appropriate for assessment systems include making changes to the graphical interface for accessibility purposes [12], providing adaptive feedback and sequencing of tasks, engaging students in conversations with artificial agents, perhaps with different characteristics or knowledge levels [8, 9], granting additional time and opportunities to make revisions, and recommending additional activities or learning materials. Some types of adaptations may be more appropriate than others depending on the purpose of the assessment. Adaptations should take into account the fairness, validity, and reliability of the assessment [27].

  • Making recommendations based on cognitive and motivational aspects of the learner model. AISs can make recommendations based on both cognitive and motivational aspects of the learner. For example, additional activities or support messages can be triggered based on motivation and knowledge levels (e.g., assigning additional tasks to motivated students and providing unmotivated students with supporting messages). By monitoring both cognitive and emotional aspects of the learner it is possible to keep students engaged for longer periods of time.

  • Reporting systems and open learner models. Learner model information can be used to support students’, teachers’, and administrators’ decision making [11, 39, 40]. Work on designing and evaluating reporting systems for innovative assessments highlights the importance of (1) following an iterative, audience-centered approach, (2) using student response and process data to support stakeholders’ decisions, and (3) evaluating both comprehension and preference aspects of the visual representations [39, 40]. Also, principles for effective visualizations from areas such as cognitive science and information visualization can inform the development of effective reporting systems and open learner modeling interfaces [41].

  • Supporting student access to AISs. Accessibility features [12], as well as strategies to support student interaction with AISs, can be informed by the information maintained in the learner model. Adaptations to the delivery system (e.g., additional supporting messages) as well as to administration conditions (e.g., providing additional time) can be put in place to support cognitive bandwidth recovery strategies, so students are in a better position to demonstrate what they know or can do [13].

  • Data privacy and data security. Learner model information should be protected, and users should have control over what information is being kept and how it is used [42]. Appropriate mechanisms should be in place to ensure the safety of learner model information.

6 Future Work

Future work in this area includes continued exploration of the potential of caring assessments to improve the current state of assessment. This work involves the design, implementation, and evaluation of adaptive features in digital assessments. Results will inform the development of both new caring assessments and AISs.