1 Introduction

Teachers in primary and secondary schools routinely have to deal with students’ problem behaviors. For decades, student problem behavior has been a research topic, with the aim of helping students overcome undesirable conduct and actions (Jessor 2016). Such problems cause concern in schools and require help and guidance from teachers. Today, for example, Internet addiction and school bullying can be regarded as typical problem behaviors (Şaşmaz et al. 2014; Dake et al. 2003). These behaviors are harmful both to students’ own learning and development and to the school community. In practice, many teachers have accumulated rich experience in teaching subjects (e.g., math or biology), but they often lack experience in identifying and diagnosing student problem behaviors. Some teachers seek help by reading books, searching online at random, or asking peers about their experiences. However, such methods may not be very effective and can easily be distorted by subjective and biased experiences. In addition, diagnosis requires collecting information about the student along multiple dimensions, for which questionnaire surveys, interviews, and literature analysis might also be needed. Hence, tackling students’ problem behaviors in real situations remains both critical and challenging for teachers.

In this chapter, we present how artificial intelligence (AI) technologies can be employed to help teachers diagnose students’ problem behaviors. Specifically, task-oriented dialogue system technology is used to develop an AI-powered assistant for problem behavior diagnosis. Task-oriented dialogue systems have been widely adopted in many other fields, including ticket booking (Li et al. 2017), restaurant search (Wen et al. 2016), and online shopping (Yan et al. 2017). Dialogue systems have also been used for automatic disease diagnosis in the medical field. Through multi-turn dialogue, such a system can acquire symptoms from patients and automatically diagnose their diseases, which greatly improves the accessibility of medical services (Wei et al. 2018; Peng et al. 2018; Kao et al. 2018).

Inspired by the wide use of task-oriented dialogue systems in other fields, we design and develop a task-oriented dialogue system that automatically identifies students’ need deficiencies, with the goal of helping teachers handle student problem behaviors. Maslow (1943) states that people’s behaviors are driven by their psychological needs; problem behaviors are thus often caused by unfulfilled psychological needs, which are termed need deficiencies. Students’ problem behaviors can therefore be handled by identifying their need deficiencies (Harper et al. 2003), diagnosing the underlying reasons in a timely manner, and conducting the necessary interventions. Specifically, the system design is based on a theoretical framework that summarizes the relevant psychological findings on student need deficiency, and it uses natural language processing techniques to enable natural communication between teachers and the system.

The rest of this chapter is organized as follows. Section 2 describes the theoretical framework for the proposed teacher assistant, followed by the system design presented in Sect. 3. Finally, Sect. 4 discusses the impact of the proposed AI-powered teacher assistant and concludes this chapter.

2 Theoretical Framework for System Design

Many studies have analyzed the causes underlying students’ problem behaviors. According to the classical theory of Maslow (1943), people’s behaviors are driven by psychological needs, which implies that need deficiencies are the reasons for problem behaviors. Jessor (2014) finds that students’ behaviors are influenced by the interactions between their personality systems and their perceived environment systems. Harper and Stone (2003) show that students’ psychological needs can be affected by factors such as natural disasters, violence, abuse, poverty, lack of school and community resources, and emotional deprivation. Dennis et al. (2005) find that the interaction between individual characteristics and environmental factors influences student development. These research findings are informative and useful but too scattered for systematic application. Hence, a theoretical framework summarizing the relevant factors is necessary, and the designed system explicitly considers different classes of need deficiencies, problem behaviors, external environmental factors, and individual factors.

2.1 Need Deficiency

According to Maslow’s theory (Maslow 1943), students’ problem behaviors are driven by unmet psychological needs. Hence, we define and classify students’ need deficiencies into five categories: physiological needs, safety needs, belongingness and love needs, esteem needs, and cognitive needs. In our framework, we replace the self-actualization need in Maslow’s original hierarchy of needs with the cognitive need. The self-actualization need mainly denotes fusing goodness and beauty, which is often pursued in the later stages of life and is not appropriate for K-12 students. The classification of student basic needs is summarized in Table 1.

Table 1 Classification of student basic needs

2.2 Problem Behavior

To identify students’ problem behaviors, we applied Achenbach and Rescorla’s (2014) Child Behavior Checklist (CBCL). It can be used to analyze the behavioral and emotional problems of children between 1.5 and 18 years old. It uses empirical, multiaxis, and cross-assessor measurement methods to identify students’ problem behaviors. Specifically, three types of forms were designed: the Teacher Report Form, the Youth Self-Reports, and the Direct Forms. The reliability and validity of these forms have been verified through a series of cross-cultural studies. Our framework categorizes problem behaviors with slight modifications learned from real-life case analysis.

In our study, problem behaviors are classified into three categories: externalization problems, internalization problems, and other problems. Externalization problems denote the “externalization syndrome” of behaviors and mainly refer to social adaptation problems, including attack, bullying, sabotage, and so on. They are further divided into aggressive behaviors and rule-breaking behaviors. Internalization problems denote the “internalization syndrome” of behaviors and refer to emotional distress problems or nonsocial behavioral problems, including anxiety, depression, and so on. They are further divided into social withdrawal, depression, and anxiety. Problems that do not belong to these two categories are defined as “other problems,” which include learning problems, egocentricity, and special problems. The classification of student problem behavior is given in Table 2.

Table 2 Classification of student problem behavior
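The two-level classification described in Sect. 2.2 could be stored as a simple lookup structure. The following is only a minimal sketch: the dictionary name, the key spellings, and the helper `category_of` are hypothetical and are not taken from the actual system.

```python
# Hypothetical encoding of the problem behavior classification from Sect. 2.2.
# Category and subtype names follow the text; the structure itself is assumed.
PROBLEM_BEHAVIOR_TAXONOMY = {
    "externalization": ["aggressive behaviors", "rule-breaking behaviors"],
    "internalization": ["social withdrawal", "depression", "anxiety"],
    "other": ["learning problems", "egocentricity", "special problems"],
}


def category_of(subtype: str) -> str:
    """Return the top-level category that contains the given subtype."""
    for category, subtypes in PROBLEM_BEHAVIOR_TAXONOMY.items():
        if subtype in subtypes:
            return category
    raise KeyError(f"unknown problem behavior subtype: {subtype}")
```

A flat mapping of this kind is enough for looking up which category a reported behavior belongs to; a richer representation may be needed if behaviors can fall under more than one subtype.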

2.3 External Environmental Factors

External environmental factors mainly refer to factors that affect students’ growth and therefore significantly affect the formation of problem behaviors. Various studies have explored how different factors affect students’ problem behaviors. For example, Hoffmann (2006) finds that changes in parents’ marital status increase the probability of adolescents engaging in problem behaviors. Fomby and Christie (2013) find that living in unstable families can lead to more aggressive and antisocial behaviors in adolescents. Pinquart (2017) shows that students whose parents adopt authoritarian, permissive, or neglectful parenting styles have a high probability of externalizing problems. Maryam et al. (2019) show that students who are rejected by peer groups tend to develop more internalizing problems.

Based on these findings, we summarized and classified the external environmental factors into three main categories: family factors, school factors, and society factors. To allow a comprehensive and in-depth exploration, the family factors affecting problem behavior are further divided into family structure, parenting style, education background, health condition, delinquent behavior, and socioeconomic status. The school factors are further divided into teacher leadership style, peer acceptance, and peer influence. According to social learning theory, the society factors are further divided into social media and cultural customs. The classification of the external environmental factors is summarized in Table 3.

Table 3 Classification of external environment factors

2.4 Individual Factors

Problem behaviors are also influenced by the physical and psychological factors of the individual. Ehrler et al. (1999) find that the personality characteristics of individuals are significantly correlated with students’ problem behaviors, and Van et al. (2013) show that students with extreme scores on the Big Five personality traits (Five-Factor Model, FFM) are prone to problem behaviors. The five factors are neuroticism, extroversion, openness, agreeableness, and conscientiousness. Hence, we describe students’ personalities with the Five-Factor Model of Personality (McCrae and Costa 1991). In addition, we consider basic information and demographic variables related to student problem behavior, including grade, gender, health condition, and social group. The individual factors are listed in Table 4. Note that in practice, not all of the factors need to be collected from the students.

Table 4 Classification of individual factors

3 System Design

Our dialogue support system consists of three main modules: the diagnosis module, the question answering module, and the case search module. We elaborate on each of them in this section.

3.1 Diagnosis Module

This module adopts task-oriented dialogue system technology to conduct diagnosis. A task-oriented dialogue system is designed to complete a specific task through natural language interaction with users (Gao et al. 2019). Various dialogue systems have been designed for different tasks in the literature. Some systems are designed for booking tasks. For example, Li et al. (2017) developed a dialogue system for movie-ticket booking, and Wen et al. (2016) built a dialogue system to help users search for and reserve restaurants. Dialogue systems can also solve information-searching tasks. For instance, Papangelis et al. (2018) designed a spoken dialogue system to help users make informed decisions through information navigation. Another group of tasks is the automatic diagnosis of medical disease. Tang et al. (2016) designed a group of anatomical models emulating different experts in hospitals to diagnose diseases. We have also done some preliminary studies on employing dialogue systems to analyze the causes underlying students’ problem behaviors (Chen et al. 2020; Chen et al. 2021). Through such dialogue systems, service accessibility can be significantly improved.

To conduct diagnosis, this module acquires the necessary information about a specific student through multi-turn dialogue with the teacher, and then automatically diagnoses the need deficiencies behind the student’s problem behaviors. The diagnosis process considers both external environmental factors and individual factors. As shown in Fig. 1, the module consists of four main functional components: natural language understanding, dialogue state tracking, dialogue policy learning, and natural language generation.

Fig. 1 Diagnosis module for analyzing student problem behavior

The natural language understanding component interprets the teacher’s utterance to extract the intent as well as task-related semantic information. Specifically, it processes a teacher’s reply to extract information about the student, such as whether the student shows aggressive behaviors. In this teacher assistant, a long short-term memory (LSTM) network (Hochreiter and Schmidhuber 1997) is adopted to interpret the teacher’s utterances. An LSTM network is a typical recurrent neural network that has been widely used in natural language processing. Relying on a gating mechanism, it mitigates the long-term dependency problem in sequential data processing.
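For reference, the gating mechanism can be written out explicitly. The equations below give the commonly used LSTM formulation with a forget gate, which may differ in detail from the network used in the system. The notation is assumed here: x_t is the input at step t, h_t the hidden state, c_t the cell state, W, U, and b are learned parameters, σ is the logistic sigmoid, and ⊙ is element-wise multiplication.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

Because the cell state c_t is updated additively, with the forget gate f_t controlling how much of c_{t-1} is retained, information from early words in an utterance can be carried across many steps.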

The dialogue state tracking component tracks the dialogue state, which represents all of the task-related information captured so far. This dialogue state holds the student information acquired up to that point and is used to determine the next system action. Specifically, this component updates the dialogue state with another LSTM network based on the output of the natural language understanding component.
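To make the notion of a dialogue state concrete, the sketch below represents it as a set of slots that is merged with the output of the natural language understanding component at every turn. This is a simplified, rule-based stand-in for the LSTM-based tracker described above, and the class name, slot names, and methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueState:
    """Task-related information gathered so far (slot names are hypothetical)."""

    slots: dict = field(default_factory=dict)  # e.g. {"aggressive_behavior": True}
    turn: int = 0

    def update(self, nlu_output: dict) -> "DialogueState":
        """Merge the slots extracted from the latest teacher utterance."""
        merged = {**self.slots, **nlu_output}
        return DialogueState(slots=merged, turn=self.turn + 1)

    def missing(self, required: list) -> list:
        """Slots that the dialogue policy could still ask the teacher about."""
        return [name for name in required if name not in self.slots]


# Two turns of a made-up dialogue.
state = DialogueState()
state = state.update({"aggressive_behavior": True})
state = state.update({"family_structure": "single-parent"})
```

A learned tracker would instead encode the dialogue history into a vector, but the role is the same: summarizing what is already known so that the next system action can be chosen.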

The dialogue policy learning component decides on the next system action, such as requesting information or informing certain results, based on the current dialogue state. We adopt a reinforcement learning model, specifically a deep Q-learning network (DQN) model (Mnih et al. 2015), to learn the dialogue policy that decides whether to request more information from the teacher or to present the derived need deficiency to the teacher. As one of the three main paradigms of machine learning, reinforcement learning targets sequential decision-making problems. Recently, deep learning techniques have been integrated into reinforcement learning models to improve model performance. The DQN is a typical deep reinforcement learning model that uses a deep neural network to estimate the Q-value. Finally, the natural language generation component uses a template-based model to transform the system action into a text response.
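The learning rule behind this policy can be illustrated with tabular Q-learning, where a lookup table takes the place of the deep neural network that a DQN uses to estimate Q-values. The sketch below is not the system's implementation: the action names, the reward values in the usage note, and the hyperparameters `alpha`, `gamma`, and `epsilon` are assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical action set: ask the teacher for more information, or present a diagnosis.
ACTIONS = ["request_more_info", "inform_need_deficiency"]


def q_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.9):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


def choose_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


# Q-table with a default value of 0.0 for unseen (state, action) pairs.
q_table = defaultdict(float)
```

For instance, calling `q_update(q_table, "s0", "inform_need_deficiency", 1.0, "s1", True)` after a correct diagnosis raises the value of informing in state `"s0"`, so that `choose_action(q_table, "s0", epsilon=0.0)` then prefers it. In a DQN the same target is used as the regression target for the network rather than as a table update.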

Figure 2 shows a toy example of how the module acquires student information through a multi-turn dialogue and diagnoses the need deficiency. In short, through multi-turn dialogue interaction, the module can effectively acquire the student’s information, automatically analyze the need deficiencies, and adaptively generate advice for teachers.

Fig. 2 A toy example of how diagnosis module works
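The template-based natural language generation step of the diagnosis module can be sketched as a lookup from system action to a sentence pattern. The templates, action types, and argument names below are invented for illustration and are not the wording used by the actual system.

```python
# Hypothetical templates keyed by system action type.
TEMPLATES = {
    "request": "Could you tell me more about the student's {slot}?",
    "inform": "The problem behavior may stem from a deficiency in {need} needs.",
}


def generate_response(action: dict) -> str:
    """Fill the template for the action type with the action's arguments."""
    template = TEMPLATES[action["type"]]
    return template.format(**action.get("args", {}))
```

With these assumed templates, `generate_response({"type": "request", "args": {"slot": "family structure"}})` yields a question about the student's family structure. Templates keep the system's wording predictable, at the cost of less varied responses.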

3.2 Question Answering Module

Unlike the diagnosis module, which targets analyzing the problem behaviors of a specific student, this module aims to provide general guidelines on typical problem behaviors by answering questions like “What are the typical problem behaviors for high school girls?” Community question answering (CQA) technology is employed to answer such questions. CQA is a web-based service that helps people seek information by answering their questions based on knowledge shared by others in the community (Srba and Bielikova 2016). Quora and Stack Overflow are two typical examples of CQA systems. The main idea of CQA is to use knowledge shared by domain experts in community discussions, and a CQA system is usually built on data collected from professional online forums and platforms. Our CQA system is built with the historical questions and answers collected from a nationwide online platform in China (http://haolaoshi.bnu.edu.cn/).

A CQA system aims to pick out the most appropriate answer from the multiple answers to a given question, and typically includes two main tasks: finding similar questions and finding relevant answers (Joty et al. 2018). Traditional approaches focus on syntactic analysis of the text of questions and answers. For example, Cui et al. (2005) proposed a general tree-based method that calculates tree-edit distance to match questions and answers. Recently, with the development of deep learning, various deep neural network models have been proposed. For example, Zhou et al. (2018) propose a recurrent convolutional neural network (RCNN) to capture both the semantic matching between question and answer and the semantic correlations embedded in the sequence of answers. Hence, we are inspired to develop our CQA model with deep learning algorithms.

The structure of the designed CQA model is illustrated in Fig. 3. Specifically, the model works in two phases. The first is the question selection phase, which aims to find candidate questions similar to the incoming question. The second is the answer selection phase, which ranks the answers of the candidate questions found in the first phase and then selects the most appropriate answer as output.

Fig. 3 The CQA model used in question answering module

The first phase identifies, among the existing questions, the candidates similar to the incoming question. We use the pretrained BERT model (Devlin et al. 2018) to analyze the semantics of questions and answers. The model first learns the semantic vectors of the existing questions and stores them in a database. Whenever a new question arrives, the same BERT framework is used to learn its semantic vector. Subsequently, a multilayer perceptron (MLP) network is used to fine-tune the model to compute the similarity between the incoming question and each existing question, yielding a similarity value for each existing question. With a predefined similarity threshold, a set of similar questions is selected as candidates.
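The thresholding step of this first phase can be sketched as follows. In this illustration, cosine similarity stands in for the fine-tuned MLP scorer, short hand-written vectors stand in for BERT semantic vectors, and the function names and the default threshold of 0.8 are assumptions rather than values from the system.

```python
import math


def cosine(u, v):
    """Cosine similarity, used here as a stand-in for the fine-tuned MLP scorer."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_candidates(query_vec, question_db, threshold=0.8):
    """Return (question_id, similarity) pairs whose similarity reaches the threshold."""
    scored = [(qid, cosine(query_vec, vec)) for qid, vec in question_db.items()]
    return [(qid, sim) for qid, sim in scored if sim >= threshold]
```

Precomputing the vectors of the existing questions, as the text describes, means that only the incoming question has to be encoded at query time; the scoring loop then runs over stored vectors.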

The second phase then identifies the most appropriate answer. First, a set of candidate answers is generated by taking the best answer of each candidate question from the first phase. Second, the semantic vector of each candidate answer is learned using the BERT framework, as in the first phase. Third, by concatenating the question vector and the answer vector, an MLP network is employed to fine-tune the model to compute the matching level between a question and an answer. Finally, the candidate answers are ranked by the product of question similarity and answer matching level, and the one with the largest value is chosen as the final output.
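The final ranking step combines the two scores by multiplication. The sketch below assumes that both scores have already been computed and are passed in as plain numbers; the dictionary keys and function names are hypothetical.

```python
def rank_answers(candidates):
    """Sort candidates by question similarity times answer matching level, best first.

    Each candidate is a dict with the hypothetical keys
    "answer", "question_similarity", and "answer_match".
    """
    return sorted(
        candidates,
        key=lambda c: c["question_similarity"] * c["answer_match"],
        reverse=True,
    )


def best_answer(candidates):
    """Return the text of the top-ranked answer, or None if there are no candidates."""
    ranked = rank_answers(candidates)
    return ranked[0]["answer"] if ranked else None
```

Multiplying the scores means that an answer is ranked highly only if its question is similar to the incoming one and the answer itself matches well; a high value on one score cannot fully compensate for a low value on the other.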

3.3 Case Search Module

This module is an independent service that helps teachers search for similar cases containing successful experiences in diagnosing and intervening in students’ problem behaviors. The search is mainly based on the teacher’s text description of the student’s problem behaviors, and similarity is measured across the various aspects of problem behavior shared by a case and the teacher’s description. Compared with the simple answers given by the question answering module, the returned cases contain more details, including not only the student’s specific behaviors but also other relevant information such as personal particulars and family background. More importantly, the cases also contain experts’ analysis of the student’s behavior and the reasons behind it, as well as the different educational strategies and interventions applied. All these details provide fine-grained guidelines and advice for teachers handling similar problem behaviors.

This module is developed with the technology of information retrieval. As a typical natural language processing task, information retrieval aims to find the closely related information according to user requirements. It explores how to represent, store, organize, and access information properly for information searching (Chowdhury 2010).

Various models have been proposed to conduct information retrieval. This module uses a deep natural language processing model to compute the similarity between the teacher’s text description and the case documents. Unlike the semantic similarity calculation in the question answering module, which computes the similarity between two sentences, this case search engine computes the similarity between two documents, each in the form of a sequence of sentences. As illustrated in Fig. 4, a hierarchical BERT model is designed and implemented to compute the semantic similarity between the teacher’s text description and each case document.

Fig. 4 The hierarchical BERT model used in case search module

In this model, the bottom layer learns the semantic vector of each sentence in the teacher’s text description and in the case documents. Specifically, the parameters of the pretrained BERT model are adopted directly for this bottom-layer BERT. The top layer learns the semantic similarity between the teacher’s text description and each case document. Taking the sentence vectors generated by the bottom BERT layer as input, we add the special token “[CLS]” at the beginning and “[SEP]” in the middle to concatenate the two sequences into one. The model can then process it like a normal sequence and generate a semantic similarity vector at the beginning position. From this vector, an MLP network computes the similarity between the teacher’s text description and the case document. As in the question answering module, all cases are ranked according to the computed semantic similarity and then returned to the teacher.
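The way the two sentence-vector sequences are joined for the top layer, and the final ranking of cases, can be sketched as below. String placeholders are used for “[CLS]”, “[SEP]”, and the sentence vectors; in a real model these would be embedding vectors, and the similarity values would come from the MLP. The function names are hypothetical.

```python
CLS, SEP = "[CLS]", "[SEP]"


def build_top_layer_input(description_vectors, case_vectors):
    """Concatenate two sequences of sentence vectors, with [CLS] first and [SEP] between."""
    return [CLS, *description_vectors, SEP, *case_vectors]


def rank_cases(similarities):
    """Order case ids by their computed similarity, highest first."""
    return sorted(similarities, key=similarities.get, reverse=True)
```

Under this layout, position 0 of the top-layer output corresponds to “[CLS]”, which is where the text says the semantic similarity vector is read off before it is passed to the MLP.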

4 Discussion and Conclusion

A main idea behind current AI algorithms is to combine the data-driven paradigm with the knowledge-driven paradigm. The development of the AI-powered teacher assistant can be regarded as an attempt to use both paradigms to solve a practical problem in education. Following the knowledge-driven paradigm, the principles and theories from psychological studies are employed to build the theoretical framework, which guides the machine to address student behavior problems in a theoretically grounded manner. By leveraging the data-driven paradigm, the rich and precious teacher experiences embedded in text data can be extracted and used. The integration of these two paradigms provides the solution and aims to ensure the reliability and validity of the developed teacher assistant for student problem behaviors. Specifically, the system can analyze the need deficiencies behind students’ problem behaviors and identify the corresponding external environmental and individual factors that result in the deficiencies. It also helps teachers find answers or similar resolved cases for many typical student problem behaviors. By taking these answers and cases as references, teachers can learn how to help their students. The system interacts with teachers through natural language, which greatly improves its usability as well.

On the other hand, we also note that deploying such an intelligent agent in schools may raise certain concerns. People may worry whether it is ethical to use machines to analyze and even regulate students. In practice, the developed assistant is used as a supporting tool that offers advice and suggestions to teachers, rather than applying educational interventions directly to students. Another possible concern relates to data privacy, namely the risk that students’ information will be leaked and abused. The developed assistant is designed with built-in privacy protection: it does not store any sensitive student data after use. In addition, the current version of the teacher assistant may misinterpret teachers’ descriptions, which would result in a wrong diagnosis and inappropriate advice. We plan to employ explainable AI (xAI) techniques to show teachers how the assistant arrives at its advice and how confident it is in that advice. Teachers could then make their own decisions on whether to adopt the advice. Driven by the advancements of AI, especially in natural language processing and machine learning techniques, we believe the teacher assistant could eventually tackle such issues and benefit both teachers and students in schools.