1 Introduction

Before the outbreak of COVID-19, many online psychotherapeutic applications already existed, and early results suggested that they were broadly comparable to offline therapy. They are also convenient, since patients can use them at any time, and the privacy they offer makes more users willing to participate actively. Nevertheless, clinicians have been relatively slow to adopt these tools on a large scale. With the outbreak of COVID-19, medical departments around the world came under tremendous pressure for medical consultations. COVID-19 damages not only the physical health of patients but also the mental health of others affected by the pandemic [1]. Besides patients and the elderly, many young people and even children suffer from fear, sadness, depression, and psychological trauma. Because COVID-19 has led to quarantines and lockdowns in many places, people cannot meet family and friends, which further increases the risk of psychological trauma. People who were previously healthy may develop a mental illness without realizing that what they are experiencing is a disease and not just an emotion.

These phenomena have led to a huge demand for online psychiatric outpatient systems, whose role is to relieve the pressure on the outpatient clinics of medical institutions and to provide contactless medical services. An online medical inquiry chatbot system based on artificial intelligence can provide online mental health consultations. Its key technology is a knowledge graph of the medical field: the system relies on entities in one or more fields and performs reasoning or deduction over the graph to answer the user’s question.
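As a minimal sketch of how a knowledge graph can support such question answering, the example below stores a few triples and ranks conditions by the number of reported symptoms linked to them. The triples, the relation name `has_symptom`, and the function names are hypothetical illustrations and are not taken from any real medical knowledge graph or from the system discussed here.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, relation, object); illustrative only.
TRIPLES = [
    ("anxiety", "has_symptom", "restlessness"),
    ("anxiety", "has_symptom", "insomnia"),
    ("depression", "has_symptom", "insomnia"),
    ("depression", "has_symptom", "low mood"),
]


def build_index(triples):
    """Index triples by (relation, object) so subjects can be looked up quickly."""
    index = defaultdict(set)
    for subj, rel, obj in triples:
        index[(rel, obj)].add(subj)
    return index


def candidate_conditions(index, symptoms):
    """Rank conditions by how many of the reported symptoms they are linked to."""
    counts = defaultdict(int)
    for symptom in symptoms:
        for condition in index.get(("has_symptom", symptom), ()):
            counts[condition] += 1
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


INDEX = build_index(TRIPLES)
ranking = candidate_conditions(INDEX, ["insomnia", "low mood"])
```

A real system would need entity linking from free text to graph nodes and far richer reasoning than this symptom count; the sketch only shows the lookup step.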

The impact of the pandemic on the mental health of children and adolescents, particularly depression and anxiety, was shown in [2]. The paper first reports that countries paid less attention to adolescents’ mental health during the pandemic, citing the reduction in hospital beds as an example. It also argues that the COVID-19 pandemic made adolescents’ abnormal behaviors harder to detect, because the recommended reduction in contacts and outdoor activities led to fewer appointments. The level of anxiety became harder to assess, and adolescents became anxious more readily. The proposed solution is to help patients with anxiety and depression online. Randomized controlled trials have shown internet-based care to be useful, which gives healthcare workers and patients’ parents a workable strategy. Online resources such as recorded courses, group treatments, and mental health apps give children direct access to instructions, which suits the current situation better than in-person appointments. Parents who are far away from their children can pay closer attention to them and report abnormalities to doctors, which helps with diagnosis. Finally, the paper suggests that healthcare counselors demonstrate altruism in front of their patients and stresses the importance of an optimistic mood during treatment.

Adolescents in these affected groups have to study online at home, so they have left the traditional teaching mode and cannot communicate face to face with teachers and classmates. This has further increased the pressure young people feel about studying. In addition, because the time they spend living together with their families has increased, the relationship between some teenagers and their families has become more tense, which has made teenagers’ psychological anxiety an increasingly serious problem. At present, hospitals commonly evaluate mental illness with scales, that is, by assessing patients through questionnaires. This method may be flawed when applied to young people because, compared with adults, they may be more rebellious. When they are unwilling to undergo psychological tests, they may falsify answers, or they may know from experience how to get high scores, and so avoid being judged as mentally ill.

This paper proposes a chatbot method for the diagnosis of adolescent psychological anxiety. The chatbot model is based on a multi-modal seq2seq model, which is used to analyze multi-modal interaction data, such as text and images, produced while teenagers use the chatbot. Experiments show that this structure reaches 71% training accuracy and 63% test accuracy on the existing multi-modal dataset. Preliminary tests with real users aged 15–18 show that its judgments of psychological anxiety agree with the users’ own assessments.

2 Chatbot for Teenager’s Depression

A study showed that the physical environment of the home is proximal to adolescents and affects the growth of the prefrontal cortex (PFC), with effects that extend well into children’s lives. Socioeconomic status (SES) may be correlated with the physical environment of families. The authors’ group [3] hypothesized that a less-resourced environment leads to a thinner PFC. The group conducted in-home interviews with participants and scored items in their houses to obtain environmental scores (the PHYS test). The factors assessed were hazards, space, noise, cleanliness, and the interior and exterior environment. Brain scan images were also collected at UCLA. To control for differences among participants, the group rated the basic nurturance and stimulation of each child on a scale from 0 to 10 (the SHIF test). All scores ranged from 7 to 10, which provided control of developmental contexts. Ethnicity, educational context, gender, and age were also recorded. The WRAT cognitive test measured participants’ reading, comprehension, and math computation skills, and the results were reported as reading scores and mathematical computation scores. To test the relationship between SES and the physical environment, the group asked families to report their total income and household size. Participants were divided into five groups according to depth of poverty, and income-to-needs ratios (INR) were used to report their economic status. The group then compared MRI surface area maps of the participants with standardized maps to obtain an effect size (standard deviation difference), from which conclusions about PFC thickness were drawn. Finally, mediation analysis of the PHYS, SHIF, and WRAT scores and the INR was used to test their relationships. The comparison showed that adolescents whose parents had higher incomes tended to have a better physical environment at home and higher cognitive skills in math.
PHYS and SHIF scores were directly proportional to the thickness of the left lateral occipital gyrus, and the WRAT score was positively associated with the thickness of the left frontal gyrus. A mediation analysis of whether PHYS predicts WRAT reading scores identified the left superior frontal gyrus as the area associated with both PHYS and WRAT reading scores. To sum up, the group concluded that the physical home environment determined adolescents’ reading achievement, and that the thicknesses of the middle and superior frontal gyri were negatively related to the number of physical problems in the home environment.

The mental health problems of six groups were examined in [4]: the general population; healthcare personnel; college students; schoolchildren; employees in the hospitality, sport, and entertainment industries; and others. A series of concerns led to abnormal mental health in the general population: possible spread of the disease, fear of illness, financial loss due to unemployment, uncertainty about test results, and the death of family members. Healthcare personnel (front-line healthcare workers) experienced the highest levels of anxiety and depression. Close contact with patients could make them a source of infection for family members, and intensive work and the possibility of an emergency kept them nervous all the time. As a result, they were more likely to develop mental disorders. College students were concerned about their own safety and that of their families during the pandemic, which led to mild anxiety. Holding many part-time jobs and the difficulty of attending remote online classes also caused mental stress. School closure was the biggest problem for schoolchildren (primarily adolescents). Because of the pandemic, students had to stay at home for online classes, which led to a lack of activities, disrupted sleeping habits, and loss of resources. Students struggled to study at home and became accustomed to lockdown conditions, from which it was hard to adjust back to normal. For employees in the hospitality, sport, and entertainment industries, economic strain was the primary source of stress. Bans on gatherings were expected to remain part of life after the pandemic, causing employees to lose their jobs permanently and, as a result, to develop mental health problems. As for vulnerable groups (elderly people, homeless individuals, care home residents), many already had chronic conditions (mental disorders such as bipolar disorder and diseases such as asthma) that made them more likely to become infected.

One study aimed to find the reciprocal relationships between excessive internet use and school burnout [5]. It first notes that the engagement of students in Finland has decreased because classrooms lack digital devices; students who used digital technologies felt bored. School burnout comprises exhaustion, cynicism, and a sense of inadequacy and, in contrast to engagement, it predicts depressive symptoms. School engagement is defined by energy, dedication, and absorption. The study describes one way to increase engagement: fulfilling adolescents’ socio-cognitive and emotional needs. School climate and motivation from others are also factors that lead to positive engagement. In Study 1, 1702 elementary students answered a questionnaire about engagement, burnout, internet use, and depression at two different times. The EDA, SBI, and DEPE depression scales measured engagement, burnout, and depression, respectively; SES and gender were additional measures. The results showed reciprocal positive cross-lagged relationships between internet use and school burnout. School burnout led to excessive internet use and depressive symptoms. Among the components of school burnout, cynicism predicted later inadequacy, and inadequacy predicted later cynicism. Exhaustion increased excessive internet use. Study 2 focused on high school students. Using the same method as in Study 1, the researchers found that girls suffered more from depression and school burnout, while boys were more prone to excessive internet use. Exhaustion was again found to lead to an increase in internet use. The research suggests that students’ negative attitudes may form in elementary school, turn into school burnout, and thus lead to excessive internet use. As a remedy, the researchers call for fostering positive attitudes in students from a young age.

Fig. 1. A participant interacting with the Tess chatbot [7].

An overview of the neurobehavioral changes during adolescence and of the impact of stressful environmental stimuli on maturation was given in [6]. In the first category of studies, researchers found that rodents showed a higher level of anxiety-like behavior: their social abilities dropped and their aggression increased. In rats, researchers also observed significant depressive-like behavior, including high immobility. Mice formed a depressive-like phenotype when exposed to stress for 10 consecutive days, accompanied by anxiety and lower body weight gain. Mice subjected to social instability stress (1 h of isolation per day, followed by housing with a new cage mate, PD 35 to 40) were more sensitive to drugs such as nicotine as adults. Additionally, the paper shows that social experiences influence drug-seeking behavior. The paper showed that stress reactivity, mineralocorticoid receptor expression, and glucocorticoid receptor expression changed significantly. In adulthood, HPA activity rose and reactivity to stressors increased. This describes the development of the HPA axis in adolescence; the paper then discusses the impact of stress on that development. Social isolation lowered corticosterone responses to stress in adult males, whereas females showed stronger corticosterone responses. The study showed that adolescents are risk-takers because of the imbalance in the growth of the limbic and cortical compartments. Immaturity of the cortical region leads to novelty-seeking behaviors, and adolescents are sensitive to rewards, which promotes risk-seeking.

A chatbot is an application that can conduct text or voice conversations [7]. Studies have shown that communication between users and chatbots is also effective in addressing psychological or emotional problems. Woebot is a chatbot that conducts automated conversations; while discussing emotions with the user, it also tracks changes in mood. Tess, shown in Fig. 1, is an intelligent emotional chatbot that identifies the user’s emotion and provides solutions through dialogue. In one study, Tess provided emotional support to 26 medical staff, most of whom reported that it had a positive effect on their emotions. Tess can also reduce anxiety among college student volunteers and can even help manage depression-related physiological phenomena in adolescents. The KokoBot platform is an interactive platform for evaluating cognitive abilities; its main feature is peer-to-peer interaction, so users on the platform can also communicate with one another. Wysa is an emotionally intelligent mobile chatbot based on artificial intelligence, whose goal is to support mental health and relieve psychological stress through human-computer interaction. Vivibot is a chatbot that supports the mental recovery of terminally ill teenagers who are undergoing treatment. Pocket Skills is a conversational mobile chatbot mainly intended for behavior therapy.

3 Multi-modal Seq2seq Model

Humans interact with the outside world through tactile, auditory, visual, and other channels. The media used to carry the resulting information include voice, images, video, and text, and sensors such as microphones, cameras, and infrared devices collect it. The combination of these diverse kinds of information is called multi-modal information. A single modality often carries only its own information, which is a limitation. The relationships among modalities can be studied through machine learning and other means, and multi-modal learning is one of the current research hotspots. Multi-modal methods mainly include joint representations and coordinated representations.

As shown in Fig. 2, text in a multi-modal setting can be processed by sentence summarization, in which a seq2seq model produces a short sentence. Multi-modality can also be used in machine translation, where it performs better than text-only input; in this setting images and text sentences are input together, and the image must depict the content of the sentence [8].

Fig. 2. A multi-modal model predicts the event objects [8].
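The difference between joint and coordinated representations can be illustrated with a small sketch. Here a joint representation is approximated by plain concatenation of a text vector and an image vector, while a coordinated representation keeps the two vectors separate and compares them with cosine similarity. Both choices are simplifying assumptions: in practice the fusion and the encoders are learned, and the function names below are hypothetical.

```python
import math


def joint_representation(text_vec, image_vec):
    """Joint: fuse both modalities into a single vector (here by concatenation)."""
    return list(text_vec) + list(image_vec)


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def coordinated_similarity(text_vec, image_vec):
    """Coordinated: keep one vector per modality and compare them in a shared space."""
    return cosine(text_vec, image_vec)


fused = joint_representation([0.2, 0.8], [0.5])
aligned = coordinated_similarity([1.0, 0.0], [1.0, 0.0])
```

A downstream model would consume `fused` directly, whereas a coordinated approach would train the two encoders so that matching text and image pairs obtain a high similarity.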

Current multi-modal learning is generally built on deep learning frameworks, and the latest techniques are mainly based on the BERT architecture. After pre-training, the model is transferred to other tasks, such as image captioning, and these tasks require only minor changes [9].

This paper proposes a chatbot method for diagnosing the psychological anxiety of adolescents. The chatbot model is based on a multi-modal seq2seq model whose structure is shown in Fig. 3. At the front end of the model, image captioning is used to extract a text description of each image, and an attention mechanism in the multi-modal model controls which parts of the image and text are associated. The model is used to analyze the multi-modal data, such as text and images, that teenagers produce while chatting with the chatbot.

Fig. 3. Multi-modal seq2seq chatbot.
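The role of attention over text and image-derived content can be sketched as follows. The example assumes scaled dot-product attention and assumes that the chat text and the generated image caption have each been encoded into a list of state vectors; neither assumption is confirmed by the description above, and all names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context


def multimodal_context(query, text_states, caption_states):
    """Attend jointly over encoder states from the chat text and the image caption."""
    states = list(text_states) + list(caption_states)
    weights, context = attend(query, states, states)
    n_text = len(text_states)
    return {
        "text_weight": sum(weights[:n_text]),      # attention mass on chat text
        "image_weight": sum(weights[n_text:]),     # attention mass on caption
        "context": context,                        # weighted mix fed to the decoder
    }


result = multimodal_context([1.0, 0.0], [[1.0, 0.0]], [[0.0, 1.0]])
```

Summing the weights per modality gives a rough view of how much the decoder relies on the text versus the image description at a given step.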

4 Experimental Result

To evaluate the effectiveness of the structure proposed in this paper, we selected parts of the Microsoft COCO Caption dataset [10] and the LCSTS dataset [11], merged them with our own chatbot image and text dataset, and conducted training and testing. Each user fills in a standard psychological scale, which serves as the ground truth. To evaluate the degree of user anxiety, we divide it into levels 0–5, which correspond to 0%, 0%–20%, 20%–40%, 40%–60%, 60%–80%, and above 80% in the overall ranking of users’ anxiety.
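The level assignment can be sketched as a simple bucketing function. The sketch assumes that the "overall ranking" is expressed as a percentile between 0 and 100 and that each upper boundary belongs to the lower level (for example, exactly 20% maps to level 1); the text does not state how boundaries are handled, and the function name is hypothetical.

```python
import math


def anxiety_level(percentile):
    """Map an anxiety percentile in [0, 100] to a level from 0 to 5.

    Assumed boundaries: 0 -> level 0, (0, 20] -> 1, (20, 40] -> 2,
    (40, 60] -> 3, (60, 80] -> 4, above 80 -> 5.
    """
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    return min(5, math.ceil(percentile / 20))


levels = [anxiety_level(p) for p in (0, 10, 35, 80, 95)]
```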

Table 1. Comparison of the indicators on training dataset
Table 2. Comparison of the indicators on testing dataset

The experimental results on the training set are shown in Table 1, where the proposed model has an average accuracy of 71%. In testing, k-fold cross-validation is used, and an average accuracy of 63% is obtained (Table 2). In comparison, the TF-IDF Decision Tree reaches an average accuracy of 63% on the training set and 58% on the test set, and the LSTM reaches 69% and 61%, respectively.
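For readers unfamiliar with the evaluation protocol, the following is a minimal sketch of k-fold cross-validation with average accuracy. The value of k, the contiguous fold layout, the `train_fn` interface, and the majority-class stand-in model are all assumptions made for illustration; they are not the configuration used in the experiments.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def accuracy(predicted, actual):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def cross_validate(samples, labels, k, train_fn):
    """Average held-out accuracy; train_fn(x, y) returns a predict(sample) callable."""
    scores = []
    for fold in k_fold_indices(len(samples), k):
        held_out = set(fold)
        train_x = [s for i, s in enumerate(samples) if i not in held_out]
        train_y = [y for i, y in enumerate(labels) if i not in held_out]
        predict = train_fn(train_x, train_y)
        preds = [predict(samples[i]) for i in fold]
        scores.append(accuracy(preds, [labels[i] for i in fold]))
    return sum(scores) / len(scores)


def majority_baseline(train_x, train_y):
    """Trivial stand-in model: always predict the most common training label."""
    most_common = max(set(train_y), key=train_y.count)
    return lambda sample: most_common


score = cross_validate(list(range(6)), [1, 1, 1, 1, 1, 1], 3, majority_baseline)
```

In practice the samples would usually be shuffled (or stratified by anxiety level) before the folds are formed, and `train_fn` would wrap the actual model.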

Finally, five teenagers aged 15–18 were invited to test the chatbot. Three of the five were in a more anxious mental state. The results they obtained from the chatbot were consistent with their own perception of their state.

5 Conclusion

With the outbreak of COVID-19, teenagers who study and live at home are more likely to suffer from mental illness and anxiety symptoms. This paper proposes a multi-modal chatbot scheme that analyzes and judges the mental state of teenagers, while they use the chatbot, from multi-modal information such as text and images. The model is a seq2seq model that combines image caption extraction and text summarization modules and uses an attention mechanism to control related content across modalities. Experiments show that this structure achieves better accuracy than the TF-IDF Decision Tree and LSTM baselines on the existing multi-modal dataset, and in a preliminary test with real users its results were consistent with the users’ own assessments.