
1 Introduction

Misinformation, toxic content, and online harassment have become serious problems on social media channels, and they affect the younger generation in particular. Taking up this challenge, the Courage project aims at empowering adolescents to confidently interact with and utilize social media. We take a multi-disciplinary approach that builds on psychology and pedagogy combined with data science and AI-driven approaches. Building on the tradition of Intelligent Tutoring Systems, we have developed a technical framework based on a Virtual Learning Companion that enables learning and interaction support with the aim of raising awareness and resilience on the part of the learners. The companion can be used in classrooms and informal settings, providing a playful, adaptive, and engaging setting in which adolescents interact with a social media environment under restrictions of pedagogical responsibility and guidance. One such pedagogically motivated scenario is the use of narrative scripts, which implement a collaborative learning flow pattern combined with counter-narratives to raise learners’ awareness through externalization and sharing, empathy, and perspective-taking.

The companion implements playful adaptive educational strategies to engage and scaffold adolescents interacting with a social media environment under restrictions of pedagogical responsibility. An example is the use of narrative scripts [1], such as the “Pyramid app,” which implements a collaborative learning flow pattern combined with counter-narratives to raise learners’ awareness through externalization and sharing, empathy, and perspective-taking. This paper illustrates how these goals are materialized in a web-based learning environment comprising a controlled social media platform and the Virtual Learning Companion (VLC) [2]. It is supported by an AI backbone that uses transformer-based models for robust classification of media content according to risks while considering related educational needs. The basic version of the VLC environment provides an Instagram-like social media platform as a closed world with controlled content. We have chosen PixelFed as an open-source framework to simulate the social media environment. While PixelFed holds the content, the VLC is implemented as a plugin for the Chrome browser.

2 Related Work

2.1 ITS and Companion

As a particular version of intelligent tutoring systems (ITS), learning companion systems (LCS) personalize support and adaptive feedback through an explicit and possibly human-like agent that interacts with the learner [3]. The agent or learning companion guides the learner step by step and usually assumes a non-authoritarian role. The interface may include multimedia, interactive buttons, menus, text, speech, animation, diagrams, virtual reality, or other interactive techniques.

LCS interfaces typically incorporate natural language processing (NLP) to facilitate communication between the LCS and the student. Tracking the learner’s interactions with an LCS is part of learner modeling. For example, the LCS may ask the learner to explain the reasons behind each answer as a reflective question at every step of the task. Learners can thus generate many explanations and articulate the reasons for their solutions, which refines their understanding (a self-regulated learning strategy) [4].

In an initial test of the VLC with learners in a simulated environment, the learners found the VLC useful for determining their reactions in the fact/fake news detection task, as reflected in their feedback and their judgments of the environment’s artifacts [5].

According to the definition of an adaptive system, one essential requirement for an adaptive tutoring system incorporating an LCS is to respond flexibly to the learner’s actions depending on the context and progression. This response can be implemented in the conversation as feedback to the user’s text input, or the system can respond to user interaction in other ways, e.g., by suggesting a multimedia tutorial as a recommendation [6]. Modeling students as the basis for adaptive feedback in LCS tutorial dialogs can significantly increase learning gains for students with both low and high prior knowledge [7]. The LCS can play many roles in an instructional context, for example, a leader who suggests new ideas for learners to consider, or a critic who challenges learners’ suggestions [8].

In answering the question, “How competent should the companion agent be to meet the learner’s expectations and motivate the learner to continue working with the agent?”, Hietala and Niemirepo [9] found that learners lose motivation when constantly using a strong and competent companion. Especially in the beginning, a companion that makes mistakes like a human is more effective. Nevertheless, both introverted and extroverted learners prefer knowledgeable and robust learning companions when faced with a challenging task or when dealing with a new topic.

Our virtual learning companion (VLC) supports features such as role-playing for users, providing adaptive feedback based on previous user interactions, scoring responses, and asking knowledge activation questions. In addition, analytics and recommendations, which include taking input and displaying the resulting information, are core functions of the VLC system.

2.2 Fake News and Challenges

Victoria L. Rubin defines a conceptual model for Fake News in the form of a triangle with the cornerstones Susceptible Host, Virulent Pathogen, and Promoting Environment. According to the model, Fake News spreads if and only if the three causal factors coincide. Three interventions are proposed to disrupt the interaction of these factors: automation to defuse the virulent pathogen, education to strengthen the susceptible host, and regulation to make the enabling environment safe [10].

Automated AI-based approaches can detect toxic features and add labels to the information. An example of this approach is the LiT.RL News Verification Browser, a search tool for news readers, reporters, editors, and information professionals. The tool examines the language used on digital news websites to determine whether the content is clickbait, satirical news, or Fake News. The classification provided by LiT.RL is not perfect and is not always suitable for public use, and multimedia formats are not supported [11].

Automated message validation systems based on NLP techniques may be helpful in assisting content authors to quickly check standard features of misinformation. In addition, such systems could help teachers teach critical content evaluation skills or help information professionals reduce information overload for news consumers by filtering and flagging suspicious news [12].

The labeling strategy currently used on social media responds to any keywords that could potentially indicate Fake News (e.g., COVID-19 or vaccine). On the other hand, tracking and detecting fake content is tedious. For example, deep manipulation of fake images makes it difficult for both machines and humans to recognize fake content [13].

Another strategy to mitigate the susceptible host and help learners develop their own mechanisms to detect and counter such influences is education [10]. This approach can be divided into two main strategies: raising awareness and addressing learner bias. There are various techniques to “improve bias and awareness”. For example, games and exercises show learners how to use libraries and search engines to find true and credible information from different contexts. To address learner bias, we can also provide games that show how Fake News content creators produce content on social media and what types of content are typically classified as misinformation.

To improve learners’ critical thinking rather than working on their biases, we can provide learners with a “Fact Checking Awareness Tool.” Such tools are designed to motivate learners to fact-check. According to the American Library Association (ALA), access to correct information from different contexts, without censorship and filtering, is the best way to counter misinformation and media manipulation [32].

The last point in the triangle is “regulation”. The legislative process should be methodically proactive and robust to prevent pathogenic “fakes” from reaching and infecting susceptible hosts in a conducive digital environment. The EU Commission has warned that social media companies face the threat of new regulation if they do not take prompt action against fake news. Several programs have been launched under the leadership of the EU and international organizations to tackle pandemic-related disinformation (as of June 2020). The inventor of the WWW, Tim Berners-Lee, made the following comments at a recent Web Foundation conference: “Administrations must adopt rules and regulations for the digital age. They must ensure that markets remain competitive, innovative, and open. Moreover, they are responsible for protecting people’s rights and freedoms on the Internet”.

2.3 Integrating Machine Learning and Gamification into Learning Environments

As technological advancements allow for increasingly digitized learning environments, gamification strategies could be a viable approach to improving the efficacy of digital media literacy activities. An evaluation of the integration of virtual reality tools shows promising results in terms of participant engagement and learning outcomes [14]. Given the restrictions imposed by COVID-19, gamified teaching tools can also be seen as a way to convey knowledge and enhance students’ collaboration during periods of social distancing.

A review by [15] finds that STEM disciplines in particular adopt this approach, but that much remains to be done to improve the efficacy of these activities and to maximize student participation. The challenges that gamification strategies need to address involve the relationship between user characteristics and engagement level, but also how different levels of prior knowledge affect the design of the games.

In terms of platform design, [16] highlight the potential benefits that virtual learning companions (VLCs) can gain from adopting AI-powered personalization of learning paths, provided that proper ethical oversight mechanisms ensure sound recommendations of learning content. Moreover, VLCs should enhance learners’ personal autonomy, and the benefits they bring should outweigh the risks.

3 Architecture

Fig. 1. Conceptual architecture of the Virtual Learning Companion system

The architecture in Fig. 1 shows a web-based learning environment with a controlled social media platform and the VLC. It is supported by an AI backbone that, for example, uses transformer-based models to robustly classify media content according to risk and to consider related pedagogical requirements. The basic version of the VLC environment provides an Instagram-like social media platform as a closed world with controlled content. We chose PixelFed as the open-source framework to simulate the social media environment. While PixelFed holds the content, the VLC is implemented as a plugin for the Chrome browser. As shown in the architecture, learners can interact with both environments and receive prompt feedback from the VLC during the scripted chat dialog guided by the narrative scripts. Learners are instructed through (individual and collaborative) tasks that require judgments and comments on the information presented, typical of the image-based social media format (images with short text captions). The tasks are guided by a chatbot with narratives and adaptive counter-narratives that challenge students’ beliefs. A unique feature of the VLC is its use of “reverse image search” (RIS) engines that retrieve the same or similar images from different sources [5]. Learners are asked to judge whether a posted image and its caption are credible or fake by comparing them with the corresponding content and keywords retrieved by RIS from different sources.
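
As a concrete illustration of this RIS-based credibility check, the following is a minimal Python sketch in which `reverse_image_search` is a hypothetical placeholder for the external RIS service (e.g., the Google API mentioned in Sect. 4) and the overlap heuristic is purely illustrative, not the project’s actual scoring logic.

```python
# Minimal sketch: compare a post's caption with captions of visually similar
# images retrieved by a reverse image search (RIS) engine. The RIS call is a
# placeholder; the real system would query an external service.

from typing import List

def reverse_image_search(image_url: str) -> List[str]:
    """Placeholder: return captions/titles of pages containing similar images."""
    # In the real system this would call an external RIS service.
    return [
        "Flooded street after heavy rain in 2012",
        "Archive photo: storm damage, not related to current events",
    ]

def keyword_overlap(caption: str, retrieved: List[str]) -> float:
    """Crude lexical overlap between the post caption and RIS results."""
    caption_tokens = set(caption.lower().split())
    retrieved_tokens = set(" ".join(retrieved).lower().split())
    if not caption_tokens:
        return 0.0
    return len(caption_tokens & retrieved_tokens) / len(caption_tokens)

post_caption = "Breaking: flood hits the city center today"
results = reverse_image_search("https://example.org/post_image.jpg")
score = keyword_overlap(post_caption, results)
print(f"Overlap with RIS sources: {score:.2f}")  # low overlap -> prompt the learner to investigate
```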

The machine learning backend is integrated with the VLC framework as a separate, standalone API and provides support for threat detection through algorithms that can automatically analyze (social) media content. Several AI models have been developed as part of the project; a more detailed overview is given in Sect. 5. In general, the classifiers and content analyzers focus on textual content and images, as these are the main contents of social media. Prediction results can be used to raise awareness of threats and to support learners’ education during their interaction with the VLC.

4 Implementations

In this project, to cover the different analyses contributed by the COURAGE group, a modular design was chosen for the frontend and backend. On the server side, a microservice architecture enables us to connect different APIs and services according to the empirical scenarios for the school experiments. The frontend has two core features: the VLC can be used as an add-on for the browser (a Chrome browser plugin), and an easy-to-use chatbot consumes the conversation flow in JSON format.
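
To make the conversation format more concrete, the following hypothetical sketch shows how a scripted dialog step might be represented in JSON and traversed; the field names (`steps`, `options`, `next`) are illustrative assumptions, not the VLC’s actual schema.

```python
import json

# Hypothetical example of a scripted dialog step in JSON form; the actual
# schema used by the VLC chatbot may differ.
script_json = """
{
  "steps": [
    {
      "id": "intro",
      "message": "Look at this post. Do you think it is fact or fake?",
      "options": ["Fact", "Fake", "Not sure"],
      "next": {"Fact": "why_fact", "Fake": "why_fake", "Not sure": "hint_ris"}
    },
    {
      "id": "hint_ris",
      "message": "Try the reverse image search and compare the sources.",
      "options": ["Done"],
      "next": {"Done": "intro"}
    }
  ]
}
"""

script = json.loads(script_json)
steps = {step["id"]: step for step in script["steps"]}

def next_step(current_id: str, answer: str) -> str:
    """Resolve the learner's answer to the id of the following step."""
    return steps[current_id]["next"][answer]

print(steps["intro"]["message"])
print(next_step("intro", "Not sure"))  # -> "hint_ris"
```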

For example, in fact-or-fake scenarios in which the companion suggests related images from the web, a Google API was used to find similar images. This feature was disabled in the VLC for the racism scenarios, which raise students’ awareness of racism and discrimination in social media. The modular architecture allows the system to remain flexible, so that new components such as the NLP API can be added [5].

5 VLC Technical Architecture

Fig. 2. Technical architecture of the COURAGE companion for a scenario that includes the RIS module

As mentioned above, the browser plugin was implemented as a Chrome extension, and the ReactJS library was used to develop the chatbot inside the plugin, which increases modularity (Fig. 2).

In the backend (server side), NodeJS and MongoDB store user interactions and states. Learning Locker is used for logging and visualization. We also integrated the Wit.ai machine learning service to estimate user intentions. A data analysis dashboard visualizes the recorded logs and interactions in different forms (including bar charts and pie charts).
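
As a minimal sketch of the intent estimation step, the following shows how a free-text learner reply could be sent to Wit.ai’s message endpoint; `WIT_TOKEN` is a placeholder for the app’s server access token, and the response handling assumes Wit.ai’s documented `intents` field.

```python
# Sketch of estimating the intent of a free-text learner reply via Wit.ai's
# /message endpoint. WIT_TOKEN is a placeholder for the app's server access
# token; field names follow Wit.ai's documented response format.
import os
import requests

WIT_TOKEN = os.environ.get("WIT_TOKEN", "<server-access-token>")

def estimate_intent(text: str) -> str:
    resp = requests.get(
        "https://api.wit.ai/message",
        params={"q": text},
        headers={"Authorization": f"Bearer {WIT_TOKEN}"},
        timeout=10,
    )
    resp.raise_for_status()
    intents = resp.json().get("intents", [])
    # Return the top-ranked intent name, or a fallback when nothing matched.
    return intents[0]["name"] if intents else "unknown"

print(estimate_intent("I think this picture is photoshopped"))
```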

5.1 Natural Language Processing and Machine Learning API

As mentioned in Sect. 3, the NLP and machine learning part of the project is integrated as a separate, standalone API in the overall VLC framework. To give a brief overview and to clarify how this intelligent backend can contribute to education about social media threats, we describe the individual components and the deployment process of the API below.

In general, the structure can be divided into three main components: (1) at the most basic level, the trained models and the code to run inference; (2) a framework that defines model endpoints by wrapping everything into an API; and (3) support for deployment on any server.

All implementations of (1) are based on Python, and we make use of well-established natural language processing frameworks (e.g., Hugging Face) and machine learning frameworks (e.g., PyTorch and TensorFlow) to build and train models and to use them for making predictions. As mentioned previously, these models focus on the analysis of textual content and images. Many of the integrated models are self-developed, while we also adopt some from the literature, depending on their quality and the specific use case. The text analyzers can identify the following (a minimal usage sketch follows the list):

  • Sentiment (English [17], Italian [18], Spanish [17], German [19])

  • Emotions (English [20], Italian [18], Spanish [21])

  • Hate speech (English, Italian, German [22])

  • Fake news (English [23], German [24])

  • Irony (English [25])

  • Sexism (English [26])
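
The sketch below illustrates how such a text analyzer can be realized with the Hugging Face pipeline API; the default English sentiment model is used here only as a stand-in for the project-specific classifiers listed above.

```python
# Minimal sketch of a transformer-based text analyzer using the Hugging Face
# pipeline API. The default sentiment model is used here as a stand-in for the
# project-specific classifiers (hate speech, fake news, irony, sexism, ...).
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")  # downloads a default English model

comments = [
    "This post is so inspiring, thank you for sharing!",
    "Nobody wants you here, just leave.",
]

for comment in comments:
    result = sentiment(comment)[0]
    print(f"{comment!r} -> {result['label']} ({result['score']:.2f})")
```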

In terms of image algorithms, we provide a self-developed model that can predict the body mass index (BMI) of a person based on a picture of their face, which, for example, is helpful for identifying threats arising from beauty stereotypes. In addition, we adopt models for gender identification [27] and object detection [28], as they support the identification of meta information in the overall image setting.
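
As an illustration of how such image-level meta information can be extracted, the following sketch runs a pretrained torchvision object detector; this off-the-shelf model is used only as a stand-in for the detection model adopted in the project [28], and the image path is a placeholder.

```python
# Sketch of meta-information extraction from an image via object detection.
# A pretrained torchvision Faster R-CNN serves as an illustrative stand-in.
import torch
import torchvision
from torchvision.io import read_image
from torchvision.transforms.functional import convert_image_dtype

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# Placeholder path to a local test image.
image = convert_image_dtype(read_image("post_image.jpg"), torch.float)

with torch.no_grad():
    prediction = model([image])[0]

# Keep only confident detections; labels are COCO category indices.
for label, score in zip(prediction["labels"], prediction["scores"]):
    if score > 0.8:
        print(int(label), float(score))
```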

Fig. 3. Conceptual architecture and deployment process of the ML/NLP API

Figure 3 shows the conceptual architecture and the structuring of the deployment process of the NLP/ML API. The yellow box represents the algorithmic side as just described in (1).

Continuing with (2), the endpoint definitions and the API wrapper are again implemented in Python, and we use the Flask framework to add routes and web apps to the backend. These establish the link to the algorithmic side and make the AI models accessible via HTTP requests.
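
A minimal sketch of such an endpoint definition is shown below; the route name and payload format are illustrative assumptions rather than the API’s actual interface, and the default sentiment pipeline again stands in for a project model.

```python
# Minimal sketch of step (2): wrapping a classifier into a Flask route so that
# it can be reached via HTTP. Route and payload format are illustrative.
from flask import Flask, jsonify, request
from transformers import pipeline

app = Flask(__name__)
classifier = pipeline("sentiment-analysis")  # stand-in for a project model

@app.route("/analyze/sentiment", methods=["POST"])
def analyze_sentiment():
    text = request.get_json(force=True).get("text", "")
    result = classifier(text)[0]
    return jsonify({"label": result["label"], "score": float(result["score"])})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```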

Finally, in step (3), deployment is managed using Docker. This requires a Dockerfile and allows Python package and version requirements to be predefined, which makes it easy to deploy the API on any server without prior installations.

Overall, our NLP/ML API is an easy-to-deploy, standalone application with state-of-the-art AI algorithms for analyzing textual and image content; we utilize it, for example, to identify potential threats on social media as part of the VLC framework depicted in Fig. 1.
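
Assuming the illustrative route from the Flask sketch above and a locally deployed container, a client such as the VLC could then query the API as follows (host, port, and route are placeholders):

```python
# Sketch of how a client (e.g., the VLC) could query the deployed API; the
# host, port, and route are placeholders matching the Flask sketch above.
import requests

resp = requests.post(
    "http://localhost:5000/analyze/sentiment",
    json={"text": "You are pathetic, stop posting."},
    timeout=10,
)
resp.raise_for_status()
print(resp.json())  # e.g. {"label": "NEGATIVE", "score": 0.99}
```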

6 Educational Scenarios

Another scenario that uses the NLP/ML API focuses on making a regular social media feed more transparent in order to support users in identifying potential threats. For this, we are developing a web interface that mimics Twitter and augments posts inside the timeline with additional information, e.g., automatic AI analysis results produced by the NLP/ML API. An example of this tool can be seen in Fig. 4.
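
A minimal sketch of how such a decoration could be attached to a post before rendering is shown below; the post structure, API base URL, and route are illustrative assumptions.

```python
# Sketch of decorating a feed post with analysis results from the NLP/ML API
# before rendering it in the mimicked Twitter timeline; routes are placeholders.
import requests

def decorate_post(post: dict, api_base: str = "http://localhost:5000") -> dict:
    analysis = requests.post(
        f"{api_base}/analyze/sentiment", json={"text": post["text"]}, timeout=10
    ).json()
    return {**post, "decorations": {"sentiment": analysis}}

post = {"id": "42", "text": "Vaccines contain microchips, wake up!"}
print(decorate_post(post))
```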


Fig. 4. Screenshot of an augmented feed in the Twitter scenario

The goal of this scenario is to clarify how automatic decorations are perceived by users and whether they support users in identifying threats. To measure this, a between-groups study will be designed in which one group is presented with this demo interface, so that user performance on predefined tasks can be compared to behavior with a regular feed without these decorations.

To enable the analysis of learner activities, all user actions are captured in a learning record store (Learning Locker) based on the xAPI description format. This includes interactions with the VLC as well as certain actions in the PixelFed space (e.g., selecting images). This architecture allows for aggregating action logs from different sources. A dashboard allows for analyzing and visualizing the logged actions.
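
The following sketch shows how a single learner action could be recorded as an xAPI statement in a learning record store such as Learning Locker; the endpoint, credentials, and verb/activity IRIs are placeholders.

```python
# Sketch of logging a learner action as an xAPI statement to a learning record
# store such as Learning Locker. Endpoint, credentials, and verb/activity IRIs
# are placeholders.
import requests

LRS_ENDPOINT = "https://lrs.example.org/data/xAPI/statements"
LRS_AUTH = ("client_key", "client_secret")  # basic auth credentials

statement = {
    "actor": {"name": "learner-042", "mbox": "mailto:learner-042@example.org"},
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/interacted",
        "display": {"en-US": "interacted"},
    },
    "object": {
        "id": "https://courage.example.org/pixelfed/post/123",
        "definition": {"name": {"en-US": "Selected image in PixelFed feed"}},
    },
}

resp = requests.post(
    LRS_ENDPOINT,
    json=statement,
    auth=LRS_AUTH,
    headers={"X-Experience-API-Version": "1.0.3"},
    timeout=10,
)
resp.raise_for_status()
print("Statement stored:", resp.json())  # the LRS returns the statement id(s)
```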

The current version of the VLC environment provides learner guidance through scripted interaction and contextual information prepared for the closed environment.

To create an “open” version of the VLC, which will enable interaction with real social media environments, we are currently adding intelligent components for detecting toxic content and for analyzing learner comments. To be of actual practical use, it is paramount that any such classifiers perform well enough. To achieve this, we incorporate state-of-the-art approaches that we have developed as part of the project and that have been demonstrated to be robust and competitive across classification tasks (e.g., toxic comment and fake news detection) and languages (Italian, German, English), e.g., [4]. We are also actively pursuing the possibility of using GCN-based detectors, which should allow us to flexibly integrate contextual information from different sources [29].

7 Conclusion

We report on the efforts made by the Courage project to mitigate and overcome toxic social media content through a special educational technology inspired by psychological and pedagogical approaches. The goal is to implement a series of scenario-based tools that empower young people to interact with and use social media confidently and that increase their awareness and resilience. Similar tools employing chatbots in the learning process around social media have been implemented in related work, such as the i-LearnC# and FLOKI approaches [30, 31].

This paper presented a modular technical architecture for the web-based companion system. Modularity allows us to remain flexible and to add new features and technical components with minimal changes to the overall design. The plugin can be employed both in a controlled environment, such as a PixelFed instance, and in open social media such as Instagram and Facebook. The main components of the VLC are the NLP/ML API, the RIS module, and the narrative scripts. These modules can be enabled according to the learning scenario and are delivered to the user via interaction with the companion chatbot.