Introduction

Digital devices and multimedia learning environments are intertwined with modern society. Aesthetics featuring anthropomorphic attributes in cute and funny characters with personalized dialogues are evident in educational multimedia materials on social media, online video sharing platforms, and learning content management systems.

For instance, Kurzgesagt, a German animation studio with a robust online presence on YouTube, produces animations on scientific, technological, political, philosophical, and psychological subjects. Its popularity can be attributed to its animated visual style, which features attractive, highly saturated color palettes and cute and funny anthropomorphic features.

The confluence between anthropomorphism and cute and funny expressive characters is related to Kawaii design, which imbues non-living or non-human objects with aesthetics accentuating rounded shapes, large eyes, and large foreheads to affirm baby-like, innocent, and childish qualities (Nittono, 2016; Nittono et al., 2012). Kawaii styles, which are rooted in anime and manga, have recently been used to deliver educational content (Hayashi & Marutschke, 2015; Raman et al., 2021). For instance, Cells at Work! is an anime that depicts cells and pathogens in the human body as anthropomorphic characters with cute and funny emotional expressions and entertaining dialogues. Highly acclaimed for its entertaining and likable presentation of facts and concepts regarding the inner workings of the human body (Silverman, 2016; Valdez, 2018), the anthropomorphized animation has been assigned as homework to learners studying biology at China’s Southwest University (Shen, 2018).

Studies have begun to examine the affective, motivational, and cognitive effects of embedding anthropomorphic features into multimedia lessons from the educational research viewpoint. Anthropomorphism through facial expressions and personalized dialogues in multimedia learning materials is associated with the emotional design principle (Schneider et al., 2016, 2018), which aims to evoke positive emotion in learners to promote motivation and learning without imposing additional processing load (Brom et al., 2018; Wong & Adesope, 2020). However, research on emotional design in the multimedia learning context is in its infancy (Mayer & Estrella, 2014), warranting further studies that test its effects across different cultures and learning domains (Brom et al., 2017, 2018; Wong & Adesope, 2020). Much research on emotional design features through anthropomorphism has been conducted with learners in Western nations (e.g., the U.S. and Germany), whereas comparatively less research has been done with learners in Asian countries (Brom et al., 2018; Wong & Adesope, 2020). Wong and Adesope (2020) noted that emotional design, including anthropomorphism, seems to enhance the intrinsic motivation of American and German learners with larger effect sizes than for learners from other cultures. On the other hand, emotional design, including anthropomorphism, leads to a change in positive affect with larger effect sizes among Chinese learners than among learners from other cultures. These observations collectively imply that anthropomorphism in a multimedia learning environment can evoke different affective-motivational outcomes across diverse cultures. Thus, beyond emotional design research with US or European learners, this study conducted with Asian learners can contribute new insights concerning the confluence between anthropomorphism in a multimedia learning environment and cultural factors unique to Asian learners.

This study is also significant for its unique instructional domain presented by the multimedia learning environment. The multimedia lesson in this study featured an information technology topic, specifically, about how a distributed denial-of-service (DDoS) attack occurs. It is noteworthy that anthropomorphism studies have been mostly conducted to convey life sciences (biology-related subjects) and physical sciences (weather or meteorology) while underrepresenting other instructional topics such as information technology (Brom et al., 2018; Wong & Adesope, 2020). Because there is evidence indicating that the emotional design effects on motivational and learning outcomes may differ between diverse learning domains (Wong & Adesope, 2020), this study can contribute to the existing literature by clarifying the anthropomorphism effects in a multimedia learning environment presenting an information technology topic. From an educational viewpoint, this study aligns with the increasingly vital and relevant dissemination of information technology knowledge to learners in educational institutions and society as a whole, given the ubiquity of digital technologies across all aspects of life today. Hence, emotional design in a multimedia learning environment can make information technology subjects more accessible and beneficial to learners. In the context of the DDoS topic used in this study, anthropomorphism was accorded to learning objects depicting malware, bots, and servers through human-like images, i.e., facial features and limbs. Unlike previous research, this study also endowed the learning objects representing malware, bots, and servers with human-like dialogues (e.g., "I will trick the user! Click me! Click me!", "Keep attacking the server," and "Help! I am being kidnapped!") to solidify the anthropomorphism experience.

In light of the preceding, the objective of this study is to examine the affective-motivational and cognitive effects of anthropomorphism through human-like images and dialogues in a multimedia learning environment that delivers a lesson on how a DDoS attack occurs. Pursuant to this goal, we conducted an online between-subjects experiment involving learners from a large private Asian university. The findings of this study are discussed through the lens of the cognitive-affective theory of learning with media (CATLM), the integrated cognitive affective model of learning with multimedia (ICALM), and Cognitive Load Theory (CLT). These theories and related studies are reviewed and discussed in the following section.

Literature review

Relevant theories on learning with a multimedia learning environment

The cognitive theory of multimedia learning (CTML) explains how learners learn from multimedia learning materials (Mayer, 2019). It is grounded in the following principles: (1) dual channel: learners possess distinct channels for processing auditory/verbal and visual/pictorial information; (2) limited capacity: the cognitive architectures of learners are inherently limited in the amount of information that can be processed in each channel at any moment; and (3) active processing: learners need to select relevant information, organize the information into visual and verbal models in working memory, and integrate these models with prior knowledge from long-term memory in order to produce meaningful learning. The CTML is related to the Cognitive Load Theory (CLT), which describes three types of cognitive demand placed on learners when engaging with multimedia learning materials: intrinsic load, extraneous load, and germane load (Deleeuw & Mayer, 2008; Mayer & Moreno, 2003; Sweller, 1994).

Intrinsic load is the cognitive resource used to process information inherent to the learning material, which is influenced by the difficulty of the learning topic and the learner’s prior knowledge concerning the learning topic. Extraneous load is the cognitive demand for processing elements surrounding the designs or formats of the multimedia learning presentation that do not contribute to the understanding of the learning subject. In contrast, germane load is associated with learners’ utilization of cognitive resources when they acquire and construct schemas in long-term memory (Sweller et al., 1998), or when they willingly exert mental effort to improve learning by applying learning strategies such as pattern-probing within the learning materials, reconfiguration of problem representation to facilitate solving, and metacognitive tracking of cognition and learning (Debue & van de Leemput, 2014; Schnotz & Kürschner, 2007).

Based on the triarchic model of cognitive load (Kalyuga, 2011), the intrinsic, extraneous, and germane loads correspond to essential, extraneous, and generative cognitive processing, respectively (Mayer, 2017; Mayer & Moreno, 2003). A multimedia learning environment should be designed to mitigate the extraneous load caused by poor instructional design while managing essential processing that concerns the inherent complexity of the learning topic. Lastly, a multimedia learning environment should be designed to encourage generative processing, i.e., learners’ cognitive effort to understand the learning materials deeply through selecting, organizing, and integrating multimedia information.

Extending the CTML, the cognitive-affective theory of learning with media (CATLM) considers motivational and metacognitive factors influencing learners’ cognitive engagement in a multimedia learning environment (Moreno & Mayer, 2007). Hence, adding appealing pictures relevant to the instructional objective and challenging but guided learning situations can prompt learners to increase their efforts in selecting, organizing, and integrating multimedia information (Mayer, 2014a). However, appealing features must be infused with care, as such a stratagem can also impair learning, particularly when the features are distracting or non-essential to the learning materials. Empirical evidence has shown that featuring "seductive details" such as illustrations, texts, background music, and audio effects that are interesting but irrelevant to the learning content can negatively impact cognitive load and learning performance (Rey, 2012).

Emotional design

The affective factor in the CATLM underpins the emotional design thesis. Emotional design refers to a range of design attributes that can evoke changes in affective and motivational states in learners to enhance learning (Plass & Kalyuga, 2019; Plass & Kaplan, 2016). To evoke positive affect, some emotional design uses warm colors (Mayer & Estrella, 2014; Plass et al., 2014), round shapes (Münchow & Bannert, 2019; Navratil et al., 2018), anthropomorphic images (Schneider et al., 2019; Stárková et al., 2019), positive facial expression of characters (Ba et al., 2021; Horovitz & Mayer, 2021; Liew et al., 2016; Plass et al., 2020), positive vocal characteristics of speakers/agents (Beege et al., 2020; Endres et al., 2020; Horovitz & Mayer, 2021; Liew et al., 2017, 2020), visual styles of agents (Ali & Hamdan, 2017; Segaran et al., 2021), and aesthetically pleasing fonts (Kumar et al., 2016, 2019). Crucially, the objective of emotional design is to incorporate affective-motivational aspects without imposing extraneous load. This can be achieved through minimal alteration of the representative features of the information while preserving the main learning contents—such that it “should not change (or should not change much) the number of informational units in a text, the number of elements in an accompanying picture, or the complexity of interactions among informational units and image elements” (Brom et al., 2018, pp. 102–103). The minimalist principle is notably upheld in emotional design featuring warm colors, as well as anthropomorphism.

The emotional design hypothesis is also informed by the Integrated Cognitive Affective Model of Learning with Multimedia (ICALM) framework (Plass & Kalyuga, 2019; Plass & Kaplan, 2016), which assumes a separate channel for emotion processing apart from the cognitive processing of visual and verbal information, thereby postulating a learner’s mental model that encompasses emotional schemas and verbal and visual mental representations. The ICALM asserts that interplays of cognition and emotion can manifest during the selecting, organizing, and integrating processes (for review, see Plass & Kalyuga, 2019). According to Plass and Kaplan (2016), the emotional design effects can be associated with the control-value theory of achievement emotion (Pekrun, 2006; Pekrun & Linnenbrink-Garcia, 2012), which describes a learner’s emotional state as an amalgamation of valence (i.e., positive and negative states) and activation (i.e., activating and deactivating states). Positive activating emotions such as enjoyment of learning are associated with enhanced learning and with intrinsic and extrinsic motivation, whereas deactivating negative emotions like boredom and hopelessness can lead to reduced learning and reduced intrinsic and extrinsic motivation (Pekrun, 2006).

Plass and Kalyuga (2019) outlined some ways in which emotion can influence cognitive load, thinking style, and motivation. For instance, learners’ emotional states, whether positive or negative, can cause learners to focus on (1) their own emotional experience or (2) irrelevant information retrieved automatically as a result of the emotion. Such processing may compete for working memory resources (i.e., extraneous load), thereby impairing learning performance (Pekrun, 2006; Pekrun & Linnenbrink-Garcia, 2012; Seibert & Ellis, 1991). Further, positive affect can act as a cue signaling that needs and goals have been fulfilled, leading to more accessible cognitive resources during learning (Fredrickson, 2001) and increasing the tendency to adopt a more creative thinking style (Isen et al., 1987). Conversely, negative affect can signal that particular needs or goals are yet to be accomplished, thereby decreasing the amount of cognitive resources available for learning and impeding creativity (Isen et al., 1987) and learning performance (Pekrun & Linnenbrink-Garcia, 2012). Moreover, within the multimedia learning context, positive affect can increase intrinsic motivation, leading to enhanced learning performance (Moreno & Mayer, 2007; Plass et al., 2014; Um et al., 2012).

Anthropomorphic images as an emotional design feature

Among the different emotional design features, this study focuses on anthropomorphism. Brom et al. (2018) clearly outlined the conceptualization, operationalization, and boundaries surrounding anthropomorphic images in multimedia learning materials. Accordingly, anthropomorphism refers to adding facial features and expressions to visual elements in multimedia learning materials that are otherwise regarded as non-anthropomorphic. For instance, the seminal paper on emotional design imprinted anthropomorphic features such as eyes and mouths into graphical pictures depicting T-cells, B-cells, and antigens (Um et al., 2012).

It is argued that affixing facial features to non-anthropomorphic graphics would not lead to significant changes in extraneous load, given that face processing is considered spontaneous and automatic (Mithen & Boyer, 1996), and that a picture embedded with facial features should still constitute one information chunk (Brom et al., 2018). Concerning its affective-motivational properties, anthropomorphism can transmit facial expressions from which viewers infer the emotional states of graphical objects that now carry familiar human-like cues (Epley et al., 2007). Consequently, the human-like elements may trigger stronger social responses in learners as they try to make sense of the learning materials, i.e., the personalization effect (Mayer, 2014b; Schneider et al., 2016), while the emotional elements of the anthropomorphism can activate emotional states in learners through social and emotional contagion (Hatfield et al., 1993; Yuan & Dennis, 2019). The potential effects of anthropomorphism in evoking social or parasocial cues and responses among learners, which influence affective, motivational, and metacognitive factors in the multimedia learning context, are encapsulated within the recently proposed Cognitive-Affective-Social Theory of Learning in digital Environments (CASTLE) framework. This framework emphasizes the role of social processes (which are influenced by different characteristics of the social cues) in the learning process involving attention, long-term memory, and working memory (Schneider et al., 2021). Moreover, the anthropomorphic features can be designed to conform to the baby-face or "Kawaii" schema, in which round shapes, soft surfaces, a prominent forehead, and big eyes are made apparent, as such attributes can evoke positive emotional responses associated with triggered smiles and activation of the reward system of the brain (Lorenz, 1943; Nittono et al., 2012).

Indeed, meta-analytic studies have shown that emotional design through facial anthropomorphism and pleasant colors can enhance positive affect, intrinsic motivation, and perception of learning or effort while reducing perceived difficulty (Brom et al., 2018; Wong & Adesope, 2020). However, it is noteworthy that the emotional design effects can vary across studies due to potential moderating factors such as culture, learning domain, and pacing of the multimedia learning materials, as highlighted in the meta-analyses. Some studies on anthropomorphism effects in the multimedia learning context are reviewed, with the findings presented in Table 1.

The foregoing review informs some specifics of this study. First, anthropomorphism can vary in levels ranging from simple (e.g., rudimentary dots and lines to denote eyes and mouth) to complex (e.g., facial expression, detailed eyebrows, eyes, nose, mouth, and limbs) (Schneider et al., 2019; Uzun & Yıldırım, 2018). This study featured complex anthropomorphism by imprinting emotionally demonstrative facial features (e.g., shocked, sick, and "being dead" expressions), limbs, and weapons (i.e., swords and arrows) into the learning objects depicting malware, bots, and servers. This design conforms with Brom, Hannemann, et al.’s (2016) implementation of "cute" and "funny" aesthetics and matches well with the narrative of how a DDoS attack occurs presented by our multimedia learning environment. Secondly, from a theoretical perspective, this study follows Schneider et al.’s (2018, 2019) work in conceptually distinguishing and measuring cognitive load as intrinsic (i.e., subsumes processing of learning-relevant information that is influenced by the level of element interactivity and prior knowledge), extraneous (subsumes processing of learning-irrelevant details like the design of the learning environment), and germane load (subsumes generative processing contributing to schemata construction and understanding of the learning materials). As the use of contemporary cognitive load measurements such as Leppink et al.’s (2013) scale is scant within emotional design research (Brom et al., 2018), this study can produce insights concerning the anthropomorphism effects on the three cognitive load facets through the aforementioned scale. Lastly, departing from the personalized pronouns (e.g., "you," "I," and "we") incorporated within the instructional text related to the anthropomorphic images (Schneider et al., 2018), this study endowed the anthropomorphic images with cute and funny human-like dialogues such that the anthropomorphic characters seem "alive" with intents and emotions (e.g., "I will trick the user! Click me! Click me!") to augment the anthropomorphism experience. The following section discusses the conceptualization of the human-like dialogues in the context of anthropomorphism in a multimedia learning environment.

Table 1 Studies on anthropomorphism via human-like images in multimedia learning materials

Human-like dialogues as a form of anthropomorphism

Besides facial features, incorporating personalized dialogues in a digital environment can also produce the anthropomorphism effects (Araujo, 2018; Sah & Peng, 2015). In general, personalized dialogues emphasize conversational speech, active voice, and personalized pronouns, e.g., "you," "I," and "we" (Kruijff-Korbayová et al., 2008; Sah & Peng, 2015). Conversely, formal/computer-like linguistic cues feature passive voice and nominalized terms that avoid addressing the audience directly. As per the personalization principle (Brom et al., 2014, 2017; Mayer, 2014b), personalized pronouns convey social cues that promote learners’ social responses when engaging with multimedia learning environments, leading to deeper processing of the learning materials. Schneider et al. (2018) argued that the social cues derived from personalized pronouns could be regarded as human-like cues that produce the anthropomorphism effects. This premise aligns with human-computer interaction research indicating that human-like dialogues can accord anthropomorphism to artificial artifacts, e.g., chatbots or websites (Araujo, 2018; Sah & Peng, 2015). Experimental results demonstrated that anthropomorphism through human-like faces and personalized pronouns could elevate affect, motivation, and learning outcomes (Schneider et al., 2018).

However, this study diverges to some extent from the personalization premise, which embeds personalized pronouns within the instructional text or narrated materials, e.g., "you," "I," and "we," to address the audience. Specifically, this study devised and affixed human-like dialogues to the anthropomorphized learning elements to create the illusion that the elements are "alive." In other words, the anthropomorphism effect can be augmented by giving dialogue scripts to non-living entities within the multimedia learning material, which produces the illusion that the artifacts have human-like intents, emotions, and will. For instance, our study anthropomorphizes the learning element depicting the stolen data by incorporating the dialogue "Help! I am being kidnapped!" (see “Human-like dialogues” section). Human-like dialogues can also infuse "cute/Kawaii" and "funny/humor" qualities into the anthropomorphized learning elements. Such a stratagem is notable in the Cells at Work! animation, which portrays cells and pathogens in a human body as anthropomorphic personas expressing comical emotional and personality displays (Shen, 2018; Silverman, 2016). Collectively, the anthropomorphism effects and humor qualities afforded by the human-like dialogues may contribute to evoking positive affect and intrinsic motivation in learners engaging with a multimedia learning environment.

Research question and predictions

This study extends anthropomorphism in multimedia learning research to Asian learners and an information technology instructional topic, thus answering the call for new studies to feature learners of different cultural backgrounds and other learning domains (Brom et al., 2017, 2018; Stárková et al., 2019; Wong & Adesope, 2020). This study is unique in that cute and funny human-like dialogues were attached to visually anthropomorphized learning elements to accentuate the characters’ comical emotion, intent, and personality. In sum, this study aims to address the following research question:

RQ: To what extent can anthropomorphism through cute and funny human-like images and dialogues in a multimedia lesson on DDoS influence Asian learners’ positive affect [a], intrinsic motivation [b], cognitive load [c], and learning performance [d]?

Drawing on the CATLM and ICALM, this study predicts that the anthropomorphized multimedia lesson will enhance learners’ positive affect, intrinsic motivation, and learning performance relative to the non-anthropomorphized version. Following the meta-analyses by Brom et al. (2018) and Wong and Adesope (2020), which collectively indicate a robust effect of emotional design on perceived difficulty, this study predicts that the learners engaging with the anthropomorphized multimedia lesson will report lower perceived difficulty (i.e., intrinsic load and/or extraneous load) than the learners engaging with the non-anthropomorphized version. Based on the findings by Schneider et al. (2018) and Schneider et al. (2019), it is assumed that the anthropomorphized multimedia lesson will increase learners’ germane load relative to the non-anthropomorphized version.

Method

Research design

This study employed a between-subjects experimental design in which participants engaged with either the anthropomorphized or the non-anthropomorphized multimedia lesson. The experiment was conducted using an online survey tool, Alchemer, which contained the anthropomorphized or non-anthropomorphized multimedia learning animation, consent form, surveys, and posttests for learners to complete. The automatic A/B randomization feature in Alchemer randomly assigned participants to either the anthropomorphized or the non-anthropomorphized multimedia lesson.

Participants

Different levels of prior knowledge can moderate the effects of anthropomorphism in multimedia learning (Chiu et al., 2020; Schneider et al., 2019). This study therefore sought a homogeneous sample of novice learners as participants. Undergraduates pursuing accounting and business studies at a large private Asian university that conducted courses in English were invited to voluntarily participate in the experiment to generate a sample of novice learners with the same level of prior knowledge (i.e., non-IT majors). The instructional topic on how a DDoS attack occurs was related to a compulsory computer-related subject within the accounting and business academic programs. Nonetheless, because the process of a DDoS attack was not covered in the syllabus, we could ascertain that the learners had low prior knowledge about the topic presented by this study’s multimedia learning environment. The invitation was issued through an announcement posted on social media groups associated with the selected business and accounting courses.

A total of 85 accounting and business studies undergraduates responded to the invitation and participated in the experiment, which was conducted online. However, we discarded participants who might have been inattentive or experienced technical issues during the online experiment, based on the timestamp for the web page that conveyed the respective multimedia lesson. Specifically, among the 85 participants of the online experiment, 15 participants who stayed on the web page for less than the duration of the multimedia lesson (i.e., 154 seconds) were dropped from further analysis, as it was deemed that they did not engage with the respective multimedia lesson entirely or at all. Hence, the data of the remaining 70 valid participants were used for the subsequent data analyses.
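The exclusion rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`pid`, `seconds_on_page`) and the sample values are hypothetical and do not come from the study data; only the 154-second threshold is taken from the text.

```python
from dataclasses import dataclass

LESSON_DURATION_S = 154  # duration of each multimedia lesson, as reported in the text


@dataclass
class Participant:
    pid: str                # hypothetical participant identifier
    seconds_on_page: float  # hypothetical field: time spent on the lesson web page


def keep_valid(participants, min_seconds=LESSON_DURATION_S):
    """Keep participants who stayed on the lesson page at least as long as the lesson.

    Anyone who stayed for less than the lesson duration is dropped.
    """
    return [p for p in participants if p.seconds_on_page >= min_seconds]


# Made-up example records, for illustration only
sample = [
    Participant("p01", 160.0),
    Participant("p02", 42.5),
    Participant("p03", 154.0),
]
valid = keep_valid(sample)
valid_ids = [p.pid for p in valid]
```

Applied to the actual timestamps, a filter of this kind would reduce the 85 respondents to the 70 valid participants reported.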

Among the 70 valid participants, 50 were female and 20 were male. The mean age of the participants was 23 years (SD = 1.092). Thirty-three participants engaged with the anthropomorphized multimedia lesson, while 37 engaged with the non-anthropomorphized multimedia lesson.

Multimedia learning material and anthropomorphism design

The instructional topic used in this experiment was how a DDoS attack occurs. As per the learning outcomes, learners must be able to:

  1. Describe the process of a DDoS attack.

  2. Explain the causes of a DDoS attack.

  3. Apply knowledge about DDoS to suggest ways an attacker can enhance the effectiveness of a DDoS attack.

  4. Apply knowledge about DDoS to suggest ways to prevent the likelihood of a DDoS attack.

Two versions of the multimedia learning animation were developed using Adobe Animate: one with anthropomorphism (experimental) and the other without (control). Anthropomorphism design was operationalized via human-like images and dialogues, as described thoroughly in the following subsections. Apart from the anthropomorphic features on the images and on-screen texts, the anthropomorphized (experimental) and the non-anthropomorphized (control) animation were similar in all other aspects.

The spoken narration was produced using a modern text-to-speech engine, which generated a female voice (Newscaster Vocalizer, Sonya, Neural Network, US English). It has been asserted that a modern artificial voice generator can be as effective as a human voice when applied to multimedia learning materials (Craig & Schroeder, 2017). Combining the narration with the animation yielded the anthropomorphized (experimental) and the non-anthropomorphized (control) multimedia lessons, which lasted 154 seconds each. Appendix A presents the narrated content (Flesch-Kincaid Grade Level = 9.7).
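For readers unfamiliar with the readability index reported above, the standard Flesch-Kincaid Grade Level formula is reproduced here. The tool used to compute the reported value is not stated in the text, so this shows only the general definition, not the authors' exact procedure.

```latex
\mathrm{FKGL} = 0.39 \left( \frac{\text{total words}}{\text{total sentences}} \right)
              + 11.8 \left( \frac{\text{total syllables}}{\text{total words}} \right)
              - 15.59
```

A grade level of 9.7 therefore corresponds roughly to text readable by a ninth- to tenth-grade student in the US school system.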

The pacing of a multimedia learning environment can be categorized as either learner-paced or system-paced, depending on whether learners can control the multimedia lesson’s speed or pacing. The multimedia learning environment used in this study was learner-paced: the learners could pause and replay the animation embedded in the website (see “Procedure” section). We decided to use a learner-paced environment based on Wong and Adesope’s (2020) meta-analysis findings indicating that emotional designs had a greater influence on intrinsic motivation with learner-paced than with system-paced learning materials.

Human-like images

The animation with anthropomorphism featured human-like visual attributes such as eyes and mouth on the images depicting the malware, bots, and servers. Following the incorporation of funny anthropomorphic faces in the multimedia learning material in Brom, Hannemann, et al.’s (2016) study, we devised the facial attributes to look cute, funny, and expressive by rendering rounded eyes and mouths, dark sunglasses for malware, panic faces for stolen data, cartoonish sick expressions for computers under attack, and cartoonish "being dead" expressions for servers under attack (see Fig. 1). In addition, some of the learning elements depicting the malware, bots, and servers had limbs and weapons such as swords and arrows at specific points. The non-anthropomorphized version did not feature human-like visuals in the learning-relevant elements. Figure 1 illustrates the anthropomorphized and the non-anthropomorphized multimedia lessons.

Fig. 1 The anthropomorphized and the non-anthropomorphized multimedia lessons

Human-like dialogues

This study supplied the anthropomorphized malware, bots, and servers with on-screen human-like dialogues to accentuate the anthropomorphism effects, comical expressions, intents, emotions, and personalities. The non-anthropomorphized version did not feature human-like dialogues; instead, the on-screen texts formally described the actions within the DDoS process. It is noteworthy that the human-like dialogues and the formal descriptions conveyed the same essential information to the learners. Table 2 shows the human-like dialogues versus the formal descriptions associated with the learning-relevant elements in the anthropomorphized and the non-anthropomorphized multimedia lessons.

Table 2 The on-screen human-like dialogues versus the formal descriptions of the learning-relevant objects

Instruments for measuring dependent variables

Prior knowledge survey

The 5-point Likert scale survey asked learners to rate three items assessing their level of prior knowledge about the learning subject of DDoS, e.g., "I know how a distributed denial-of-service (DDoS) attack occurs." A learner’s final prior knowledge score was obtained by averaging the scores across the three items (α > .8).
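The internal consistency values (α) reported for this and the following instruments refer to Cronbach's alpha. A minimal sketch of the standard computation is shown below, assuming a complete response matrix with one row per learner and one column per item. The function name and the sample responses are made up for illustration; the software the authors used is not stated in the text.

```python
from statistics import mean, variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-learner item-score lists.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores)),
    using sample variances throughout.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    item_columns = list(zip(*responses))  # transpose rows into item columns
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Made-up responses to three 5-point items from five hypothetical learners
responses = [[4, 4, 5], [2, 2, 1], [3, 3, 3], [5, 4, 5], [1, 2, 1]]
alpha = cronbach_alpha(responses)

# Prior knowledge score per learner: the mean of the three items
prior_knowledge = [mean(row) for row in responses]
```

Values above .8, as reported for each instrument in this study, are conventionally read as good internal consistency.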

Positive affect scale (PAS)

As used in prior studies (Park et al., 2015; Plass et al., 2014; Um et al., 2012), the positive affect scale (PAS) from the PANAS scale (Watson et al., 1988) measured the level of positive emotion experienced by the learners in the present experiment. The survey asked learners to report the degree to which they experienced ten types of positive feelings, using a 5-point Likert-type scale ranging from 1 (very slightly or not at all) to 5 (very much). A learner’s total score for positive affect was obtained by summing the 10 item responses. The PAS was administered to learners twice: first, before the engagement with the multimedia lesson, to obtain the baseline positive affect; and second, after the engagement with the multimedia lesson (α > .8).

Intrinsic motivation scale

As utilized in prior studies (Plass et al., 2014; Um et al., 2012), the intrinsic motivation scale developed by Isen et al. (1987) measured the intrinsic motivation experienced by the learners in the present experiment. The 7-point Likert-type scale consists of eight items, and a learner’s total score was obtained by summing the scores across the items (α > .8).

Cognitive load scale

The present experiment utilized the scale developed by Leppink et al. (2013), which distinguishes three types of cognitive load experienced by the learners on an 11-point Likert scale: intrinsic (3 items), extraneous (3 items), and germane (4 items). A learner’s final score for each cognitive load type was calculated by averaging the scores across the relevant items. For each cognitive load scale, the internal consistency was high (α > .8).
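The scoring rules described for the self-report scales above (averaging for prior knowledge and cognitive load, summing for positive affect and intrinsic motivation) and the internal consistency index can be sketched in Python as follows. This is an illustrative sketch only, not the analysis script used in the study: the function names and the response data are hypothetical, and Cronbach's alpha is computed with the standard formula based on sample variances.

```python
from statistics import mean, variance


def scale_mean(responses):
    """Average the item responses (rule used for prior knowledge and cognitive load)."""
    return mean(responses)


def scale_sum(responses):
    """Sum the item responses (rule used for positive affect and intrinsic motivation)."""
    return sum(responses)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-learner item-response lists."""
    k = len(rows[0])                                  # number of items
    item_columns = list(zip(*rows))                   # one tuple per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses of four learners to the three prior-knowledge items.
prior_knowledge = [[1, 2, 1], [3, 3, 4], [2, 2, 2], [5, 4, 5]]

prior_scores = [scale_mean(row) for row in prior_knowledge]
alpha = cronbach_alpha(prior_knowledge)
```

With these made-up responses, alpha comes out at roughly .95, which would clear the α > .8 threshold reported for the scales.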

Learning performance posttests

Learning performance was assessed through retention and transfer posttests. The first retention question asked learners to state the reasons why an attacker would perform a DDoS attack on a server, with a maximum allocated time of 3 minutes. The possible answers presented in the learning material are monetary, fun, or political reasons, with one point awarded for each correct answer. The second retention question asked learners to describe the seven steps characterizing a DDoS attack on a server, with a maximum allocated time of 8 minutes; one point was awarded for each correct step as presented in the multimedia lesson.

The transfer test had three questions designed to assess deep learning performance, and the answers to these questions were not directly presented in the multimedia lesson. The first question asked: "If you are an attacker, how would you make a DDOS attack more effective? List down as many ways you can think of". Examples of acceptable answers are to put a Trojan virus in an unsecured website or to install a Trojan virus on the router. The second question asked: "If you are a server administrator, how would you prevent (avoid) a DDOS attack? List down as many ways you can think of". Acceptable answers include updating antivirus and firewall, promoting security awareness, or doing a routine check-up. The third question asked: "If you are a computer user, how would you prevent (avoid) the risk of your personal computers or other personal devices from getting infected and becoming botnets? List down as many ways as you can think of". Acceptable answers include ignoring suspicious emails, activating antivirus and firewall, or performing regular updates on antivirus. For each transfer test question, learners were given a maximum of 8 minutes to answer, with one point awarded for each acceptable answer. Scoring was done by three examiners who were blind to the conditions, using a prepared scoring rubric. Any scoring differences were resolved through discussion until consensus was reached.

Procedure

Undergraduates pursuing accounting and business studies at a large private Asian university that conducted courses in English were invited to participate in the experiment voluntarily. The invitation was made through an announcement posted on social media groups associated with the selected business and accounting courses. Learners who responded to the invitation participated in the experiment, which was conducted online. Figure 2 illustrates the experimental process from a learner’s viewpoint. The following steps describe the online experiment in detail:

  1.

    The announcement invited potential participants to attend a virtual meeting (i.e., Google Meet) held on a specified date and time—that is, the participants engaged in the online experiment simultaneously. The announcement reminded potential participants to use desktops or laptops as well as headphones or earphones.

  2.

    During the virtual meeting, the participants were given a link to access the online survey tool. Using scripting commands, the online survey tool was set to allow access only from desktop and laptop computers, so that the online survey could not be accessed through mobile devices such as smartphones or tablets. This ensured that the respective multimedia lesson could be viewed only on desktop or laptop screens and thus was not confounded by small screen sizes.

  3.

    Upon accessing the online survey tool, participants indicated their consent to take part in the experiment and were then assigned to either the anthropomorphized or the non-anthropomorphized multimedia lesson condition through the automatic A/B randomization feature of the online survey tool. The online survey tool was set to allow only one login per profile (tied to the participant’s official university email account) to prevent multiple attempts at the experiment. Moreover, the survey tool presented each section of the experimental process on a separate web page, and it prevented participants from returning to previous web pages once they had reached a given web page.

  4.

    On the subsequent web page, the online tool instructed learners to use earphones for the upcoming multimedia learning video. To ensure that participants would be able to hear the narration of the upcoming video, the online survey tool featured an audio test in which the participants had to listen to the spoken word "Laptop." Participants then had to type and submit the correct word to proceed to the following web page.

  5.

    The web page instructed participants to complete the survey capturing their demographic details.

  6.

    On the subsequent web page, participants were asked to fill out the prior knowledge survey about DDoS.

  7.

    The following web page asked participants to fill out the positive affect scale to assess their baseline positive affect before engaging with the respective multimedia lesson.

  8.

    Participants were then directed to a web page featuring either the anthropomorphized or non-anthropomorphized multimedia lesson (depending on the A/B randomized assignment). They were allowed to view and replay the respective video within a maximum allotted time of 15 min (with a countdown timer displayed on the web page). The web page would automatically advance to the next once the allotted time was up. However, a text instruction informed the participants that they could click a button to access the following web page if they believed that they had engaged with the multimedia lesson sufficiently, although once the next page was accessed, they could not return to the multimedia lesson web page.

  9.

    The online survey tool then administered the following surveys to participants (in order by web pages): positive affect scale (to measure positive affect experienced after the engagement with the respective multimedia lesson), intrinsic motivation scale, and cognitive load scale (intrinsic, extraneous, and germane).

  10.

    Participants were then asked to complete the following posttests by typing their answers in the boxes provided (in order by web pages): retention question #1, retention question #2, transfer question #1, transfer question #2, and transfer question #3. The respective web page gave participants 3 min to complete retention question #1. For each of the other questions, participants were given 8 min before the respective web page automatically advanced to the next. On each of these web pages, a statement informed the learners that it was vital that they answer the posttests without obtaining answers online. The statement also assured the participants that their answers would not in any way affect their course grades. In this regard, participants had little or no incentive to acquire solutions from online or other external sources.

  11.

    The last web page of the online survey tool debriefed and thanked the participants and concluded the session. Additionally, the page provided the contact details of the first and second authors of this paper.

Fig. 2

Flowchart of a learner’s experimental process
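The forward-only page sequence and the time limits in the procedure above can be summarized in a small sketch. It is only an illustration of the described flow, not the configuration of the survey tool that was used: the stage names and helper functions are hypothetical, and `None` marks pages for which no time limit is stated.

```python
# Hypothetical stage names; time limits (in minutes) follow the procedure text.
STAGES = [
    ("consent_and_assignment", None),
    ("audio_check", None),
    ("demographics", None),
    ("prior_knowledge", None),
    ("baseline_positive_affect", None),
    ("multimedia_lesson", 15),
    ("post_lesson_surveys", None),
    ("retention_q1", 3),
    ("retention_q2", 8),
    ("transfer_q1", 8),
    ("transfer_q2", 8),
    ("transfer_q3", 8),
    ("debriefing", None),
]


def max_timed_minutes(stages):
    """Upper bound on the time spent on pages that have a stated limit."""
    return sum(limit for _, limit in stages if limit is not None)


def next_stage(current):
    """Forward-only navigation: return the next stage name, or None at the end."""
    names = [name for name, _ in STAGES]
    index = names.index(current)
    return names[index + 1] if index + 1 < len(names) else None


total_timed = max_timed_minutes(STAGES)
```

Under these limits, the timed pages (lesson plus five posttest questions) add up to at most 50 minutes per participant.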

Data analyses

Descriptive and correlation analysis

Table 3 shows the bivariate correlations between the dependent measures, while the means and standard deviations of the measures are shown in Table 4. As expected, there was a positive correlation between a learner’s prior knowledge and retention performance; this aligns with the notion that pre-existing domain knowledge must be integrated with new multimedia information for learning to occur (Mayer, 2019; Mayer & Moreno, 1998). However, the correlation between a learner’s prior knowledge and transfer performance was not significant, plausibly because the transfer questions were challenging and their answers could not be obtained directly from the multimedia lesson. A learner’s baseline positive affect was positively correlated with intrinsic motivation, which is in line with the literature indicating the association between positive affect and interest in and enjoyment of an activity for its own sake (Isen et al., 1987; Pekrun, 2006; Pekrun & Linnenbrink-Garcia, 2012; Plass & Kalyuga, 2019). A learner’s baseline positive affect was also positively correlated with germane load (measured as the perceived understanding of the learning material) and transfer performance, which supports the relevance of positive affect in cognitive processing within a multimedia learning environment per the CATLM (Mayer, 2014a; Moreno & Mayer, 2007) and ICALM (Plass & Kaplan, 2016). The extraneous load was negatively correlated with germane load and retention performance; this accords with the CLT assertion that the additional cognitive processing imposed on learners by poorly designed learning environments may leave fewer mental resources available for essential processing, thereby leading to decreased learning outcomes (Mayer & Moreno, 1998, 2003; Sweller, 1994). Relatedly, the positive correlation between germane load and retention performance, and the positive correlation between germane load and transfer performance, collectively lend support to Leppink et al.’s (2013) scale as a measure of the cognitive load that learners devote to generative processing, which contributes to schemata construction and understanding of the learning materials and thereby fosters learning.

Table 3 Bivariate correlations between the measures
Table 4 Means and standard deviations of the measures

Control measures

Prior knowledge

A one-way ANOVA was conducted to assess whether learners’ prior knowledge differed between the anthropomorphized and non-anthropomorphized conditions. The result indicated that learners’ prior knowledge did not differ significantly between the conditions, F(1, 68) = .464, p = .498, η² = .007. Due to the significant correlations (see Table 3), prior knowledge was added as a covariate for further analyses concerning positive affect after the engagement with the respective multimedia lesson, intrinsic motivation, and retention as well as transfer performance.

Baseline positive affect

A learner’s baseline positive affect was measured through the PAS score reported by the learner before the engagement with the respective multimedia lesson. A one-way ANOVA was performed to compare the baseline positive affect between the anthropomorphized and non-anthropomorphized conditions. The result demonstrated that the learners’ baseline positive affect did not significantly differ between the conditions, F(1, 68) = .23, p = .633, η² = .003. Due to the significant correlations (see Table 3), baseline positive affect was added as a covariate for further analyses concerning positive affect after the engagement with the respective multimedia lesson, intrinsic motivation, germane load, and transfer performance.

Time spent on engaging with the multimedia lesson

A learner’s time spent engaging with the multimedia lesson was captured through the timestamp duration of the web page that displayed the respective multimedia lesson. While there was a trend in which the learners in the anthropomorphized condition engaged with the multimedia lesson longer than those in the non-anthropomorphized condition, the difference was marginal as per the one-way ANOVA result, F(1, 68) = 2.985, p = .089, η² = .042.
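For a one-way ANOVA, the reported eta squared can be recovered from the F statistic and its degrees of freedom, because η² = (F × df_effect) / (F × df_effect + df_error). The following sketch is not part of the original analysis and uses a hypothetical function name; it simply checks the three control-measure results above against that identity.

```python
def eta_squared_from_f(f_value, df_effect, df_error):
    """Eta squared implied by an F statistic (equals partial eta squared in a one-way design)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics reported for the three control measures, each with df = (1, 68).
prior_knowledge_eta = round(eta_squared_from_f(0.464, 1, 68), 3)
baseline_affect_eta = round(eta_squared_from_f(0.23, 1, 68), 3)
time_spent_eta = round(eta_squared_from_f(2.985, 1, 68), 3)
```

The three values round to .007, .003, and .042, matching the effect sizes reported for prior knowledge, baseline positive affect, and time spent on the lesson.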

Positive affect after engagement with the multimedia lesson

This study conducted a mixed ANCOVA with prior knowledge scores added as a covariate, anthropomorphism condition (anthropomorphized and non-anthropomorphized) as a between-subjects factor, and PAS scores obtained before and after the engagement with the respective multimedia lesson as within-subjects variables. The result showed that the interaction effect between the anthropomorphism condition and the PAS scores was significant, F(1, 52.375) = 5.643, p = .02, η² = .078. Analyses of simple main effects through repeated measures ANOVAs indicated that the PAS scores of learners increased significantly from the baseline positive affect to the positive affect experienced after the engagement with the respective multimedia lesson in both the anthropomorphized condition, F(1, 288.545) = 30.732, p < .001, η² = .49, and the non-anthropomorphized condition, F(1, 47.041) = 4.945, p = .033, η² = .12. Based on the effect sizes (as reflected by the η² values), it is notable that the anthropomorphized condition led to a larger increase in positive affect than the non-anthropomorphized condition. A one-way ANCOVA was conducted to compare the PAS scores reported by the learners after the engagement with the respective multimedia lesson between the two conditions. Controlling for the effects of prior knowledge and baseline positive affect, learners in the anthropomorphized condition experienced significantly higher positive affect after engaging with the respective multimedia lesson than those in the non-anthropomorphized condition, F(1, 66) = 4.736, p = .033, η² = .067.

Intrinsic motivation

A one-way ANCOVA with prior knowledge and baseline positive affect added as covariates was conducted to compare the intrinsic motivation scores of learners between the anthropomorphized and the non-anthropomorphized conditions. The result indicated that the multimedia lesson with anthropomorphism did not affect learners’ intrinsic motivation differently than the version without anthropomorphism, F(1, 66) = .928, p = .340, η² = .014.

Cognitive load

A MANCOVA with baseline positive affect added as a covariate in the model was performed to compare the intrinsic, extraneous, and germane load scores between the anthropomorphized and the non-anthropomorphized conditions. The significance values produced by Box’s Test of Equality of Covariance Matrices as well as Levene’s Test of Equality of Error Variances revealed that the assumption of homogeneity of variance-covariance matrices and the assumption of the equality of variance for each variable were not violated. The result showed a significant difference between the conditions on the combined dependent variables, F(3, 65) = 5.209, Λ = .806, p = .003, η² = .194. Using the Bonferroni adjusted alpha level of .017, the post-hoc analyses concerning the between-subjects effect for each dependent variable demonstrated that only intrinsic load differed significantly between the conditions; specifically, the learners in the anthropomorphized condition reported significantly lower intrinsic load than those in the non-anthropomorphized condition, F(1, 60.739) = 14.031, p < .001, η² = .173.

Learning performance

A MANCOVA with prior knowledge and baseline positive affect added as covariates in the model was conducted to compare the retention and transfer scores between the anthropomorphized and non-anthropomorphized conditions. The significance values produced by Box’s Test of Equality of Covariance Matrices and Levene’s Test of Equality of Error Variances revealed that the assumption of homogeneity of variance-covariance matrices and the assumption of the equality of variance for each variable were not violated. The difference between the conditions on the combined dependent variables was non-significant, F(2, 65) = .435, Λ = .987, p = .649, η² = .013. Using the Bonferroni adjusted alpha level of .025, the post-hoc analyses concerning the between-subjects effect for each dependent variable indicated that the anthropomorphized and the non-anthropomorphized multimedia lesson did not differently affect retention, F(1, 66) = .797, p = .375, η² = .012, or transfer performance, F(1, 66) = .026, p = .873, η² = .000.
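The Bonferroni adjusted alpha levels and the multivariate effect sizes reported for the two MANCOVAs follow from simple arithmetic. The sketch below is illustrative only: it assumes a familywise alpha of .05 (not stated explicitly in the text) and the usual relation partial η² = 1 − Λ^(1/s), with s = 1 when two conditions are compared. The function names are hypothetical.

```python
def bonferroni_alpha(familywise_alpha, n_tests):
    """Per-test alpha after a Bonferroni adjustment."""
    return familywise_alpha / n_tests


def partial_eta_squared_from_wilks(wilks_lambda, s=1):
    """Multivariate partial eta squared from Wilks' lambda (s = 1 for two groups)."""
    return 1 - wilks_lambda ** (1 / s)


# Cognitive load MANCOVA: three dependent variables, lambda = .806.
cognitive_load_alpha = round(bonferroni_alpha(0.05, 3), 3)
cognitive_load_eta = round(partial_eta_squared_from_wilks(0.806), 3)

# Learning performance MANCOVA: two dependent variables, lambda = .987.
learning_alpha = round(bonferroni_alpha(0.05, 2), 3)
learning_eta = round(partial_eta_squared_from_wilks(0.987), 3)
```

Under these assumptions the adjusted alpha levels are .017 and .025, and the effect sizes are .194 and .013, in agreement with the values reported above.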

Discussion

This study revealed that incorporating cute and funny human-like images and dialogues to anthropomorphize the learning elements depicting the malware, bots, and servers in the multimedia lesson on how a DDoS attack occurs can elevate the positive affect of learners. Specifically, while the learners in both the anthropomorphized and the non-anthropomorphized conditions experienced higher positive affect after the learning engagement than at baseline, the anthropomorphized version led to a larger increase in positive affect as well as a higher level of positive affect after the learning engagement than the non-anthropomorphized version. Surprisingly, the learners’ intrinsic motivation (i.e., interest in and enjoyment of an activity for its own sake) was not impacted by the anthropomorphic features, even though there was a strong positive correlation between positive affect after the learning engagement and intrinsic motivation. Collectively, these mixed results indicate partial support for the anthropomorphism effects on learners’ affective-motivational factors. The anthropomorphism in this study’s multimedia learning environment did not enhance the learners’ retention or transfer performance. Hence, from the CATLM (Moreno & Mayer, 2007) and ICALM (Plass & Kaplan, 2016) perspectives, imbuing the learning-relevant materials with cute and funny human-like images and dialogues did not boost learners’ intrinsic motivation; thus, the learners in the anthropomorphism condition did not devote additional cognitive effort to process the learning materials more deeply.

Some possibilities can account for the lack of anthropomorphism effects on the learners’ intrinsic motivational states and subsequent learning performance. First, these findings could be attributed to the Asian learners’ tendency to be concerned about the effort and performance consequences regarding the learning task (Wong & Adesope, 2020). We regard this as a plausible explanation: based on our long teaching experience (>15 years), we have observed such a culture among our university students (and also among learners in our nation in general), who overtly emphasize grades and performance goals, especially to compete among peers and to avoid failure (Grant & Dweck, 2001; Kaplan & Maehr, 2007; King et al., 2012; Salili, 1996). Per the literature on achievement goals and intrinsic motivation, learners driven by performance goals rather than mastery goals are more susceptible to evaluative pressures, anxiety, and shame, resulting in less interest in and enjoyment of the learning activity for its own sake (Pekrun, 2006; Plass & Kaplan, 2016; Rawsthorne & Elliot, 1999). It is possible that because the study’s learners might have been driven more by an intense focus on the performance consequences of the learning task, they were less likely to derive additional interest in and enjoyment of the learning task for its own sake (i.e., intrinsic motivation) from the anthropomorphic images and dialogues.

Additionally, the null anthropomorphism effects on motivation and learning could be attributed to our study’s sample of non-IT major learners. The instructional topic presented by the multimedia learning environment was how a DDoS attack occurs, which was related indirectly to a computer-related subject within the accounting and business academic programs. However, because the subject was a relatively minor part of their accounting and business academic programs, this study’s learners might have felt that the DDoS topic was less familiar and less relevant to their courses and, hence, were less likely to benefit from the anthropomorphic details in terms of motivation and learning. A similar observation was noted in recent studies (Shangguan, Gong, et al., 2020; Shangguan, Wang, et al., 2020). The researchers argued that the lack of familiarity and prior knowledge regarding the learning subject could obstruct learners from sustaining intrinsic motivation even when engaging with a multimedia lesson infused with emotional design. Our initial pilot study with learners pursuing IT courses supported this notion: the anthropomorphized multimedia lesson on DDoS marginally elevated the intrinsic motivation of the IT majors relative to the non-anthropomorphized version (Liew, 2021).

Moreover, Stárková et al. (2019) argued that learners’ familiarity with anthropomorphism design in the formal educational systems could influence the anthropomorphism effects. Specifically, the researchers found that anthropomorphism design had negligible effects on the affective-motivational states and learning performance of Czech learners, and thus considered that these learners were less accustomed to anthropomorphism attributes in the educational context than learners from other cultures (e.g., the US) because of their more formal schooling system. Similarly, the educational system experienced by our learners is formal, and hence, anthropomorphism design in learning materials is not a norm. Therefore, the lack of familiarity with anthropomorphism attributes in the educational context could have influenced the anthropomorphism effects on the intrinsic motivation and learning performance observed in this study.

This study revealed that the learners in the anthropomorphized multimedia lesson reported significantly less intrinsic load concerning the learning topic than those in the non-anthropomorphized version, cohering with recent meta-analyses indicating that anthropomorphism and pleasant colors robustly reduce perceived difficulty (Brom et al., 2018; Wong & Adesope, 2020). According to Brom et al. (2018), emotional design in anthropomorphism and pleasant colors can produce aesthetically pleasing materials, subsequently offering the illusion that the learning topics or materials are easier and require less effort to process (Salomon, 1984; Tractinsky et al., 2000). In this study’s context, the anthropomorphic cute and funny human-like visuals and dialogues depicting the malware, bots, and servers might have caused the learners to perceive the educational subject regarding DDoS as less "serious" and less challenging, which translated to lower intrinsic load (i.e., reduced perceived difficulty of the learning topic).

This study revealed that the anthropomorphized multimedia lesson did not cause learners to report higher extraneous load regarding the information conveyed through the multimedia lesson (i.e., the instruction regarded as unclear or difficult to understand) than the non-anthropomorphized version. Interestingly, this finding occurred despite this study utilizing complex-anthropomorphism featuring emotionally expressive facial features, weapons, and limbs on the learning objects representing the malware, bots, and servers. Complex-anthropomorphism has been shown to induce more task-irrelevant thoughts (e.g., higher extraneous load), especially among low-prior knowledge learners (Schneider et al., 2019). Moreover, the on-screen human-like dialogues within the DDoS topic did not cause learners to perceive the conveyed information as less clear or more challenging to understand. From the CLT perspective, we may argue that the human-like images and dialogues conformed with the minimalist thesis (Brom et al., 2018), insofar as the anthropomorphism attributes did not impose additional information "chunks" compared to the control learning elements devoid of the anthropomorphic details. This finding is in line with most studies that have found that rendering anthropomorphism into relevant images in the multimedia lesson generally does not impose extraneous load or an added sense of difficulty concerning the conveyed information on learners (Brom, Hannemann, et al., 2016; Park et al., 2015; Plass et al., 2014). On the other hand, it is plausible that the learners did not regard the anthropomorphized multimedia lesson as unclear or difficult to comprehend because the cute and funny human-like visuals and dialogues might have led them to believe that the instructional topic was easy to understand (Brom et al., 2018) and, relatedly, to perceive the instructional design as clear and comprehensible.

Within the anthropomorphism context, Schneider et al. (2018) and Schneider et al. (2019) relate germane load (also referred to as perceived understanding by the researchers) to the generative processing efforts required to understand the essential learning materials (Kalyuga, 2011). Schneider et al.’s (2018) study found that incorporating anthropomorphism into decorative pictures (i.e., pictures that do not provide learning-relevant information) led to a higher germane load, suggesting that anthropomorphized decorative images can foster relevant cognitive development processes and schemata construction during learning. Conversely, while this study’s data demonstrated significant correlations between germane load and retention as well as transfer performance, anthropomorphizing learning-relevant elements representing malware, bots, and servers within the DDoS process did not differently affect the germane load of learners. Pending further work, one may suggest that when decorative pictures are used (Schneider et al., 2018), anthropomorphism can facilitate learning-relevant processing and perceived understanding of learners (germane load), but anthropomorphism in learning-relevant elements may not have a robust effect on the perceived understanding of learners (germane load). That said, cognitive load types are difficult to distinguish and measure through self-reported surveys (De Jong, 2010; Kalyuga, 2011), as highlighted by Brom et al. (2018), which could have influenced this study’s cognitive load findings. Regardless, our findings offer valuable insights concerning the cognitive effects of anthropomorphism in a multimedia learning environment, given that the use of contemporary cognitive load scales, e.g., Leppink et al.’s (2013) scale differentiating intrinsic, extraneous, and germane load, is scarce within the emotional design research (Brom et al., 2018).

The learners spent marginally more time engaging with the multimedia lesson imbued with anthropomorphism than with the multimedia lesson without anthropomorphism. This observation could be indicative of anthropomorphism’s effects on learners’ situational interest. Also, it is plausible that the anthropomorphic elements in this study had attention-capturing effects on learners, in line with an eye-tracking study demonstrating that anthropomorphism can attract learners’ visual attention (Park et al., 2015). Further investigation through eye-tracking studies can clarify the connection between the duration of learners’ fixation on the anthropomorphic features, the duration of engagement with the multimedia lesson, interest, and learning outcome.

From an instructional design perspective, this study shows that multimedia lessons about IT-related topics can be anthropomorphized with cute and funny human-like images and dialogues. These features can be quickly rendered with minimal alterations into learning elements associated with information technology concepts and processes. Given that infusing cute and funny human-like visuals and dialogues can benefit positive affect and perceived difficulty, anthropomorphism in multimedia learning environments can support the dissemination and utilization of e-learning media regarding information technology knowledge, which is increasingly essential and applicable in today’s digital society.

Limitations and future directions

Some limitations of this research are acknowledged. The primary limitation of this work concerns the small sample size (approximately 30 per cell). A sample size of 60 per cell should be ideal for detecting a medium effect size. Moreover, while we strived to establish controls in the online experiment design (see “Procedure” section) to minimize confounding factors, the online experiment might still lack internal controls compared to a laboratory-style experimental design. In addition, the multimedia lesson had a short duration (154 seconds); thus, the anthropomorphic features’ affective, motivational, and cognitive effects beyond brief learning engagement were not ascertained. Hence, future research can be conducted with significantly larger sample sizes, with multimedia learning materials of longer duration, or through longitudinal studies with repeated exposures to multimedia learning materials with anthropomorphism. While this study offers insights that extend the research base to Asian learners, the findings cannot be generalized to other cultural backgrounds. Therefore, future research comparing anthropomorphism effects across participants with different demographic profiles or cultural backgrounds is warranted.
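To put the sample size remark in context, the per-group sample size for a two-group comparison can be approximated with the normal-approximation formula n = 2 × ((z for 1 − α/2 + z for the target power) / d)². The sketch below assumes a two-sided alpha of .05, a power of .80, and Cohen's d = 0.5 as the medium effect; none of these values is stated in the text, and the function name is hypothetical. An exact calculation based on the t distribution gives a slightly larger n.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-group mean comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for the test
    z_power = NormalDist().inv_cdf(power)           # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


medium_effect_n = n_per_group(0.5)
large_effect_n = n_per_group(0.8)
```

Under these assumptions, the approximation suggests about 63 learners per condition for a medium effect, which is in the same range as the 60 per cell mentioned above and roughly double the cell size of the present study.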

This study’s findings concern the anthropomorphism effects with novice learners possessing low prior knowledge of the learning topic (non-IT majors). As the literature has shown that prior knowledge of learners can moderate the effects of anthropomorphism (Schneider et al., 2019; Shangguan, Gong, et al., 2020; Shangguan, Wang, et al., 2020), future research should consider this factor when investigating anthropomorphism in the multimedia learning context. Emotional design research utilizing eye-tracking and emotion detection tools is scant despite its potential for providing unique insights beyond learners’ self-reported measures (Brom, Stárková, et al., 2016; Le et al., 2018; Park et al., 2015; Uzun & Yıldırım, 2018). Hence, future research on anthropomorphism can utilize such tools to clarify anthropomorphism’s affective, motivational, and cognitive impacts in a multimedia learning environment.