Introduction

Comics are becoming more and more popular as teaching material (Farinella 2018; Bach et al. 2018; Tatalovic 2009; Bentz 2020; Tribull 2017) and many studies detect distinct advantages of comics compared to other teaching material in the domains of understanding, retention of knowledge, environmental concern, and willingness to change behavior (e.g. Wang et al. 2019; Aleixo and Sumner 2017; Topkaya 2016; Theodorou et al. 2018). However, none of these studies have examined learning with comics in the context of Education for Sustainable Development (ESD). ESD, in its combination of natural and human sciences, is of an inherently geographical nature, particularly when treating spatially relevant goals such as increasing climate action, alleviation of poverty, and fighting hunger. These kinds of topics touch on global phenomena with local impacts and, vice versa, patterns of dispersion, allocation of resources, and spatial relations between humans and other stakeholders on the surface of this planet, thus making them core geographical problems. This makes geographical communication devices particularly relevant for ESD and climate change education. Climate change education has become somewhat of a buzzword in the past few years, and has a global significance, with UNESCO establishing networks, webinars, and prizes. Naturally, climate change plays an important role in ESD. This is a complex topic, posing a challenge to many teachers, especially with young learners. Traditional text book material often fails to meet the need when it comes to teaching the scientific mechanisms of climate change in conjunction with the social impacts they imply (e.g. Lütje and Budke 2022, p. 51). An understanding of both is necessary to be able to relate to the measures we have to take in mitigating climate change. This is a complex goal, but the urgency of educating upcoming generations about one of the biggest challenges of our times dictates the need to find solutions. 
In the present study, we want to investigate the potential of an often underestimated medium, namely comics, for teaching about climate change. We want to know whether comics are better for teaching about climate change than more traditional media. The present study aims at clarifying the role of comics in education for sustainable development and in particular climate change education. Our research provides evidence of an effective way to support learning in these areas. Several previous studies already point in that direction (see “Understanding comics”–“Retention after reading comics”; e.g. Topkaya (2016)) but are not thorough enough or are slightly off target for our specific question. The following sections will briefly summarize the most important studies concerning teaching with comics, before we present our own study.

Understanding comics

Wang et al. (2019) compared comics to illustrated text and infographics regarding understanding, retention, and general attractiveness with 38 university students aged 18–35. In their study, they tested their unique approach of “data comics”, which are diagrams organized in the typical layout of comics, using sequential panels as organizing elements. They found that this feature helped the students to understand content more efficiently than infographics or illustrated text. They also found that these special diagrams were the most engaging of the media in the participants’ subjective experience. Although this study does point in an interesting direction, Wang et al. (2019) stress the difference between traditional or science comics and their own “data comics”. Research results concerning comics “may or may not generalize to data comics” (Wang et al. 2019, p. 3). A key difference is the lack of protagonists in data comics. Protagonists are important drivers of stories and reveal perspectives in a more or less complex network of actors. In climate change education, this is particularly important when dealing with human–environmental interaction. Here, we can see how human activity changes natural processes and system behaviors which rebound to directly influence human behavior in adapting to the new environments. The threatening nature of these disruptive changes raises the need for informed actions to prevent, mitigate, or at least prepare for climate change and thus calls for a deep understanding of the underlying natural processes for decision making on a socio-political level. Climate change education in the sense of ESD requires us to look at social, political, environmental, and economic points of view simultaneously. Here, the focal point lies in the actions of stakeholders, who can be represented as protagonists in a comic story. Protagonists also support emotional storytelling and the portrayal of human interaction with the (natural) world, i.e. humans’ profound embeddedness in this world.

In another study, Topkaya and Dogan (2020) compared the academic achievements of 83 7th graders in a social studies course using a quasi-experimental research design. The experimental group received comics to learn about how organizations and individuals can fight climate change. They achieved significantly better results in the posttest than the control group, which was taught in a traditional way not using comics. However, the comics used were created with a somewhat limited online graphics tool, which offers preset elements to be composed on the computer. More importantly, it is unclear to us whether the difference in performance between the two groups is attributable exclusively to the use of comics or to other elements of the lesson plan as well, which differed between the two groups.

Hosler and Boomer (2011) present a study examining the effects of learning about biology in four different groups of college students comprising 14 to 38 students each. They were able to show that students learning with comics do not perform worse than those learning with regular textbook material regarding apprehension of content. However, they concede that comparability of their sample groups seems to be an issue; they call for experiments with a “more precisely controlled experimental design, specifically creating randomly selected groups that share similar demographics” (Hosler and Boomer 2011, p. 316). Furthermore, they raise the question as to whether we can generalize their results to other comics dealing with different subject matters. Thus, they state that testing other comics is “an essential next step” suggesting “a well-defined control that receives content instruction without a comic and a treatment group that receives the same instruction supplemented with an appropriate comic” (Hosler and Boomer 2011).

Retention after reading comics

Aleixo and Sumner (2017) recruited 90 participants aged 18–84 years (mean: 24.4, SD 11.17) from a university campus. Participants were assigned randomly to three groups, which were presented with a text, a comic, and a comic with incongruent (random) images, respectively. Immediately afterwards, the researchers tested the participants’ memorization of factual knowledge on psychobiology presented in the material with a set of multiple choice questions. Using ANOVA, they found that the memory scores of the comic group were significantly higher than those of both the text-only group and the incongruent condition. While this study made important inquiries into short-term memory, the researchers did not shed light on memorization over greater time spans, such as a week or a month. Aleixo and Sumner (2017, p. 87) recommend further studies with more participants and over a longer time period.

Hussein (2020) conducted an experiment with 61 female primary school students in social studies. After an initial test immediately following the treatment, which involved a comic about social issues, Hussein applied an identical follow-up test 2 weeks later. She found a significant difference in the mean scores, showing the advantage of learning with comics for retention of content.

Despite these encouraging results, we do not know whether they are valid for complex comics from the field of ESD. In this context, as well as in climate change education, learning material is often required to present information both on processes from the natural sciences and from human geography or social studies alike.

Environmental concern through comics

Several researchers found that teaching with comics may influence attitudes towards the environment (e.g. Maggiulli 2020; Richter et al. 2015; Topkaya 2016; Bozdogan 2011; Munawwaroh et al. 2018). In the context of managing invasive species for instance, Maggiulli (2020) uses comics to break with non-scientific perceptions of good and evil. She emphasizes techniques such as storytelling and narrative driving the students to engage deeply with the questions and problems at hand. This may foster attitudinal change (Maggiulli 2020).

Richter et al. (2015, p. 8862) used a comic whose title translates to “protect, because it’s richness” to conduct a study with 542 students from six primary schools in Madagascar. They state that environmental education can affect attitudes and behavior of students in positive ways. The use of the comic did indeed significantly improve the learning outcomes of their participants compared to the control groups. However, the research instrument was designed to assess factual and conceptual knowledge only and did not measure attitudes like environmental concern directly.

Topkaya (2016) reports significantly higher scores on a scale determining environmental attitudes of students who learned about environmental issues with comics compared to students who did not learn with comics. It is questionable, however, whether the effect was solely attributable to the comics used in the teaching process, which lasted over 3 weeks. The teaching methods and materials used seemed to differ between the control and experimental groups. This makes it hard to compare the two groups.

In an experimental study with 89 university level preservice teachers, Bozdogan (2011) observed that learning with visual materials, including but not limited to comics, had a higher impact on changes of attitude than lectures without visual materials. Qualifying this result is the fact that the differences apparently were not very strong: “A meaningful difference did not exist between the mean posttest scores of the two groups” (Bozdogan 2011, p. 229). More importantly, it remains unclear whether any differences between the groups should be attributed to the visual material or to the experimental group’s engagement in other learning activities, which were not provided to the control group.

Munawwaroh et al. (2018) also detect differences in attitudes among students who learned about climate change using comics in a quasi-experimental research design with 56 7th grade junior high school students. However, apparently the control group did not receive learning material that was as diverse as the comic presented to the experimental group. Therefore, any effect detected in posttest results may be attributed to the different content provided to the two groups.

Change of behavioral intentions through comics

Persuasive media can induce a willingness to change behavior, for instance to take climate change mitigative actions (Sinatra et al. 2012). The presence of perceived social relationships between audience and media persona can be a key factor in the success of this kind of communication (Park 2020). The display of humanlike gestures in learning material can be used to advance this relationship and positively influence learners to thoroughly investigate the material (Fiorella and Mayer 2021a).

Dobbins (2016) argues that comics can influence behavioral intentions in the field of medicine through their narratives particularly when mirroring the readers’ (social) spaces and bodies, because the characters in the comic can serve as a kind of role model. Kearns and Kearns (2020) explore comics’ potential to help change health behavior during the COVID pandemic through visuals, text, and storytelling. “Simplification, schematic representation, and metaphor can make abstract concepts tangible” (Kearns and Kearns 2020, p. 147).

In a very small-scale study, Katoppo et al. (2020) tested the effects of a comic about planting activities on fourth graders. They designed a comic to be close to the participants’ experiences. The authors used the school buildings and some of the children as references for locations and characters. They conclude that the comic led to concrete action, i.e. planting activities in the school’s vicinity. However, there was no control group, which makes it hard to attribute the results to the use of the comic. The planting activity could have had a notable effect as well.

Theodorou et al. (2018) found that comics can have a great effect on learners’ willingness to change behavior regarding the protection of the climate. They report that 44.4% of 459 participating students from grades four through seven were willing to change their behavior after having worked with comics on climate change. They also observed a positive change in knowledge and attitude towards problems of climate change. However, the students produced their own comics, so the question remains whether the effect would have been as strong if consuming pre-existing comic stories as teaching material. Unfortunately, this study lacked a control group.

Zhang-Kennedy et al. (2016) studied behavioral change after 52 participants recruited from a university had read interactive online comics about computer security. They report that 80% of participants with weak passwords actually changed their passwords after reading the comics, moving from “’I should do that’ to ‘I did that.’” (Zhang-Kennedy et al. 2016, p. 235, emphasis in original). They observed behavioral change in four more fields; the most noteworthy change was perhaps in sharing information. Reading the comics led 69% of the participants to share the new information with friends and family “without prompting” (Zhang-Kennedy et al. 2016, p. 236). This study sheds light on the potential of comics to influence behavior, and it would be interesting to see whether this potential can be used in education about climate change.

Ario et al. (2020) distributed a comic-like pamphlet about waste disposal and found, in a survey conducted afterwards with 256 participants, that the information given had changed the waste disposal behavior of the recipients. Orderly waste disposal increased by almost 40% according to the survey. 99.2% of the participants found the comic material “effective” or “highly effective” in this respect. It is somewhat questionable whether the participants actually changed their behavior, since the result relies on self-reportage through questionnaires and not field observations of actual waste disposal behavior. Social desires to please or support the researchers cannot be completely ruled out as having played a role. However, this study seems to show the participants’ strong intention to change their behavior.

Comics for education about climate change

There are numerous examples of comics that lend themselves to teaching the principles of sustainability. Some comic authors might use journalistic techniques to gain insights into the topics at hand, for instance, the anthology Maroni. Les gens du fleuve (Copin et al. 2022), which is a collection of comics about the encounters of comic artists from mainland France with the indigenous population of the overseas department of French Guiana. They reflect upon the relationships between society, politics, and the rainforest with a particular view of the individual. Others might use science fiction or imaginations featuring monsters or yet to be invented technology, such as Marvel’s take on Godzilla (Doug Moench et al. 2006; first released in 1977). A hazardous monster is released from the melting Alaskan ice threatening humanity, which can be read as a metaphor of natural hazards resulting from climate change. These and many more comics do have a place in education, being masterful pieces of art relating to sustainability issues in fascinating ways. However, for this study, we are deeply interested in the opportunities comics offer to explicitly target education on climate change by incorporating specific media such as maps, charts, and diagrams stemming from more formal scientific contexts. Thus, we focus on science comics based on the definition by Tatalovic (2009, p. 4) as “educational science themed comics [which] may help to promote and explain science to students and the general public”. Considering the studies detailed above, we must now investigate learning with comics which are tailored exclusively for teaching about climate change. We need to inquire into the medium’s potential to communicate ESD and climate change education using its often unique multimodal capacities in storytelling, e.g. by integrating maps, charts, or diagrams.
Do comics featuring these characteristics really improve learning about climate change better than other, more traditional media like text? We need to test comics that communicate in a geographical way through a thorough experimental research design. For this purpose, and based on the previous research detailed above, we hypothesize:

H1: Learning with geographical comics about climate change is more effective than learning with text only.

  1. Learning with comics helps learners to understand spatial causal relationships better than learning with text only.

  2. Learning with comics supports retention of geographical knowledge better than learning with text only.

  3. Learning with comics intensifies environmental concern more than learning with text only.

  4. Learning with comics changes behavioral intentions based on geographical insights more effectively than learning with text only.

Methodology

In order to find out about the effects of learning with comics compared to learning with texts, we used a randomized pretest–posttest experimental design with two groups. We compared two independent samples using parametric and non-parametric methods. In line with our directional hypotheses, which are based on previous studies of learning with comics, we opted for one-tailed statistical testing at an α level of 0.05.
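To illustrate what one-tailed testing of two independent samples at α = 0.05 can look like, the following minimal sketch implements a Mann-Whitney U test with a normal approximation. This is an assumption for illustration only: the text above does not name the specific parametric or non-parametric tests used, and the scores, variable names, and the function `mann_whitney_one_tailed` are hypothetical.

```python
from statistics import NormalDist

ALPHA = 0.05  # significance level stated in the text


def mann_whitney_one_tailed(treatment, control):
    """One-tailed Mann-Whitney U test (H1: treatment scores tend to be higher).

    Sketch only: uses the normal approximation without tie or continuity
    correction, which is rough for small samples or many tied values.
    """
    n1, n2 = len(treatment), len(control)
    # U counts the pairs in which a treatment value exceeds a control value;
    # tied pairs count as half.
    u = 0.0
    for t in treatment:
        for c in control:
            if t > c:
                u += 1.0
            elif t == c:
                u += 0.5
    mean_u = n1 * n2 / 2
    sd_u = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u - mean_u) / sd_u
    # Upper-tail probability only, because the hypothesis is directional.
    p_one_tailed = 1 - NormalDist().cdf(z)
    return u, z, p_one_tailed


# Made-up scores, not data from the study.
comic_scores = [12, 15, 14, 18, 16, 17]
text_scores = [9, 11, 10, 13, 8, 7]

u, z, p = mann_whitney_one_tailed(comic_scores, text_scores)
reject_null = p < ALPHA
```

With a one-tailed test, the whole rejection region lies on the side predicted by the hypothesis, so a difference in the opposite direction can never lead to rejecting the null hypothesis.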

Participants

We recruited 80 6th and 7th grade students from three different schools in North Rhine-Westphalia, Germany, to participate in our research study. They were aged 11–15 (mean: 12.30, SD 0.91). One of the schools was a “Gymnasium”, a type of high school preparing learners for university in the long term. The second school was located in a social hot spot, with very diverse students from different backgrounds. The third school was a private school using the Waldorf pedagogy. The choice of different school types ensured a good representation of the population. The 80 students were randomly assigned to one of two equally sized groups prior to carrying out the experiment. Out of the 80 students, 8 did not participate due to absence from class during the time of the experiment, reducing the number of actual participants to 72. The experimental group (n = 36) received the comic as learning material, while the control group (n = 36) received a more traditional learning medium, in this case a text with the same content as the comic.
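A random assignment to two equally sized groups, as described above, could be sketched as follows. The student codes, the seed, and the function `assign_groups` are hypothetical; the text does not describe the mechanism that was actually used to randomize the list.

```python
import random


def assign_groups(student_ids, seed):
    """Shuffle the list of students and split it into two equal halves."""
    rng = random.Random(seed)  # fixed seed so the list can be reproduced
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"comic": shuffled[:half], "text": shuffled[half:]}


# Hypothetical anonymous codes for the 80 recruited students.
students = [f"S{i:02d}" for i in range(1, 81)]
groups = assign_groups(students, seed=2021)
```

Assigning all recruited students before the experiment means that absences on the day can still leave the groups unequal in principle; in this study both groups happened to end up with 36 participants.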

Tested material

In order to purposefully test our hypotheses, we specifically designed the comic (Online Resource 1) used for this experiment to teach the interdisciplinary nature and complexity of climate change with its intricate interdependency of human and environmental interaction. The story of the five-page comic starts in the city of Brasilia, depicting indigenous persons in traditional attire protesting against the destruction of the rainforest against the backdrop of the hypermodern Brazilian government buildings designed by Oscar Niemeyer. Afterwards, the story moves to the rainforest, discussing the water cycle. In the end, we see a perspective from space in the form of a LANDSAT satellite image map. The view from space opens up a different perspective on the forest and its surroundings, which are often turned into crop fields. The comic depicts spatial dimensions and global interrelations of climate change, its causes, its implications, and its solutions, by means of multimodal visualization devices which were integrated directly into the story flow. The comic used in this experiment (see supplementary material) is one of ten chapters in a comic atlas, utilizing maps, charts, and diagrams which are an integral part of the comic stories (for more information see https://aboutusclimate.org/). The maps and infographics are woven into the visual language of the comics (Fig. 1). The excerpt used for this study is particularly suited for teaching climate change, e.g. in the geography classroom, touching upon the meaning of evapotranspiration for the conservation of the Amazon rainforest, the meaning of the forest for the livelihood of its local inhabitants, and political and consumer decisions on a global scale as part of the solution.

Fig. 1 An excerpt of the five-page comic used in the study, showing the seamless integration of image, chart, and text (translated from German) elements into a single artefact (von Reumont and Knudsen 2020)

We carefully transformed the content of the comic story into purely textual form (Online Resource 2) for the control group, avoiding any loss of information in the process. A teacher revised the text, paraphrasing it to better comply with the age group. We reapplied these revisions to the textual parts of the comic to ensure maximum comparability. We confirmed the comic’s suitability for the age group prior to the experiment with a student from the target group who did not participate in the study.

Research instruments

Pre-, post-, and follow-up questionnaires constituted the research instruments. All questionnaires consisted of three main sections, to evaluate the three variables of understanding, environmental concern, and behavioral intentions (Table 1). For testing understanding of the content, we used nine open response questions, gradually rising in complexity, beginning with naming the location discussed in the material and ending with developing an argumentation from a certain point of view, reasoning about helping indigenous peoples to protect their home lands. Further questions touched upon evapotranspiration and its meaning for the water cycle in the Amazon rainforest, the forest’s meaning on local and global scales, what is threatening the forest, and the persons who are involved in the current conflict about the use of the land currently covered by rainforest. We assessed retention of knowledge in a follow-up test 1 week after the posttest with an identical set of questions. We added two questions at the beginning of the questionnaires for the posttest and follow-up test inquiring into general understanding of the story and retention of the content.

Table 1 Structure of the questionnaires used in this study (Items translated from German)

For testing environmental concern, we used and adapted a scale proposed by Cruz and Manata (2020), who identified it in a meta-analysis as the best option for measuring environmental concern. The original scale was developed by Schultz (2001, in Cruz and Manata 2020) and measures environmental concern on a Likert scale in the three dimensions of biospheric concern, egoistic concern, and social-altruistic concern. We deleted an item on marine life from the biospheric concern items, since marine life is not part of the content of the learning material. We added an item to the social-altruistic section asking for the participants’ concern about indigenous peoples, since they play an important role in the learning material. Additionally, we changed the prompt for each item from “I am concerned about environmental problems because of the consequences for…” to “I am concerned about climate change because of the consequences for…” (Table 1). We used a 4-point Likert-type scale with the options “strongly agree”, “agree”, “disagree”, and “strongly disagree”. We did not provide a neutral option, as we wanted to encourage opinionated answers.

We measured behavioral intentions with eight items on the same four-point scale we used for the items measuring environmental concern. In a first set of three questions, we assessed structural behavioral intentions, asking about the participants’ political opinions. The second set of questions in this section targeted individual behavioral intentions. Following Park (2020), here we added two response options (I already do this; Does not apply to me), which were treated as missing data. The items were worded following Sinatra et al.’s (2012) suggestion of using unambiguous, declarative statements free of jargon. According to Gibbons (no date, 6), “one way to increase predictive power of intentions is to make them more concrete”, so we were careful to use wording describing specific forms of behavior. In the prompt for each item, we used the term “I’m willing to…”, as suggested, for example, by Hermans and Korhonen (2017).

The posttest questionnaire additionally contained demographic questions and questions concerning comic reading habits and the learning experience with the material used in the study. We had validated the pretest questionnaire prior to the experiment with a student from the target group who did not participate in the study itself, to check for level of difficulty and time spent on the questions.

Procedure

The experiment was conducted at the schools between November 2021 and January 2022 for the period of one lesson, which usually lasts 45 min. In those cases where a lesson was timetabled to last longer than 45 min, the time difference was evened out by allocating more time to the personal introduction of the instructor and to the concluding part of the lesson, taking place after the experiment, without more information or more detailed instructions being given. Prior to the experiment, we had obtained informed written consent from the participants and their parents to take part in this experiment. The participants’ data were anonymized using individual codes known only to the participants themselves, which stayed identical throughout all three test phases. After a short greeting, the instructor explained the goal of the study and the procedure of the experiment. He handed out the pretest questionnaires, which the students completed without further assistance. The instructor collected the questionnaires and asked the students to gather at one end of the classroom. He then called each student by name and handed out the material according to the previously randomized list, allocating them to one of the two groups, together with the posttest questionnaire. This procedure meant a small break for the participants and a moment to relax after having taken the pretest. The instructor assigned students to seats on either side of the room so that the experimental and the control group were spatially separated to ensure independence of the groups. When everybody was seated, the students started to work with the material and filled out the questionnaires without any further assistance. At the end, the students turned in the questionnaires, together with the learning material. One week later, we conducted a follow-up test.

This procedure ensured that no external influences could be held accountable for differences between the two groups in the pre- and posttests, because no time was spent outside the controlled space of the classroom between reading and answering the questions. There was no change of teaching method or personnel, and very limited instructions, so that all participants were facing the same conditions. The follow-up test is not as reliable, as the possibility of informal exchange among the students existed. However, we instructed the school teachers not to discuss the material or related topics during the week leading up to the follow-up test.

We did, however, observe a fairly high degree of carelessness and insufficient effort while answering the questions. Students repeatedly interrupted their work or asked questions concerning grades. Although sanctions or rewards can effectively curb this kind of behavior (Gibson and Bowling 2020), for ethical reasons we did not apply any, because, after all, participation was voluntary.

Closure

After the completion of the follow-up test in the lesson a week after the experiment, the instructor handed out the comics to all participants. He conducted a lesson using the comic for an intensive discussion of the material and its content.

Assessment and rating

For the assessment of the open response questions, two reviewers independently rated any information presented by the participants in the questionnaires. To earn points, answers had to include at least one piece of information from a list of 36 items that had been established beforehand according to the information presented in the material. Each correct item earned a participant one point. Inter-rater reliability was determined for the posttest using Cohen’s κ and showed substantial agreement between the raters (κ = 0.701, p < 0.001; 88.9% agreement). This indicates a high reliability of the testing instrument. To reach perfect agreement on all items, the reviewers compared the lists, and where the results did not match they discussed the topic until they reached an equal judgement.
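For readers unfamiliar with Cohen’s κ, the following minimal sketch shows how the statistic relates observed agreement to the agreement expected by chance. The ratings are made up, and the function `cohens_kappa` is a hypothetical helper, not the software used in this study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters judging the same items."""
    n = len(rater_a)
    # Share of items on which both raters gave the same judgement.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n**2
    return (observed - expected) / (1 - expected)


# Made-up example: 1 = information item credited, 0 = not credited.
rater_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
rater_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]

kappa = cohens_kappa(rater_1, rater_2)
```

In this toy example the raters agree on 8 of 10 items (80%), while chance agreement is 50%, which gives κ = 0.6. This is why κ is lower than the raw percentage of agreement, as in the values reported above.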

For the assessment of the Likert scales for environmental concern and behavioral intentions, we coded the options from 4 (strongly agree) to 1 (strongly disagree) and entered them into an Excel sheet. To eliminate careless mistakes, we compared the results of the two raters and consulted the questionnaires where we found disagreement.
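The coding scheme can be sketched as a simple lookup, with the two additional response options for individual behavioral intentions treated as missing data as described under “Research instruments”. The English labels and the helper names are assumptions for illustration; the original items were in German and were entered into an Excel sheet.

```python
# Numeric codes as described in the text: 4 = strongly agree ... 1 = strongly disagree.
LIKERT_CODES = {
    "strongly agree": 4,
    "agree": 3,
    "disagree": 2,
    "strongly disagree": 1,
}

# Extra options offered for individual behavioral intentions, treated as missing.
MISSING_OPTIONS = {"i already do this", "does not apply to me"}


def code_response(answer):
    """Return the numeric code for one answer, or None for missing data."""
    if answer is None:
        return None
    label = answer.strip().lower()
    if label == "" or label in MISSING_OPTIONS:
        return None
    return LIKERT_CODES[label]  # raises KeyError on unexpected labels


def code_questionnaire(answers):
    """Code all answers of one participant, keeping item order."""
    return [code_response(a) for a in answers]


coded = code_questionnaire(["Strongly agree", "disagree", "I already do this", None])
```

Raising an error on unexpected labels, rather than silently coding them, makes transcription mistakes visible at the point of data entry.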

Preparation of data/statistical analysis

After observing the situations in the classrooms, it became clear that careless or insufficient effort responses (C/IER) were an issue while collecting the data. C/IER occurs in every survey and can be due to inattentiveness, fatigue, or other factors (Hong et al. 2019, 313). It clouds individuals’ actual capabilities, which can differ significantly from what was measured. It reduces reliability and increases error in the experiment (Ward and Meade 2023, 13.7). Curran (2016) strongly recommends the removal of such data following collection. “The removal of these invalid responders has been shown to reduce error and provide more valid results” (Curran 2016, p. 5). However, several researchers have reported that up to 50% of the data can be affected (Curran 2016; Ward and Meade 2023, 13.6).

In situations like this experimental setup, with a lengthy questionnaire and no consequences for careless responding, C/IER is very likely to occur, especially when participants feel that completion is mandatory and without rewards (Ward and Meade 2023, 13.6). Our participants were facing a quasi-test situation, but the test was not graded. Additionally, some researchers point out that writing is strenuous for children in that age group (Thürmann et al. 2015, 30f.). Participants lacking self-control are particularly prone to C/IER (Ward and Meade 2023, 13.7), which might apply especially to our rather young learners. C/IER can be mitigated to a certain degree by proctoring and partly by appeals to the participants to answer responsibly (Ward and Meade 2023, 13.15), which we did during our time in the classrooms.

To eliminate C/IER, we applied several measures. In the self-report surveys of environmental concern and behavioral intentions, we screened the questionnaires for visual patterns, as described by Ulitzsch et al. (2022). However, flagging straightlining was not applicable, because of the nature of our scales (see “Research instruments”, Table 1). It might not be uncommon to agree at the same level on different dimensions of concern. We removed one case with a diagonal pattern. Several authors recommend an outlier analysis using Mahalanobis distances (Ward and Meade 2023; Curran 2016). In this analysis, all items are considered for measuring outliers in a multivariate space. After applying Mahalanobis distances, we removed an additional case. Finally, for assessing C/IER in the content comprehension section of the questionnaires, we established a threshold below which we considered all cases as insufficient for our comparative study. It does not seem to make sense to look for differences within missing or invalid answers. Furthermore, we cannot assume that the consumption of certain media types influences concern or behavioral intentions if their content was not understood sufficiently well. Consequently, a person was included in the sample only if more than 20% of all items (> 7 out of the 36 items) were solved. After removing C/IER cases, our sample size was 36 (n_comic = 15, n_text = 21). We used this refined sample for all further statistical analysis.
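The idea behind the Mahalanobis screening can be sketched in a few lines: each respondent’s answer vector is compared with the sample mean, taking the covariance between items into account. The data below are made up, all function names are hypothetical, and the cut-off for flagging a case is deliberately left out, because the text does not state which criterion was applied.

```python
from statistics import mean


def covariance_matrix(rows):
    """Column means and sample covariance matrix (n - 1 denominator)."""
    n, k = len(rows), len(rows[0])
    means = [mean(r[j] for r in rows) for j in range(k)]
    cov = [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(k)]
        for i in range(k)
    ]
    return means, cov


def solve(matrix, vector):
    """Solve matrix * x = vector by Gaussian elimination with pivoting."""
    n = len(vector)
    a = [row[:] + [v] for row, v in zip(matrix, vector)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))) / a[i][i]
    return x


def squared_mahalanobis(rows):
    """Squared Mahalanobis distance of every row from the sample mean."""
    means, cov = covariance_matrix(rows)
    distances = []
    for r in rows:
        d = [v - m for v, m in zip(r, means)]
        z = solve(cov, d)  # z = cov^-1 * d
        distances.append(sum(di * zi for di, zi in zip(d, z)))
    return distances


# Made-up answers to two items on a 1-4 scale; the last respondent
# contradicts the otherwise positive relation between the items.
responses = [[1, 1], [2, 2], [3, 3], [4, 4], [1, 2], [2, 1], [3, 4], [4, 3], [1, 4]]
d2 = squared_mahalanobis(responses)
most_unusual = max(range(len(d2)), key=d2.__getitem__)
```

In this toy data set the last respondent has the largest distance, although neither of the two answers is extreme on its own. This is what distinguishes a multivariate outlier analysis from screening each item separately.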

Before statistical analysis, we transformed the raw scores for understanding, concern, and behavioral intentions from the nominal scale into values on a ratio scale using the Rasch model. Rasch analysis takes into account test item difficulty and uses it to weight test scores for each item. Participants who answered hard items (items which were answered by a small number of peers) get a higher rating for these answers. Comparing item difficulty can help detect differential item functioning (DIF) between different groups of participants (see Boone and Staver 2020, p. 173). If item difficulty stays the same for each test and for each group, the items clearly define their traits and we can assume that there is no bias. In this case, better test results are attributable to better performance by the participants rather than to anomalous item behavior. This ensures the validity of the test instrument. Following Luppescu (1991), we used graphical diagnosis for detecting possible DIF (Fig. 2). We compared item difficulty in the pretest from the control group with item difficulty in the pretest from the experimental group using control lines for a 99% confidence interval. All items fall into the corridor between the control lines, showing no DIF (Fig. 2a). The same is true for our plots of the control group’s pretest against their posttest (Fig. 2b), as well as the experimental group’s pretest against their posttest (Fig. 2c). In the plot of the control group’s posttest against the experimental group’s posttest (Fig. 2d), one item is flagged for DIF: this item has shifted, for the text group, towards a significantly easier value. Since the item was flagged only in one of the four plots and it is not very far from the confidence interval, we decided not to eliminate it from the analysis. We draw the general conclusion that our test instrument was reliable in all situations in the sense that items consistently defined the corresponding traits, with one negligible exception.

Fig. 2

Plotting differential item functioning within a confidence interval of 99% reveals bias and other item functionalities which might distort test results
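The logic behind such control lines can be sketched as follows. This is one common way to construct them, assuming that each item difficulty comes with a standard error and that an item is flagged when the difference between the two estimates exceeds the joint standard error scaled to the chosen confidence level. Whether the graphical procedure used here takes exactly this form is an assumption, and the item values and the function name `flag_dif` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def flag_dif(items, confidence=0.99):
    """Return the names of items whose difficulty differs between two groups.

    `items` maps an item name to (difficulty_a, se_a, difficulty_b, se_b) in logits.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 2.576 for 99%
    flagged = []
    for name, (d_a, se_a, d_b, se_b) in items.items():
        joint_se = sqrt(se_a ** 2 + se_b ** 2)
        if abs(d_a - d_b) > z * joint_se:
            flagged.append(name)
    return flagged


# Hypothetical difficulties and standard errors, not values from the study.
example_items = {
    "item_01": (-0.40, 0.45, -0.10, 0.42),
    "item_02": (1.20, 0.50, -0.90, 0.48),
}
flagged_items = flag_dif(example_items)
```

With small groups the standard errors are large, so the corridor between the control lines is wide and only substantial shifts in difficulty are flagged.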

Once correct item functionality is confirmed, we can proceed to make the participants’ pre- and posttest performance comparable. In the Rasch model, the final test result for each participant is expressed in logits, assuming a value usually between − 3 and + 3 on a ratio scale. To put the logits for the pre- and the posttest into the same frame of reference, following Linacre (2011), we fixed the values for item difficulty at the posttest level and used them to evaluate performance in the pre-, post-, and follow-up tests alike, instead of recalculating new item difficulties for each test stage. Using the anchored Rasch logits, we can now reliably calculate the knowledge gained between the pre- and posttests by subtracting the Rasch pretest values from the Rasch posttest values. To measure retention we subtracted the follow-up measurements from the posttest measurements.
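As a small illustration of the logit scale and the gain score described above, the sketch below uses the standard dichotomous Rasch model, in which the probability of a correct answer depends only on the difference between person ability and item difficulty. The ability values are hypothetical and the function names are ours; the anchoring itself was done in Winsteps and is not reproduced here.

```python
from math import exp


def rasch_probability(ability, difficulty):
    """Probability of a correct answer under the dichotomous Rasch model (logits)."""
    return 1.0 / (1.0 + exp(-(ability - difficulty)))


def knowledge_gain(pre_logit, post_logit):
    """Gain between two measurements expressed on the same anchored logit scale."""
    return post_logit - pre_logit


# Hypothetical person measures in logits, not values from the study.
pre_measure, post_measure = -0.8, 1.1
gain = knowledge_gain(pre_measure, post_measure)

# Chance of solving an item of average difficulty (0 logits) before and after.
p_before = rasch_probability(pre_measure, 0.0)
p_after = rasch_probability(post_measure, 0.0)
```

The subtraction is only meaningful because both measures refer to the same fixed item difficulties. Without anchoring, each test stage would define its own zero point.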

For further investigations into differences in the way the two groups understood the content, we analyzed linguistic expressions used in answering the open response questions. In this rather qualitative approach, we categorized each statement according to its focus on human stakeholders’ activities. An example of the human-centered category is: “The [indigenous peoples] want to protect their rainforest.” An example of the non-human-centered category is: “[The story is about] the South American rainforest.”

For the Likert scales assessing environmental concern and behavioral intentions, Rasch analysis identifies the easy-to-endorse items in comparison to those that are rather hard to agree with. This results in better comparability between participants and more reliable results of overall rating, because the Rasch model numerically defines the distance between the points on the Likert scale (e.g. between “agree” and “disagree”) on a ratio scale. The results of the participants’ ratings are expressed in logits, which we used for any further calculations and statistical analysis. For reasons of comparison, we anchored the item difficulty at the pretest level to detect changes between the times of data collection (pre-, post-, and follow-up tests).

For the Rasch analysis, we used the Winsteps software (v4.5.0.0). For calculation of effect sizes and test power, we used G*Power (v3.1). For all further statistical analyses, we used SPSS (v28).

Results and discussion

The results of the statistical analysis are summarized in Table 2. For each of the four variables of understanding, retention, environmental concern, and behavioral intentions, we compared the means of the control group and the experimental group. The experimental group received a comic to learn about aspects of climate change. The control group received a text with the same content. With the comparison, we want to answer the question of whether there are advantages in learning with comics according to our hypothesis.

Table 2 Summary of the results of the statistical analysis of understanding content and retaining it, and changes in environmental concern and behavioral intentions after the treatment

Understanding

A t test for two independent samples was performed to compare understanding of content in the comic and text groups. There was a significant difference in content comprehension between the group learning with comics (M = 3.421, SD = 1.632) and the group learning with text (M = 2.172, SD = 1.219); t(34) = 2.631, p = 0.006. We rejected the null hypothesis that the means of the two populations were equal. Our results show that the group learning with comics gained significantly more knowledge (57.5% more on average) than the group learning with text. Due to the relatively small sample size and the not quite equal group sizes, prior to performing the t test, we performed the Shapiro–Wilk test (W) for non-normality. The null hypothesis of this test is that the sample comes from a normally distributed population. The test did not show evidence of non-normality for the distribution of our measurements of knowledge gain in the participants (W = 0.966, p = 0.321). Based on this outcome and after visual analysis of the stem-and-leaf and Q–Q plots, we did not reject the null hypothesis of normality and decided to run the t test with our data, which is a parametric test suited to normally distributed samples. We also tested the data for homogeneity of variances (Levene test: F = 2.658, p = 0.112), which allowed us to assume that the variances of the two samples are equal (the null hypothesis of equal variances was not rejected). For our variable of understanding, the post hoc power analysis revealed a value of 1 − β = 0.807 for a calculated effect size of Cohen’s d = 0.867 in a one-tailed design with an α-level of 0.05. Although we are aware of the controversy surrounding post hoc power analyses, we nevertheless wanted to appraise the post hoc situation with a substantially reduced sample size after removing careless and insufficient effort responses (“Preparation of data/statistical analysis”).
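The reported test statistic can be retraced from the summary values alone. The sketch below assumes the classic pooled-variance t test for independent samples, that the reported SDs are sample standard deviations, and the group sizes given for the refined sample (ncomic = 15, ntext = 21). The function name `pooled_t` is ours.

```python
from math import sqrt


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t statistic and degrees of freedom of the equal-variance two-sample t test."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Summary statistics for knowledge gain as reported in the text (logits).
t_value, df = pooled_t(3.421, 1.632, 15, 2.172, 1.219, 21)

# Relative advantage of the comic group over the text group.
relative_gain = (3.421 - 2.172) / 2.172
```

Under these assumptions the statistic comes out at about 2.63 with 34 degrees of freedom, and the relative advantage at about 0.575, in line with the values stated above.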

Analysis of differences in item difficulty from pre- to posttest

For a deeper understanding of the learning process, we were interested in finding out exactly where the strengths of learning with comics lie. Looking at the results of our Rasch analysis, we observed several differences in the participants’ improvement by sub-topics. Rasch analysis allows the assessment of item difficulty. For comparing knowledge gain, we used the posttest item difficulty as a frame of reference to rate both pre- and posttest. The difference between the anchored item difficulty and the item difficulty had it not been anchored (“displacement” in Rasch terminology) tells us how much easier items have become after the intervention. Figure 3 shows the 12 items in which participants have improved the most after the treatment. We can see that, after reading the comic, it was much easier for the learners to solve six items related to stakeholders, e.g. to identify big companies as a threat to the forest (“What threatens the rainforest?”). The comic has further helped students to improve in tasks where a global perspective was important (“What is the meaning of the rainforest to the people worldwide?”) and to understand simple arguments concerning the role of the rainforest for indigenous peoples (“Why should we help the people living within the rainforest?”). In contrast, using text for learning helped the students most efficiently to solve seven items related to the natural sciences, e.g. to explain evapotranspiration. Furthermore, the text helped the learners to name Brazil most efficiently, as referred to in the material, and to explain processes in terms of natural science, like the water cycle.

Fig. 3

Characteristics of improvement in the experimental (blue) and the control (orange) groups considering the most improved (30%) items

Per-item analysis of performance between groups

These findings are corroborated by the comparison between the two groups based on the posttest performance for each single item. Table 3 shows the significant differences for items in direct comparison between the groups using the χ² (chi-square) test. Here, we can see that the participants in the comic group have consistently outperformed the participants in the text group when focusing on stakeholders and global perspectives. Additionally, we can see that the comic readers were significantly better at answering an item concerning the water cycle, relating to the natural sciences. This shows that although the text was most efficient in conveying content related to the natural sciences as compared to other topics, this does not mean that it was necessarily more efficient than the comic in doing so. This item and the explanation of evapotranspiration were the two science-related items in which the learners reading the comic improved most. The text group performed better in naming fire as a threat to the rainforests, an item which was not related to stakeholders.

Table 3 Results of the χ2 analysis of the variables medium (comic, text) and performance per item (0, 1) showing only the significant results

Analysis of linguistic expressions used by the two groups

We wanted to further investigate the differences between the two groups in their ways of relating to the content by analyzing their linguistic expressions in the open answer questions. We found that the extent to which they focus on human stakeholders is highly dependent on the medium used for learning (χ²(1) = 9.468, p = 0.002). The participants who learned with comics tended to express themselves in the active rather than the passive voice and to put human action into the focus of their answers. They also tended to relate to non-human actors in a human-like way, e.g. stating that “the forest breathes in O2 and breathes out CO2”, an answer based on a phrase found in both materials. In comparison, the participants learning with text expressed themselves using more impersonal language, in the passive voice, without naming stakeholders.
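The p-value reported for this test follows directly from the test statistic. For a chi-square statistic with one degree of freedom, the upper-tail probability can be written with the complementary error function, so no statistics package is needed. The sketch below only retraces the reported numbers; the underlying counts of statements per category are not given in the text and are not reconstructed here, and the function name is ours.

```python
from math import erfc, sqrt


def chi2_p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with one degree of freedom."""
    return erfc(sqrt(chi2 / 2))


# Reported statistic for medium (comic, text) by focus on human stakeholders.
p_value = chi2_p_value_df1(9.468)
```

Rounded to three decimals, the result agrees with the reported p = 0.002.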

Retention

The results presented in this section are based on the questionnaires we handed out one week after the posttest. Conducting a t test on the data showed a significant difference in content retention between learning with comics (M = 0.578, SD = 0.565) and learning with text (M = 0.196, SD = 0.565); t(28) = 1.76, p = 0.045. We rejected the null hypothesis that the means of the two populations were equal. Comparing the means, we can see that the comic group retained 50.1% more content than the text group. We see this as a slight hint of better memorization of content using comics compared to text, although the differences are not as pronounced as for the variable of understanding. We justify the one-tailed t test on the grounds that it does not seem to make much sense to assume that a better understanding of a media type should result in worsened capacities for memorizing its content. Indeed, we found a medium-sized correlation between understanding and retention (Pearson’s r = 0.417, p = 0.022, two-tailed α-level of 0.05, n = 30). The Shapiro–Wilk test (W) gave no evidence against the null hypothesis of normality. The Levene test showed no evidence that the null hypothesis of equal variances could be rejected (F = 0.073, p = 0.789). Thus, assuming normality and homogeneity of variances, for our variable of retention we chose the t test for a comparison of the means of our independent samples. The effect size (Cohen’s d = 0.644) is between medium and large, with a test power of 0.524 according to the post hoc analysis. This loss in effect size and test power compared to the variable of understanding may be due to a further reduction of participants in the follow-up test, which took place a week after the experiment. Many participants were absent from class because of sick leave.

Environmental concern

The Shapiro–Wilk test showed evidence of non-normality for the distribution of our measurements of growth of environmental concern in the participants (W = 0.928, p = 0.043), which is why we used the one-tailed non-parametric Mann–Whitney U test for the comparison of the distribution of the two groups. It shows no significant difference (U = 83.500, p = 0.132, one-tailed) between the group using comics and the one using text for learning (comics: n = 13, mean rank 17.58; text: n = 17, mean rank 13.92). We could not reject the null hypothesis that the distributions of both populations are identical. However, we did detect a moderate correlation between understanding and environmental concern (Pearson’s r = 0.400, p = 0.029, n = 30, with a two-sided α-level of 0.05), showing that the more the learners understood of the content, the more concerned they were about the environment in general. The coefficient of determination (R²) tells us that understanding and environmental concern share 16% of their variability, meaning that other variables are responsible for the remaining 84%.
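The link between the reported mean ranks and the U statistic, and between r and the shared variability, can be retraced with a short calculation. The sketch uses the standard identity U = R − n(n + 1)/2, where R is a group's rank sum; because the mean ranks are rounded to two decimals, the result matches the reported U only approximately. The function name is ours.

```python
def u_from_mean_rank(mean_rank, n):
    """Mann-Whitney U of one group, recovered from its mean rank and size."""
    rank_sum = mean_rank * n
    return rank_sum - n * (n + 1) / 2


# Mean ranks and group sizes as reported in the text.
u_comic = u_from_mean_rank(17.58, 13)
u_text = u_from_mean_rank(13.92, 17)
u_statistic = min(u_comic, u_text)  # the smaller value is the one usually reported

# Share of variability common to understanding and environmental concern.
shared_variance = 0.400 ** 2
```

The two U values add up to roughly 13 × 17 = 221, as they must, and the smaller one lies close to the reported 83.5. Squaring r = 0.400 gives the 16% of shared variability mentioned above.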

Behavioral intentions

In our experiment, we were unable to reproduce the results observed in other studies in terms of the comic’s influence on the participants’ willingness to change their behavior for the protection of our climate (behavioral intentions). The statistical analysis did not indicate significant differences between the groups (comic: n = 13, M = − 0.039, SD = 0.610; text: n = 20, M = − 0.031, SD = 0.67; t(31) = 0.305, p = 0.381). We decided to use the t test, assuming normality of the distribution (Shapiro–Wilk test: W = 0.968, p = 0.437) and homogeneity in variance (Levene test: F = 0.179, p = 0.875). We could not reject the null hypothesis that the means of the two populations were equal.

Summary and general discussion of results

We detected significant advantages of learning with comics as compared to learning with more traditional media like texts. Most pronounced were the advantages in the case of understanding content. We found a large effect size (Cohen’s d = 0.867), which makes the outcomes of our statistical analysis quite plausible, despite a limited sample size. We excluded confounding variables as far as possible by letting pretest, intervention, and posttest immediately follow each other. In the context of previous studies detailed above, we interpret this as good evidence that learning about climate change can be much more effective with comics than with traditional media like plain text. One participant mentioned in the discussion after data collection that it was very easy to find certain pieces of information again, because the visual structure of the comic’s panels and the matching of content to images assisted searching. The combination of text and picture can offer more ways to construct a mental model than just one medium alone (Schnotz 2021).

There are also hints at comics improving retention. Although the effect size was not as pronounced as in the case of understanding the content, it is fairly safe to assume that a better understanding will help in retaining content (Pearson’s r = 0.417, p = 0.022, n = 30, with a two-sided α-level of 0.05). This moderate correlation leaves room for the assumption that other factors played a role as well. Apparently, the combination of text and picture does not only support working memory (Schnotz 2021), but also affects long-term memory.

Environmental concern was positively influenced by the level of understanding the content. Since our comic facilitated understanding the content, indirectly it should have a positive influence on environmental concern as well. However, we did not detect any direct statistical evidence of this. As the moderate correlation implies, other factors may influence environmental concern more than a single read of a comic. These may include previously acquired attitudes and values, and individual socialization.

Since there was no pronounced increase in environmental concern, we are not surprised that we could not find evidence of the comic having influenced behavioral intentions.

The success of learning with comics is often attributed to children having fun reading comics. However, in our sample, there was no difference between the two groups regarding the fun they had working with the material (Mann–Whitney U = 75, p = 0.667, with a two-sided α-level of 0.05). This is why we cannot attribute advantages in learning with comics to a possible motivational effect stemming from fun working with comics. This is plausible, because the story makes up a great part of the reading pleasure. If readers find the story boring, it will not help that it is presented as a comic. Neither could we detect a particular inclination towards comics among the young participants. We could not detect any correlation between understanding and comic reading habits (rτ = 0.052, p = 0.729), nor between understanding and fun working with the material (rτ = − 0.030, p = 0.849), using Kendall’s τ with a two-sided α-level of 0.05. The same is true for retention and comic reading habits (rτ = 0.146, p = 0.391), and retention and fun working with the material (rτ = − 0.069, p = 0.697), as well as for environmental concern and comic reading habits (rτ = − 0.143, p = 0.380), and environmental concern and fun working with the material (rτ = 0.100, p = 0.560). This suggests that we should rule out the presumed affection of children for comics as a factor in learning with them. Even if a child is not particularly fond of comics, they can still profit from the comics form. The advantages of learning with comics lie in their support of the learners’ cognition and not so much in their motivational drive to engage with the material.

Limitations

Due to the unforeseen impact of careless or insufficient effort responding (C/IER), our sample size was substantially smaller than we had anticipated. We might have run the risk of removing data that were not affected by carelessness. A similar study with more participants would probably reduce the effect of C/IER. We would have needed a much larger initial sample size to increase test power in the statistical analysis. Furthermore, we conducted one round of data acquisition at a fairly late time in the participants’ school day. This may have negatively affected their motivation or ability to concentrate and produce meaningful results.

The use of additional material in the form of illustrated text as a third medium in between comic and text might have resulted in a more nuanced research outcome. We were only able to test a single comic targeted at specific topics. Other comics might have shown different results.

In future research, a thorough investigation of teaching methods with a focus on reflection could be insightful. Lux and Budke (2023) developed such a model for video games which might be applied to comics. It comprises reflection on content, system, the self, and the medium. Further research may also concern digital comics. They allow for more interaction with the material than the analog printouts used in this experiment. Multimedia content such as audio or animation can also be integrated in digital comics. Longer form comics would be interesting to research too, since they offer the opportunity to show more and diverse locations. The present study focuses on only one of many ways to visually communicate issues of sustainability. Other forms of visual communication should be considered in future research.

Conclusion

We were able to show in an experimental research design that learning with comics can be much more effective than learning with more traditional media such as text. This is in line with previous studies concerned with comics and other forms of multimedia learning (e.g. Wang et al. 2019; see “Introduction”). The learners using a comic had a significantly better understanding of the content than those learning with text. Perhaps not surprisingly, this effect still shows a week later. Here, comic users prove to have better retention of the content than text users.

The effect on environmental concern is not so clear. Although we cannot register a statistically significant increase of concern when learning with comics, a better understanding of the problem at hand positively influences environmental concern.

We could not confirm any differences in behavioral intentions, in contrast to some previous studies. As Hornsey et al. (2016, p. 624) state, the positive link between the acceptance of climate change and behavioral intention to mitigate it is “intuitive, but only small to medium in size”. Presumably, there are many more influences from outside the experiment which play a greater role in forming resolutions, such as family background (e.g. one participant was born into a proud family of butchers, so reducing meat consumption was out of the question), consumer preferences, or general political beliefs. A comic seems to be just one piece of the puzzle here.

Motivational effects like being fond of comics did not influence learning outcomes in understanding, retention, environmental concern, or behavioral intentions. Ruling this out, positive effects probably result from the comic’s support of cognitive processes, as described, e.g. by Fiorella and Mayer (2021b) and Schnotz (2021) for other non-comic-related multimedia learning material supporting text-image-combinations and the spatial contiguity principle, which states that text and image should be situated spatially close together in the page layout to have a positive effect on cognitive performance.

However, we do not yet know to what extent we can transfer our findings to comics treating other topics. Furthermore, it would be interesting to investigate the role of images in improved retention. The comic was equipped with special techniques to guide students through the maps that were used in it. In further research, it would be interesting to see whether these specific techniques made a difference in map reading capabilities. In an eye tracking experiment, we found that the map does play a crucial role in understanding geographical content in comics (von Reumont and Budke 2020). A more visual rather than text-based testing instrument would be required to assess this, e.g. a test item that demands marking phenomena on a map.

In summary, we strongly recommend the use of comics for teaching about issues related to climate change and its impacts on society. In comparison to traditional media such as text, comics can offer a more effective way of understanding and retaining content and of reading it empathically.

We propose to use science comics to teach about climate change, because they seem to be able to represent complex environments in their combination of text and image. They also support multimodal storytelling, being able to seamlessly incorporate charts, diagrams, and maps, as well as other relevant media. Our study shows that a big advantage of learning with comics lies in their capacity to illuminate stakeholders’ actions and reasoning in a somewhat holistic account of the interplay between humans and their environment on both local and global scales. Textbooks most often neglect the perspectives of stakeholders in their goal of recounting “neutral” facts (e.g. for German high school textbooks, see Lütje and Budke (2022)). Comics can effectively fuse descriptions from social and natural sciences, making them particularly suitable for climate change education in the sense of ESD. In contrast, a particular strength of text as a medium seems to be in conveying science-related content. Perhaps we can foster the best learning effects with a combination of media, for instance by using both texts and comics at the same time. Our lessons in the aftermath of the experiment demonstrated that non-quantifiable aspects of learning with comics are invaluable as well. We definitely recommend guiding learners through the material to get the most out of the comics. Guiding questions can address image composition, the disposition of characters, or contrasts between text and image, which are just some of the meaningful forms of expression that we can observe and use in comics.