Working with texts in deliberate ways is a foundation for supporting students’ subject learning, and their potential to demonstrate their knowledge in the subject. Therefore, it is central for all teachers to understand how texts work, and at the same time to be able to support their students’ competencies regarding how texts are designed and how to make meaning through texts. Thus, teachers in all school subjects have a responsibility in this regard, and teachers of the mother tongue cannot be expected to be the only ones responsible for developing students’ text competencies. Instead, teachers in all subjects need to deepen their own text competence. We hope that this book can be a support in doing so.

Since text is the foundation of this book, and since the concept is used in both everyday language and academic language, the concept of text will be our point of departure in the following.

An everyday understanding of text could perhaps be formulated as “written words on a paper or presented on a screen”. Academic definitions of text include such connotations, but research over the last decades has contributed to a more nuanced understanding of what a text is and how texts work. Therefore, the text concept has been extended beyond “written words on paper” and now includes other resources for meaning-making, such as image, speech, gesture, and so on. Each one of these resources can be seen as a text, provided that it meets the other criteria for a text, such as consisting of a delimited message that can be understood in a specific context. Thus, in research and education today, we often deal with what is commonly termed an extended text concept.

An extended text concept implies that texts can consist of a variety of communicative resources that form a joint entity. Examples are written words combined with various forms of illustrations, or a spoken text combined with gestures. Such texts are multimodal, and they are the focus of this book. We will discuss the term multimodal more thoroughly below.

Since “text” today can involve so much more than just written words, it is important always to make clear how the text concept is used. Here we want to point out that there is no one correct definition of text. Instead, depending on the context or purpose for using the term “text” in a specific situation, the definition can vary.

In this book, we restrict ourselves mainly to analyzing and discussing paper-based or digital visual texts, including all resources integrated in these (such as images, animations, or audio clips). Also, our focus is on pedagogic texts, that is, texts intended for more or less formal learning situations such as textbooks or Internet resources produced for educational purposes. Of course, texts that have not primarily been produced for educational purposes are also used in schools. We claim that our discussions of multimodal texts, as well as our model for working with such texts, function well for any kind of text. However, there are good reasons to base an exposition like this on texts that are typically used in education. Teachers’ and students’ oral texts will only be commented on briefly, not because we are not interested in them, or because we find them unimportant, but for delimitation reasons. We will also look more closely at a couple of texts made by students, in order to discuss multimodal aspects of students’ text production.

The aim of this book is to provide the reader with an analytical framework for multimodal texts, and to suggest ways of working with multimodal texts in classroom practice. In Sect. 4.5, Model for working with multimodal texts in education, we present a model for working “hands-on” in classroom practices. Based on a number of examples from real texts, we guide the reader through the model, suggesting ways of approaching texts so as to note more readily how the texts are constructed and what might need specific attention in the classroom. In the second part of the book, the model is applied to a number of texts from different subject areas, intended age groups, and countries. In relation to these text analyses we also discuss classroom implications.

The texts in the second part of the book have been selected so that the model is applied to a variety of school subjects and to material from different countries. Apart from Sweden, which is where we live and work, text examples have been taken from other countries, such as Portugal, Singapore, and Chile. The examples also span different intended age groups, although most of the texts are produced for students in upper elementary (grade 4–6) and lower secondary school. Factual texts geared to younger students are usually quite similar to those produced for grade 4–6 students, although they contain less written verbal text and their illustrations can be less complicated. Therefore, the examples chosen can also serve well for teachers working with students in lower grades. Correspondingly, texts for lower secondary school students can be seen as parallel to texts used in upper secondary school, even though texts for the later grades will be more demanding.

The majority of the texts come from printed textbooks that are relatively widely used in the respective countries. For example, the Singaporean textbooks are obligatory in Singapore, and the Swedish textbooks are relatively widely used in Sweden, where each school decides what textbooks to use. The choice to focus on printed media is connected to the fact that printed books are still important in many schools. Also, many of the digitally produced textbooks today are still mainly based on printed versions of the texts, though in the near future we will probably see changes in this regard.

It is important to stress that the kind of close reading we perform in relation to our analyses is not done to draw attention to limitations or negative sides of the texts. Instead, our intention is to point out challenges that can arise in relation to any multimodal text, something which also justifies conscious ways of working with texts in all school subjects.

1 Multimodal Texts

As mentioned, an extended text concept is widely used today, and more and more researchers and teachers are embracing such a text concept, understanding that all texts are multimodal, that is, they are composed of a variety of sign systems, or semiotic resources, such as words, diagrams, graphs, photographs, or various kinds of symbols (e.g. Kress and van Leeuwen 2006; see also Berge 2001 for a discussion of the text concept). It is also common to use various kinds of graphically marked text objects, such as headings or lead paragraphs (ingresses). Web pages, magazines and the daily press are filled with such resources. Particularly in educational texts, there is an abundance of graphically marked text objects for different purposes, such as text boxes and bullet lists.

Thus, a number of features make texts multimodal, not only the inclusion of images. The layout and the ways in which different parts of the text carry meaning are also important to understand. Potentially, any part of a text can carry meaning in specific ways. Sometimes the information across different text elements is contradictory; sometimes the elements complement each other. At times verbal text (i.e. the written words) is the most important semiotic mode; at other times it is the image. Each page thus offers different challenges for text interpretation, not least for students who are expected to make meaning of complex content from the text.

It is quite common that students in lower grades are encouraged to express themselves in both images and words. A number of studies, however, have revealed that it is unusual for teachers to comment on what is expressed in these images. This is true of narrative texts as well as more descriptive, factual texts. Usually the teachers regard them as some kind of ornament, focusing on aesthetic aspects of the image at a quite basic level, such as commenting on the image as, for instance, “nice”, even when the images indeed carry specific and important meanings (e.g. Peled-Elhanan 2015). The use of such general comments might be explained by the fact that many teachers experience a lack of tools for giving constructive feedback in regard to other aspects of the images, and for multimodal aspects of texts. We hope that this book will provide such tools. Since our discussions and analyses are based on the relations between text design (e.g. Kress 2010) and content, aesthetic aspects will be disregarded in the following.

2 A Historic Perspective on Multimodality

During different historical eras, people have used various resources in their communication. In ancient Greece and Rome, speech was the predominant resource for communication. Rhetoric, which was the analytical and educational instrument of that time, highlighted that speakers needed to make the right choices when addressing the listener, and to shape their speeches to suit the specific purpose. Also, according to rhetoric, the speaker could utilize a number of resources other than speech, such as more or less formalized gestures to underline certain aspects of the speech, variations in tempo and intonation, or placement in the room.

When people started to use written language in different contexts, the function was to keep account of money, to make new laws, to describe historical events, or to celebrate a certain person, for example a Roman emperor who had won a battle, or a Nordic Viking who had traveled far away and gained a fortune. During the Middle Ages, a system for large-scale copying of handwritten biblical texts was developed. Here, verbal text was combined with different forms of illustrations. In addition, the interior of the medieval church is an example of a multimodal text, where biblical stories were illustrated on the ceiling, on the pulpit, or as carved images on the baptismal font.

The development of the printing press led to the possibility of mass distribution of texts. When a market for books and papers arose, information could be stored and spread to a great many more people than had previously been possible. This situation also challenged different kinds of power relations in society. The printed word also repositioned the collective memory, and when public schools were introduced, the printed word had a special status concerning learning. To be knowledgeable was more or less equivalent to being able to memorize printed words.

Well into the twentieth century, school textbooks in Western society were based on writing, and they were quite often expressed in a narrative style. These written texts were sometimes illustrated with fact-based images or reproductions of paintings. From the middle of the last century, however, we have seen a change in the relation between writing and image, and image has become more salient and important in pedagogic texts (e.g. Bezemer and Kress 2008).

Thanks to the breakthrough of digital media at the start of the new millennium, various kinds of educational games and applications for digital learning have come into play in education, at the same time as mobile phones and tablets can be used in more advanced ways (Arnseth et al. 2018; Devlin 2011; Egenfeldt-Nielsen et al. 2011; Lindstrand et al. 2016; Selander 2008; Squire 2011; Steinkuehler et al. 2012).

To sum up, writing and image have been used in different ways through history, and different communicative resources have played different roles in different kinds of texts. Today, a plethora of media are used in text production, and texts generally contain a variety of semiotic resources. Hence, it is important to pay attention to what kind of information is given through speech, writing, or image. To put it another way: in what medium and through what semiotic resources is content expressed? Not least from an educational perspective, it is crucial to be able to discern how information is expressed and to realize how students are given opportunities to use different media and semiotic resources, in order to learn and to show their knowledge (Insulander et al. 2017). As a consequence, the forms of assessing and testing knowledge in educational contexts need to be developed in ways that allow for more resources than the written or spoken word.