1 Introduction

In the 1990s, the New London Group introduced the concept of multimodality as a theoretical perspective. This perspective suggests that contemporary texts convey meaning through different modes or modalities, which are socially, politically, and culturally shaped resources for meaning-making (Jewitt & Kress, 2003; Kress, 2010; Serafini, 2015). Broadly speaking, the New London Group suggested five modes of communication for meaning-making—linguistic, visual, spatial, aural, and gestural. These modes can be represented in digital or paper formats or live performances (Kress, 2010). Children use different modes to make meaning when they meet and interact with texts such as picture books, memes, digital apps, recipe cards, persuasive digital advertisements, and live puppet shows. Since the landmark work of the New London Group, research has studied how modes work to create meaning, along with their associated resources, for example, shot distance and angles (Callow, 2020), digital image and audio (Zammit, 2015), colour, types of lines, and framing (Pantaleo, 2021), and music, mood, image, and language (Barton, 2021; Barton & Unsworth, 2014). What is clear is that conveying meaning is no longer limited to written language; it spans disciplines such as art and film and design and technology (Serafini, 2022, 2023).

The texts that children engage with and that are part of their everyday home, school, and community lifeworlds are complex, as are the digital media technologies used to create them, particularly when compared with paper-based written texts. Technology aids the integration of different modes to communicate a message, idea, feeling, or perspective. It enables the design and creation of a multimodal text. An example is the movie trailer for the prequel movie Wonka, which explores the origins of Roald Dahl’s beloved character, Willy Wonka (Warner Bros. Pictures, 2023). This multimodal text allows the exploration of modes and their associated resources, including, but not limited to, colour, lighting, symbolism, sound effects, music, facial expressions, and body movement. It also allows the exploration of a metalanguage to describe and explain different meanings across text, image, space, body, sound, and speech. The development of a metalanguage was first formalised by Kress and van Leeuwen (2006) in their grammar of visual design. Such a grammar, or metalanguage, has been applied to various types of texts, including children’s picture books (Callow, 2020; Serafini & Reid, 2022), video games (Lowien, 2016; Wildfeuer & Stamenković, 2022), and films (Barton et al., 2021; Noad & Barton, 2020).

A formal metalanguage associated with different meaning-making modes is especially pertinent to us as academics specialising in English, language, and literacies in Initial Teacher Education (ITE) in Australia. The ITE curricula at our respective tertiary institutions focus on evidence-based theory and practice around working with texts and their different meaning-making modes. In our lecture and tutorial work with pre-service teachers, we use a variety of texts such as picture books, information texts, readers’ theatre, poems, movie trailers, memes, influencer Instagram posts, and digital advertisements to explicitly model and teach a formal metalanguage to describe and analyse how modes and their associated resources work to create meaning. However, the question for us is whether practising teachers also use this formal metalanguage with children in their everyday classroom talk, pedagogy, and literacy practice. This question is significant if we want teachers to explicitly teach children to be critically aware of the effect of texts’ messages in their daily lifeworlds (Harrington, 2016; Serafini, 2023).

This paper shares the findings of a small-scale study conducted in Australia from late 2020 to mid-2022. In this study, we explore the use of language by eight early years and primary teachers as they talk about multimodal texts in their everyday literacy practice. We are interested in what metalanguage was used by the teachers when discussing multimodality and multimodal texts in their everyday literacy practice, rather than how they used it. This is relevant considering multiple modes of communication are reflected in Australia’s English curricula (Australian Curriculum, Assessment and Reporting Authority [ACARA], 2023a). The paper begins by situating the study in literature around multimodal texts and how these texts are reflected in Australia’s English curriculum. Next, we look at the curriculum’s inclusion and exclusion of a formal metalanguage to describe and talk about linguistic, visual, audio, gestural, and spatial modes. The research design and methodology are then presented, followed by a discussion of the analysis conducted using Fairclough’s Critical Discourse Analysis (CDA). Finally, the paper presents the study’s findings and discusses implications. This study adds to existing research that calls for a formal metalanguage across all meaning-making modes to help children comprehend, create, and analyse text meanings.

2 Multimodal texts as a context for multimodal literacy

Multimodal texts combine two or more communication modes (Jewitt & Kress, 2003). One or two modes might sometimes dominate a text, although multiple modes often combine to communicate a message. As children move from engaging with print-based to digital or screen-based texts, the range of possible modes expands from static words and pictures to music, sound effects, and moving images (Bull & Anstey, 2019; Tour & Barnes, 2021; Unsworth & Chan, 2009). Picture books, movies, digital billboards, printed brochures, an influencer’s post on Instagram, a TikTok sporting challenge, and a YouTube clip of Horrible Histories are examples of multimodal texts where children encounter two or more meaning-making modes. As children engage with these kinds of texts in their homes, schools, and communities, they experience ways to comprehend, create, and critically analyse content. This acknowledges that literate practice is more than an ability to work with print and create written texts and that there is more than one way to understand content. Literacy practices, therefore, have always been, and will continue to be, multimodal (Kalantzis et al., 2019; Mills & Unsworth, 2017).

The most recent iteration of Australia’s national English curriculum document, Version 9.0, published on May 9, 2022, sets out its expectations for what Australian children should be taught (ACARA, 2023b). It continues its tradition of English knowledge being organised into three strands: language, literature, and literacy. These strands help children develop their listening, reading, viewing, speaking, and writing skills while improving their knowledge and understanding of English. Children use texts where written and oral meaning-making modes are mixed with other modes. The Australian Curriculum: English (AC: E) calls for children to be involved in speaking and listening, reading and viewing, and writing “oral, print, visual and digital texts, and using and modifying language for different purposes” (ACARA, 2023b). For example, in the early years of formal schooling, children are expected to “understand how language, facial expressions and gestures are used to interact with others” (AC9E1LA01) and “compare how images in different types of texts contribute to meaning” (AC9E1LA08) through to primary when children must “explore the effect of choices when framing an image…of still and moving images in texts” (AC9E4LA10) and “explain how the sequence of images in print, digital, and film texts affects meaning” (AC9E5LA07). What is clear is that the latest version of the AC: E continues to explicitly acknowledge that multiple modes of communication are fundamental to children’s “ability to learn at school and to engage productively in society” (ACARA, 2023c).

3 Formal metalanguage and the English curriculum

The Australian Curriculum Glossary from Version 8.4, reproduced for Version 9, discusses a mode as a resource for communicative means such as “listening, speaking, reading/viewing and writing/creating” (ACARA, 2023d). “A combination of two or more communication modes” defines a multimodal text (ACARA, 2023d). The glossary identifies and defines formal metalanguage for teachers to use when describing and talking about written and spoken words. For instance, there is a formal metalanguage for letters (e.g. grapheme, consonant cluster, segmenting, digraph), grammar (e.g. adverbial, verb groups, text connectives), sentence structure (e.g. complex, compound, tense), vocabulary (e.g. synonym, antonym, pun), and punctuation (e.g. semicolon, ellipsis) (ACARA, 2023d). To a slightly lesser extent, there is advice on formal metalanguage associated with visual meaning, such as salience, camera angle, and framing. The broad term “visual features” includes “placement, salience, framing, representation of action or reaction, shot size, social distance and camera angle” (ACARA, 2023d). However, these visual features are not significant enough to warrant individual definitions. The glossary does not include colour, gaze, reading pathways, vectors, symmetry, balance, and typeface.

Interestingly, the glossary includes limited formal metalanguage to comprehend, create, and critically analyse gestural, audio, and spatial ways of making meaning. Gestural is not defined, and the metalanguage of gestural meaning is lacking; only “body language” is mentioned (ACARA, 2023d). Gestural features such as facial expressions, gestures, or hand positions are overlooked. The glossary does not include a definition of audio but does include advice on “sound effects”. Pitch, tone, duration, and intonation do not appear. As with the gestural and audio modes, there is a lack of advice on formal metalanguage related to spatial meaning, with no mention of proximity, direction, distance, or physical arrangement. Compared to linguistic and visual modes, Australia’s English curriculum’s direction for teachers on a formal metalanguage for audio, gestural, and spatial modes is wanting. This is perplexing, given that the curriculum includes audio, gestural, and spatial formal metalanguage in English content descriptors, such as “language, facial expressions and gestures” (AC9E1LA01) and “feelings and preferences might be communicated in speech and gesture” (AC9EFLA02).

In the Australian Curriculum, gestural, audio, and spatial ways of making meaning are less explicit, possibly due to the focus on print-based literacy, which is easier to standardise and assess. Standardising and assessing these modes present numerous challenges due to the complexity of capturing and evaluating them uniformly. Gestural, audio, and spatial modes are highly subjective and open to interpretation. For instance, gestures can have different meanings in different cultures, influencing their interpretation (Gullberg & McCafferty, 2020). Unlike traditional written and verbal communication skills, which have clearly defined assessment criteria, gestural, audio, and spatial modes involve personal expressions and context-dependent nuances that complicate standardisation (Jewitt, 2008; Kress, 2010). Such challenges highlight the need for ongoing research and the development of frames to better assess gestural, audio, and spatial ways of making meaning. While they are less explicitly outlined in the Australian Curriculum, there is a growing push to integrate these modes more thoroughly to reflect the realities of contemporary communication and learning needs (Kalantzis & Cope, 2020; Walsh et al., 2021).

Evidence-based research indicates that all modes should be treated with equal explicitness and rigour in curriculum documentation (see Barton, 2021; Lim et al., 2022). Equal explicitness allows teachers to talk about and explicitly model how modes convey meaning in a text (Eisner, 1991). Additionally, a formal metalanguage permits a greater depth of meanings to be conveyed, especially when multiple modes are involved (Jewitt et al., 2016). Without it, children may not see and understand how multimodal texts are put together, particularly those they are expected to create. For example, years 3 and 4 children are expected to “plan, create, edit and publish imaginative, informative and persuasive written and multimodal texts, using visual features, appropriate form and layout” (AC9E3LY06; AC9E4LY06) (ACARA, 2023b). This requires teachers to explicitly deconstruct and reconstruct these texts while using a formal metalanguage to help children see and understand the complex ways linguistic, visual, audio, gestural, and spatial modes work together to convey meaning (Zammit, 2015). Children can use their understandings to create multimodal texts, demonstrating their English learning outcomes and multimodal literacy practices.

Therefore, a formal metalanguage must be readily available to teach children how multimodal texts are planned, created, edited, and published (Callow, 2023; Lim et al., 2022). This will help children to become confident users, analysts, designers, and creators of multimodal texts (Serafini, 2023). Some teachers may assume that because most children’s homes, schools, and community lifeworlds are awash with texts that use different meaning-making modes, children are already competent in using, designing, creating, and analysing such texts. However, as cautioned by Zammit (2015), this competence should not be presumed. To comprehend, create, and critically analyse multimodal texts, children need a formal metalanguage to talk about modes and how they merge into an impactful overall form for the reader/viewer. Importantly, this metalanguage needs to be extensive for all modes. Linguistic and visual modes are only two of many communication modes. Jewitt (2009) noted that linguistic, visual, audio, gestural, and spatial modes equally offer potential for meaning-making. Therefore, the national English curriculum should not only identify and define formal metalanguage associated with linguistic and visual modes but rather draw equal attention to all modes.

4 Using formal metalanguage with children

Several studies investigate the development of formal metalanguage to help children engage with, critically think about, and create multimodal constructions. Central to these studies is the use of different texts. Such texts range from children’s picture books (Callow, 2020; Macken-Horarik, 2016) and guided reading books (Aukerman & Chamber Schuldt, 2016) to scientific diagrams (Williams & Tang, 2021) and art, films, and music (Barton, 2021; Barton et al., 2022; Noad & Barton, 2020; Tomlinson, 2016). Video games as text also feature (Lowien, 2016). These studies reveal that teachers use formal metalanguage across different modes. Macken-Horarik’s (2016) case study research, for example, explored how children interpret visual and verbal modes in picture books. The study identified formal metalanguage as a fundamental aspect of this interpretive knowledge. Teachers used talk and guiding questions to bring greater visibility to the visual and verbal metalanguage of the AC: E. Access to this knowledge is considered powerful, as it enables children to move between different modes of meaning-making in interpreting images and language (Callow, 2020; Harrington, 2016; Macken-Horarik, 2016).

Williams and Tang (2021) also highlight the importance of a formal metalanguage as a form of interpretive knowledge. In their case study of a bilingual Year 5 Science class, children explored visual metalanguage as an additional resource to construct scientific explanations. The study identified that specific terms are helpful in raising awareness of the form and function of visual elements. A simple checklist with terminology was used to aid in the introduction and explicit teaching of visual elements. Using the checklist, children explained scientific phenomena through visual representations such as illustrations and diagrams, utilising visual elements such as images, arrows, and lines. Like Macken-Horarik’s research (2016), Williams and Tang (2021) found that talk and guiding questions, alongside the checklist, made the visual elements explicit as children constructed their visual work. This highlights the importance of teachers using talk and questioning to develop interpretive knowledge and associated formal metalanguage, as Geoghegan et al. (2013) emphasised.

Adding to these studies is Barton et al.’s (2022) small-scale study of two arts teachers, an early years teacher and a middle years music teacher. Barton et al. (2022) found that the teachers “intuitively and consistently” (p. 2) used formal metalanguage with children to express and make meaning through visual arts and music practices. This led to children’s fluency across modes of meaning in understanding, interpreting, and creating art and music. The study acknowledged formal metalanguage as a crucial element of this interpretive knowledge, similar to Macken-Horarik (2016) and Williams and Tang (2021). However, what seems to be required is further research on how and to what degree teachers use formal metalanguage to describe, interpret, and understand spatial and gestural meaning-making modes in various texts. This is important because if children are to be effective multimodal text users, analysts, designers, and creators, teachers need a curriculum that focuses on all modes of meaning-making and associated formal metalanguage to help children describe, interpret, and understand how modes communicate meaning within disciplines of learning throughout their schooling lives.

5 Research design and methodology

The data on which this paper draws emanates from a larger global study of teachers’ use of formal metalanguage when using multimodal texts in the early and primary years of schooling. In Australia, the early years extend from birth to year 2 and primary schooling from year 3 to year 6 (Services Australia, 2023). We framed the study using British sociologist Basil Bernstein’s conceptualisations of schools and education. According to Bernstein (1996, 2000), schools, as pedagogic sites, can be considered social, political, and cultural classifiers through the three ‘message systems’ of curriculum, pedagogy, and assessment. Bernstein argues that knowledge related to what needs to be learned is transmitted through these message systems. The data presented in this paper focuses on the message system of pedagogy to answer the research question: What can we learn about the metalanguage teachers use when they talk about multimodal texts in their everyday practice? We used British sociolinguist Norman Fairclough’s (2003) critical discourse analysis (CDA) to analyse teachers’ metalanguage. As Luke (1997) noted, CDA is a useful analytical tool for exploring educational topics related to curriculum, pedagogy, and assessment in schools.

5.1 Participants

Snowball sampling (Coleman, 1958) was used to recruit participants for individual interviews through a professional organisation network. Snowball sampling served as a form of “convenience sampling” (Heckathorn, 2011, p. 357), and data were collected between 2020 and 2021. The study met the ethical requirements of, and was approved by, the Research Ethics Committee of the Queensland University of Technology. Informed written consent was obtained from each participant during recruitment, while ongoing consent was obtained at the start of the interviews to reaffirm their willingness to participate in the study. Interviews were scheduled outside school hours. Each participant was interviewed once for approximately 60 minutes, using Zoom in a semi-structured format. Below are some of the questions, underpinned by Bernstein’s (1996, 2000) message system of pedagogy, that were asked of participants.

  1. What kinds of multimodal texts do you use in your everyday teaching?

  2. What features make them multimodal?

  3. What opportunities do children have to engage with, use and create multimodal texts?

  4. How does the Australian Curriculum: English support your understanding of multimodal texts? Has this understanding changed over time?

  5. What are the affordances and challenges of using multimodal texts in your everyday teaching?

Participants described in this paper are eight Queensland-based teachers, ranging from early career to experienced. All teachers had experience with Australia’s English curriculum. There were five highly experienced educators, each with over 17 years of teaching in a classroom setting. One of the five had experience teaching in an International Baccalaureate school in Scotland and Germany. At the time of data collection, these teachers taught in the early years of formal schooling at metropolitan schools. One teacher had ten years of teaching experience in the early years but was teaching at a regional Kindergarten at the time of data collection. In Queensland, Kindergarten is a part-time program designed for children the year before they start Preparatory, the first year of formal schooling. The remaining two teachers were in the early stages of their careers and had fewer than five years of teaching experience. One teacher taught Preparatory at a metropolitan school, while the other taught a year 3/4 composite class at a regional school.

5.2 Data organisation

Data analysis involved a step-by-step process. Audio-recorded interview data were transcribed, producing 51 pages of transcript. Data were de-identified to protect participants’ identities. To ensure accuracy, transcripts were verified by cross-checking them against the original recordings. Deductive coding began with three pre-determined codes: curriculum, pedagogy, and assessment (Bernstein, 2000). To visually identify each code, we attached colour-coded Post-it notes to chunks of data. Orange Post-it notes represented curriculum, blue represented pedagogy, and green indicated assessment. Next, we used thematic analysis to identify patterns, consolidating several themes within codes of curriculum, pedagogy, and assessment (Braun & Clarke, 2006). As stated earlier in this paper, we focus on data about Bernstein’s message system of pedagogy (1996, 2000). Using pedagogy as a lens, we focused on teachers’ language as they talked about and described the use and creation of multimodal texts. We will now explain the analytic approach used for the textual analysis of teachers’ language.

5.3 Analytic framework

Fairclough’s approach to Critical Discourse Analysis (CDA) was used to investigate teachers’ language associated with multimodal texts in early years or primary classrooms. We were interested in what happens in classrooms as a social event, and the language teachers use to discuss and describe multimodal texts. This involved analysing text, or in the case of this study, teachers’ metalanguage, as an element of the social event. We acknowledge that the individual, 60-minute Zoom interviews offer a limited view of the classroom as a social event. However, the eight interviews do provide insight into teachers’ “attitudes, desires, and values” (Fairclough, 2003, p. 27) as they discussed and described multimodality and multimodal texts in their classrooms. Hence, central to analysing text is recognising a relationship between texts and social events and the participants involved in the event.

To better understand the relationship between texts and participants involved in social events, Fairclough uses a framework consisting of three components: acting, representing, and identifying (Fairclough, 1989). This framework is valuable for analysing how individuals use language and communication to express intentions, ideas, and identities in a social context. First, Fairclough (2003) talks about ways of acting, such as speaking and writing, within a social event or how a text contributes to and situates itself within social interaction. In this study, we look at the sentence as an action. We examine the semantic and grammatical relations between sentences and clauses. We also look at types of exchanges and grammatical moods to emphasise a way of acting. Next, ways of representing are essential to understanding how language and discourse shape worldviews. Our analysis focuses on collocations and metaphors to reveal potential hidden ideologies and power relations. Finally, Fairclough’s ways of being, or identifying, refer to how people use language and interact with each other in social situations. This study examines mood, modality, and vocabulary use (Fairclough, 2003). Together, these components act as an analytic lens, supporting an exploration of acting, representing, and being to better understand the social perspective on the details of the text (see Table 1).

Table 1 Examples of textual analysis

6 Findings and discussion

With the explicit call by Australia’s national English curriculum for children to use different modes to make meaning, we wondered if teachers’ metalanguage when talking about multimodal texts was just as clear. In answering the overarching research question, “What can we learn about the metalanguage teachers use when they talk about multimodal texts in their everyday practice?”, three main findings were evident. First, teachers incorporate multimodal texts in their literacy practices and feel confident utilising and creating them for children’s literacy development. Second, teachers’ use of metalanguage displayed tentativeness and uncertainty when talking about multimodal texts. Third, teachers use more informal, everyday language than formal metalanguage when talking about multimodality and multimodal texts. These findings were relevant to the teaching experience of the early years and primary teachers. Interestingly, there was no significant difference in teachers’ talk, regardless of the year level taught or the number of years of teaching experience. The significance of these findings is that if children are to be effective multimodal text users, analysts, designers, and creators, teachers need a formal metalanguage across all modes in Australia’s national English curriculum.

6.1 Using and creating multimodal texts

When we listened to teachers, they talked about using and creating multimodal texts in their literacy practice. Texts included, but were not limited to, picture books, videos of felt stories, roleplay and skit performances, class blogs, phonics songs and videos, e-books, and iPad apps such as Scratch, Explain Everything, Seesaw, and Reading Eggs. The texts used were paper-based, live, and digital. Teachers also talked about using different multimodal texts previously unfamiliar to them. For example, River, a year 1 teacher, talked about using OneNote with children to plan, organise, and share ideas.

We had to learn this whole new OneNote. And for us as lower teachers, we didn’t know OneNote. My kids had no idea what OneNote is.

From all eight teachers, it was evident that multimodal texts, both familiar and unfamiliar, were a part of their literacy practice. Moreover, some teachers mentioned that using some of the multimodal texts permitted parents/caregivers insight into their child’s literacy learning. River told of a parent/caregiver who expressed surprise upon seeing their child using OneNote tools to sketch and annotate a diagram, commenting, ‘I didn’t think they’d be capable of doing that’. Teachers’ talk reveals that they meet curriculum expectations for literacy teaching by using texts that combine written and oral meaning-making with other modes (ACARA, 2023a).

Teachers talked about working in collaborative partnerships with colleagues to find, use, share, create, and record multimodal texts. Examples of texts included online texts such as YouTube videos, Storyline Online, and Book Creator, as well as teacher-created texts such as pre-recorded morning segments or videos introducing a phonics focus. All eight teachers’ linguistic choices revealed a sense of solidarity with colleagues through personal pronouns. The use of ‘we’, ‘we’ve’, ‘our’, and ‘us’ implies a form of community-based learning, or what Fairclough (2003) calls a “we-community” (p. 150). Teachers described ‘working together’ with colleagues to develop ways of involving children in using and creating multimodal texts. This was demonstrated using emotive vocabulary, for example, ‘teamwork’, ‘coming together’, ‘worked together’, and ‘sharing between teachers’. Teachers’ expressions of ‘pretty cool’ and ‘very good’ revealed how teachers felt about sharing ideas and expertise with their colleagues. Bolded personal pronouns and underlined emotive vocabulary illustrate examples of teachers’ collaborative efforts in developing and using multimodal texts.

So, our team worked together, and we sent home procedures, a fairy bread procedure, and how to make a paper plate bug. [Flick]

We did a pre-recorded video explaining concepts…we did a phonics video each week. [Jax]

We found some really great phonics-type videos and songs that we’ve all continued to use. [River]

….but these two teachers were very good at supporting us all the time. [Ellis]

We had to make it all happen quite quickly last term….there’s more things like teamwork, coming together, cause [sic] it was a group project. [Saskia]

Additionally, teachers’ linguistic choices revealed a sense of comfort or ease in working autonomously when exploring digital online platforms such as Storyline Online, YouTube, OneNote, and iMovie. This was apparent in teachers’ use of personal pronouns. According to Fairclough (2003), the use of ‘I’ contributes to “activation” (p. 150). Activation can denote the traditional grammatical definition of activation or active voice. Therefore, when using the personal pronoun ‘I’, teachers demonstrated their capability to act and make things happen, as noted by Fairclough (2003, p. 150). At various times in interviews, all teachers expressed agency through the personal pronoun ‘I’. Bolded personal pronouns highlight teachers’ willingness to take responsibility for developing and using multimodal texts, as shown in the following comments.

I spent a lot of time looking for links to good readings of texts and finding online things for children, especially for home reading and guided reading. [Saskia]

I had the audio on OneNote. So I made sure that I did the audio as well because they needed to hear my voice. [Jax]

….so, then I needed to learn how to cut the music, and how to put videos together, so I sourced all of that information. [River]

….the stuff I found was amazing. And I thought wow, I can be using this. [Ellis]

Interestingly, the personal pronoun ‘I’ in conjunction with adverbs of intensification such as ‘really’, ‘quite’, and ‘definitely’ suggested confidence in teachers’ statements. These intensifiers, or mood adjuncts, demonstrate a degree of finality and totality (Halliday & Matthiessen, 2004). Teachers’ talk revealed assuredness in their ideas and the desire for their audience to understand their decisions or viewpoints. For instance, when talking about the role of digital online platforms such as OneNote and Seesaw in connecting with children and their parents/caregivers, a teacher commented, ‘…it was a really good way of connecting all the kids’. In this case, ‘good’ implied success, but the addition of ‘really’ intensified that success, forming the collocation ‘really good.’ Other collocations using the adverb ‘really’ included ‘really cool’ and ‘really keen.’

6.2 Tentativeness around multimodality and multimodal texts

Most teachers showed tentativeness and uncertainty when explaining multimodality and multimodal texts through their language choices. When asked to explain multimodality, five out of eight teachers’ responses included markers of modalisation of uncertainty or doubt (Fairclough, 2003). Mental process clauses such as ‘I think’, ‘I feel’, ‘I guess’ and modality words such as ‘might’ and ‘may’ showed unsureness or uncertainty. According to Liu and Fox Tree (2012), these two hedging devices are used when a speaker is not confident or comfortable in their answer. Lakoff (1973) and Jucker et al. (2003) agree that hedging is a measure of vagueness. The following comments are illustrative.

Um, I think it’s… It almost aligns more with the general capabilities with underpinning not necessarily doing an hour a week of multimodal stuff, it’s embedded... [Remy]

….that’s what I think it is, I may be wrong here but, I think, that’s my take on it anyway, I hope that’s the right answer. [Logan]

I think yeah….anyway….I’m not sure. [Flick]

I don’t know. I guess that it’s layered…multidimensional. [Ellis]

There were some instances where teachers’ language demonstrated some understanding of multimodality, such as ‘speaking and listening’, ‘sound and pictures’, or ‘written and non-written’. However, the language used indicated simplified thinking. In other words, teachers’ linguistic choices did not reflect the complex ways linguistic, visual, audio, gestural, and spatial modes work together to convey meaning.

When asked to explain multimodal texts and their features, some teachers’ linguistic choices included direct and declarative ‘statements of fact’, suggesting general confidence (Fairclough, 2003, p. 109). However, the statements often did not address the combination of different modes to explain varied meanings across text, image, space, body, sound, and speech. Furthermore, some comments seemed to suggest a possible misinterpretation of multimodal texts. The comments below serve as examples.

It’s not really a kindy thing [River]

…multimodal is trying to connect into your five senses [Parker]

…it was called functional grammar [Ellis]

…different genres, the way to write about one topic [Saskia]

I don’t know, do you have a list that I could choose from? [Jax]

Similar to the linguistic choices around multimodality, teachers also used mental process clauses such as ‘I think’ and ‘I guess’ and hedging words such as ‘maybe’, ‘pretty sure’, and ‘pretty positive’ when talking about multimodal texts. This indicated a certain level of uncertainty regarding multimodal texts and their features. Alternatively, it could signify a reluctance to make a definitive statement of fact. It seems the modal adverbs ‘probably’ and ‘entirely’ were used to find the safe middle ground between the two extremes of ‘I know’ and ‘I do not know’. Here are some examples of statements that show uncertainty or tentativeness through the use of markers of modalisation.

Ummm… I’m probably going on a tangent here but…. [Logan]

I think, yeah, so instead of, I believe. Anyway…. I’m not entirely sure [Remy]

I’m pretty sure, I’m pretty positive. [Saskia]

Well, my understanding I guess is… [Parker]

I don’t know, from the top of my head, whether I’m right or wrong. [Jax]

I may be wrong here…. [Flick]

6.3 Use of everyday language

To direct teachers’ attention to the different meaning-making aspects of multimodal texts, we asked the question: How do children show their learning? This question invited teachers to use formal language to describe how children use, analyse, design, and create multimodal texts to show their learning. Three teachers used the formal terms ‘written’ and ‘visual’, but none used ‘linguistic’, ‘audio’, ‘gestural’, or ‘spatial’. While explaining and describing how children showed their learning, the eight teachers predominantly used everyday language to refer to most modes of making meaning. In other words, teachers preferred to use informal, conversational language. Teachers’ preference for everyday language may implicitly suggest a greater comfort or ease with this type of language. As Fairclough (2003) reminds us, “what is ‘said’ in a text always rests upon ‘unsaid’ assumptions” (p. 11).

Teachers’ everyday language was not specific to the professional knowledge embodied in Australia’s national English curriculum or scholarly literature. Instead, everyday language was used in place of a formal metalanguage, as outlined in Table 2. Interestingly, the teachers used neither everyday language nor formal metalanguage for the spatial meaning-making mode. When explaining and describing a role-play activity, Ellis used everyday language such as ‘act it out’ and ‘dramatise things’, but did not use formal metalanguage connected with the gestural mode such as facial expressions, hand gestures, arm positions, and body language. Likewise, River used everyday language such as ‘draw’, ‘take photos’, and ‘voice it’ to explain and describe a text transformation activity where children innovated on the picture book The Rainbow Fish (Pfister, 1992). In these cases, teachers implicitly conveyed their understanding of gestural, audio, and visual modes without a formal metalanguage.

Table 2 Examples of teachers’ everyday language associated with modes

Furthermore, when teachers used everyday language to explain and describe modes of making meaning, most relied on lexical cohesion, reiterating near-synonyms for modes. Fairclough (2003) uses the term synonymy to mean ‘sameness of meaning’ (p. 53). Phrases such as ‘different ways’ and ‘other ways’ are everyday language examples of lexical cohesion via the reiteration of near-synonyms instead of the formal term, modes. For example, Saskia talked about using the picture book Wombat Stew (Vaughan & Lofts, 2014) to teach procedural text writing, and used everyday language to explain and describe children’s multimodal learning experiences.

The procedural text, it might be Wombat Stew or something, and then they’ll act it out…and they write about it, they do a report on it. They do it in different ways, they’re understanding it in different ways. [Saskia]

Parker and Ellis talked about supporting children’s literacy development using multiple modes. Parker used ‘different ways’ as a near-synonym for modes, while Ellis used ‘different layers’.

I make sure that I cover a lot of different ways to deliver that message… I know I have a huge range in my class… like a vast range of learning ability and styles… I need to be able to deliver it in other ways…either a PowerPoint, or a song, or hands-on. [Parker]

The different layers to what they do makes it that multimodal text. And I guess that’s exciting for children as well. To be able to do lots of different things and communicate in different ways. [Ellis]

Most teachers used synonymy as a function to create everyday language variations of the idea of linguistic, visual, audio, gestural, and spatial meaning-making modes.

7 Implications and conclusion

This study investigated early years and primary teachers’ language use when talking about pedagogic work with multimodal texts. This is timely given that the most recent version of Australia’s national English curriculum (AC: E) asks that children engage in and work with spoken and written texts and visual and multimodal texts (ACARA, 2023b). Findings suggest that teachers use multimodal texts in everyday pedagogy and literacy practice (Barton et al., 2022; Callow, 2020; Macken-Horarik, 2016; Williams & Tang, 2021). However, there is tentativeness and uncertainty when they discuss multimodality and multimodal texts. Teachers typically use everyday language instead of formal metalanguage. The consistency of language use among the eight teachers, regardless of their years of teaching experience or the grade level they teach, is significant. It highlights the potential impact of Australia’s English curriculum in shaping how teachers talk about multimodality and multimodal texts in their everyday literacy practice.

Therefore, careful attention is required to the impact of Australia’s English curriculum in teaching children to be sophisticated multimodal text users, analysts, designers, and creators. Our findings raise the broad question of the impact of teachers not having a formal metalanguage to discuss, create, and analyse text meanings with children. This is significant because children having the language to engage with and be critical analysts of texts is pivotal for their effective and lifelong learning and participation in society (ACARA, 2023c). Hence, the English curriculum glossary’s limited formal metalanguage associated with visual and, particularly, gestural, audio, and spatial modes will likely perpetuate teachers’ tentativeness and uncertainty with a formal language to talk about multimodality and multimodal texts. Such a limited representation of formal metalanguage fails to acknowledge teachers’ crucial role in explicitly teaching children how modes work in multimodal texts (Serafini, 2023). Continuing this limited representation risks disempowering children, limiting their perceptions of themselves as confident and competent users, analysts, designers, and creators of multimodal texts.

Tensions emerge for teachers seeking to be skilled and competent teachers of multimodal literacy, due to the lack of specificity of formal metalanguage across all modes in Australia’s English curriculum. Without specificity, teachers’ ability to teach children to comprehend, create, and critically analyse texts is narrowed. An overreliance on formal metalanguage associated with linguistic and visual modes constrains teachers’ talk about multimodality and multimodal texts. Continued skewing of metalanguage limits teachers’ ability to explicitly teach children how to connect language learning and interpretative knowledge (Macken-Horarik, 2016). Hence, children may not have the formal metalanguage to interrogate how modes work together to create subtle hints of persuasion and potentially reinforce gender or cultural stereotypes. As a result, children’s understanding of how texts work may remain implicit instead of explicit. Educational implications are significant if we want children to be more aware of the effect of passively viewing texts without thinking critically about the messages directed at them in their daily lifeworlds (Harrington, 2016; Serafini, 2015).

The findings of this small-scale study must be viewed in light of some limitations. The first relates to the small sample size of eight teachers. As stated earlier, this research was conducted in late 2020 amidst the global COVID-19 pandemic, and accessing teachers and their classrooms proved challenging as they faced unprecedented pressure in their work. The second limitation is the limited time available with the eight teachers. Only one 60-minute interview was held with each teacher, and no work samples from children were collected. Thus, the language teachers used in this study may or may not reflect language in other learning contexts or schooling systems. However, the findings prompt the recommendation to continue research into the language associated with teaching children how modes convey meaning in a text in a classroom setting. Consideration should also be given to including formal metalanguage for all modes in Australia’s English curriculum, with equal explicitness and rigour. This is important if we want children to be confident and competent in using, analysing, designing, and creating multimodal texts.