Based on a general conceptual framework for analyzing communication and a specific definition of narration, it is now possible to pinpoint certain basic traits of media products that are significant for both communication at large and narration in particular. However important the surrounding factors of communication may be—discussed above in terms of collateral experience, gestalts, and schemata—it is ultimately the more inherent factors of media products that trigger the mind-work of communication and, to some extent, determine how and to what degree narration may be realized in various media forms. It is clear that one and the same perceiving mind, harboring a certain set of knowledge, experiences, values, memories, and schemata, will interpret different media products in very different ways even if they are perceived in comparable circumstances. This is obviously because the media products are unlike in various ways and because the divergences are highly relevant. In order to understand how narratives can be communicated by dissimilar media types, one must first understand the fundamental similarities and differences among media types and the extent to which these differences matter. Those are the issues to be explored in this chapter.

Degrees of Narrativity

We have already noted that media characteristics may be transmedial to lower or higher degrees. Transmedial capacities are molded by certain basic media traits, which means that different media characteristics may depend on different basic media traits. Narration is one of many transmedial media characteristics, and the question is to what extent narratives depend on certain basic media traits. This question cannot be answered in a straightforward and definite way for the simple reason that narratives, notwithstanding elaborate definitions, do not constitute a clear-cut group of virtual spheres. Furthermore, narratives that are realized by media products belonging to the same media type may differ greatly. The notions of event and meaningful temporal interrelations allow for varieties that are large enough to create a span of narratives, even within one and the same media type. It is therefore not self-evident that different narratives within one and the same media type depend on exactly the same basic media traits. Additionally, media types overlap extensively regarding their basic media traits, and it is not even certain that a given media product can be classified successfully. In the end, one must realize that there is, on the one hand, a broad spectrum of individual virtual spheres that can be perceived as more or less narrative in partly dissimilar ways, and, on the other hand, a wide range of partly overlapping media types that have more or less narrative potential depending on their basic media traits.

Therefore, along with Seymour Chatman (1978), Marie-Laure Ryan (2006), Werner Wolf (2017), and many others (although these researchers rely on different kinds of theoretical arguments), I emphasize that narration is present in various degrees in different media products. This has become a broadly accepted concept within narratology. I also agree with the majority of researchers of transmedial narration that even if many media types can narrate, they cannot do it to the same degree; as Ranta put it: “narratives may be manifested in various genres or media, and meaning bearers of various kinds may be more or less narrative. Narrativity can thus be seen as a matter of degree rather than kind” (Ranta 2013: 3; cf. Herman 2004). Although the degree of perceived narration can sometimes, in the case of specific encounters with particular media products, be explained by surrounding factors of communication such as general background knowledge and cognitive schemata, overall narrative differences among media types cannot be explained in this way (although media-specific background knowledge may self-evidently sometimes be crucial for perceiving narration in a certain media type). Whereas general background knowledge and cognitive schemata are relevant for the perception of all media types, they cannot explain why narration is realized differently in dissimilar media types. The differences in the kind and degree of narration in various media forms have their primary origin in more specific, basic media traits.

However, in order to track down basic media traits that allow for interpretations in terms of degrees of narrativity, it is not sufficient to consider only the traditional range of loosely demarcated media conceptions: literature, text, image, music, visual art, comics, television news, film, speech, and so forth. For instance, I would argue that it is not sufficiently precise to discuss literature as a narrative medium: there is a large difference between visual and auditory literature, and even if one sticks to visual, written literature, there are considerable differences among, say, a classical nineteenth-century novel, a postmodern novel, and a short poem. On the other hand, written, artistic literature has many basic features in common with other forms of visual, verbal media types such as pieces of journalism, personal letters, scientific articles, and even simple manuals. Furthermore, dichotomies such as text versus image and verbal versus visual are too vague to be operational. Even if they are specified, the notions of, say, a written, verbal text and a visual, two-dimensional image are very inclusive and incorporate several basic media traits that partly overlap (such as visuality). Thus, the dichotomy of text versus image obstructs the clarification of relevant media similarities and differences. The equally widespread opposition between verbal and visual media types is simply a false and hence utterly misleading dichotomy. Whereas the verbal is related to semiosis (namely, the use of language and a specific way of making meaning through a specific form of signs, that is, symbols), visuality is a form of perception. The dichotomy of verbal and visual media types is as warped as a dichotomy consisting of green cars on the one hand and fast cars on the other.

To avoid such confusions, I advocate a more fine-grained and systematic way of describing and analyzing media similarities and differences. My contention is that media share basic traits that must be theoretically isolated in order to be clearly visible. To find out how narration can be understood as a transmedial concept, yet realized in partly different ways and degrees by different forms of media, one must get back to basics. Werner Wolf (2011: 170–173) took a step in that direction, but I will follow a model for intermedial relations that I have already developed (Elleström 2010). I propose that what I call modalities of media can be used as a framework for comparing narrative capacities. A modality shall be understood as a category of related basic and universal media features.

Thus, I suggest that all media products, without exception, can be analyzed in terms of four kinds of basic traits: four media modalities. As postulated earlier, media products are the entities through which cognitive import is shared in communication. The perception of media products is deeply entangled with cognitive operations that may broadly be called semiosis. I have already discussed this process of transferring cognitive import among minds in terms of mediation and representation, that is, the presemiotic and the semiotic. The concept of mediation highlights the material realization of the medium and the concept of representation highlights the semiotic conception of the medium.

The Presemiotic Modalities

Accordingly, three of the four media modalities should be understood as presemiotic, which means that they cover media traits that are involved in signification—the creation of cognitive import in the perceiver’s mind—although they are not semiotic qualities in themselves. Thus, the three modalities are not asemiotic; they are presemiotic, meaning that the traits that they cover are bound to become part of semiosis as soon as communication is established. The presemiotic traits concern the fundamentals of mediation, which means that they are necessary conditions for any media product to be realized in the outer world, and so for any communication to be brought about.

The three presemiotic media modalities are the material modality, the spatiotemporal modality, and the sensorial modality. Media products are all material in the plain sense that they may be, for instance, solid or non-solid, or organic or inorganic, and comparable traits like these—comparable modes of the modalities—belong to the material modality. It is also the case that all media products have spatiotemporal traits, which means that such products that do not have at least either spatial or temporal extension are inconceivable; hence, the spatiotemporal modality consists of comparable modes such as temporality, stasis, two-dimensional spatiality, and three-dimensional spatiality. Furthermore, media products must reach the mind through at least one sense; hence, sensory perception is the common denominator of the media traits belonging to the sensorial modality—media products may be visual, auditory, tactile, and so forth.

A thorough understanding of the conditions for mediation requires systematic attention to all three presemiotic modalities. It is clear that cognitive import of any sort cannot be freely mediated by any kind of material, spatiotemporal, and sensorial modes. To provide some rather obvious examples, complex assertions cannot easily be transmitted through the sense of smell, and it is more difficult to effectively transmit a detailed series of visual events through a static media product than through a temporal media product.

The Semiotic Modality

The fourth media modality is the semiotic modality that covers media modes concerning representation rather than mediation. Whereas the semiotic modes of a media product are less palpable than the presemiotic ones, and are in fact entirely derived from them (because different kinds of mediation have different kinds of semiotic potential), they are equally essential to realizing communication. The mediated sensory configurations of a media product do not transfer any cognitive import until the perceiver’s mind comprehends them as signs. In other words, the sensations are meaningless until they are understood as representing something through unconscious or conscious interpretation. It follows that all physical objects and phenomena that act as media products have semiotic traits by definition.

By far the most successful effort to define the basic ways to create sense in terms of signs is Peirce’s foundational trichotomy: icon, index, and symbol. These three sign types are defined on the basis of the representamen–object relationship and can be understood as fundamental cognitive abilities. Icons represent objects on the ground of similarity; they stand for something, they make some object present to the mind because of a perceived similarity between representamen and object. Indices stand for objects on the ground of contiguity or, more precisely, real connections. Symbols represent objects on the ground of conventions or, more generally, habits (1932, CP2.247–249 [c.1903]; Elleström 2014b: 98–113). The same object, such as a steam engine, can often be partly or fully represented by different kinds of signs: one may imitate the sounds and movements of a steam engine and hence form icons of it; one may point to a present steam engine or in other ways direct the attention to the smoke hovering over a railway track and thus create indices of it; or one may simply say ‘steam engine’ in order to produce a symbol of it. Importantly, not every perceived similarity, real connection, or habit necessarily leads to representation. For instance, one may note the visual similarity between two newspaper columns without construing one of them to be an iconic sign of the other. Again, signs must be understood as dynamic sign functions, not as static entities or automatic mental responses.

I take iconicity, indexicality, and symbolicity to be the main media modes within the semiotic modality, which is to say that no communication occurs unless cognitive import is created through at least one of the three sign types (icons, indices, and symbols). They are normally mixed in various ways. As with presemiotic modes, the semiotic modes of a media product offer certain possibilities and set some restrictions. Obviously, cognitive import of any sort cannot be freely created on the basis of just any sign type. For instance, auditory iconic signs (such as in music) can represent complex feelings and motional structures that are probably largely inaccessible to the symbolic signs of written text; conversely, written symbolic signs can represent arguments and the appearance of visual items with much greater accuracy than auditory icons. Obvious examples like these are only the tip of the iceberg in terms of the various (in)capacities of signs based on similarity, real connections, and habits. Therefore, communicative transfer of cognitive import through media products is made possible—but also profoundly limited—by the semiotic traits of the medium. Whereas these semiotic traits are not as definite as the presemiotic ones, they are always somehow anchored in the physical appearance of media products.

Therefore, I argue that a semiotic perspective must be combined with a presemiotic perspective. Communication at large, as well as the specific case of narration, is equally dependent on the presemiotic media modalities and the semiotic modality. What we take to be represented objects called forth by representamens or signs (separate objects such as persons, things, events, actions, feelings, ideas, desires, and conditions, and composite objects such as interrelated events in narratives) are results of both the basic features of the physical media product as such (the mediated material, spatiotemporal, and sensorial modes) and of cognitive activity (resulting in representation). While signification is ultimately about mind-work, in the case of communication this mind-work is fundamentally dependent on the physical appearance of the media product. Having said that, some semiosis is clearly more closely tied to the appearance of the medium, whereas other semiosis is more a result of interpretation, and therefore the setting of the perceiving mind.

Thus, the most fundamental restraining and releasing factors of communication are to be found in the basic presemiotic and semiotic modes of the media products. Many exceedingly complex factors are clearly involved when the perceiver’s mind forms cognitive import. My proposed model highlights one cluster of crucial factors in particular: media products have partly similar and partly dissimilar material, spatiotemporal, sensorial, and even semiotic modes, and the combination of modes partly determines what kinds of cognitive import can be transferred from the producer’s mind to the perceiver’s mind. Songs, emails, photographs, gestures, films, caresses, and advertisements differ in various ways concerning their presemiotic and semiotic modes and can therefore only transfer the same sort of cognitive import to a limited extent. Consequently, their narrative capacities differ.

Basic and Qualified Media Types

Up to this point, I have discussed the notion of media types in an unspecific way. The analytical framework of four media modalities makes it possible to now conceptualize the categorization of media with some accuracy. Although each media product is unique, thinking species such as humans feel the need to categorize things so that we can navigate in the world and communicate efficiently. We also categorize media products and, as is often the case with classification in general, our media categories are usually quite fluid.

However, some categories are more solid and stable than others because they depend on less variable factors. Therefore, I find it helpful to work with the two complementary concepts of basic media types and qualified media types (Elleström 2010: 24–27). Sometimes one mainly pays attention to the most basic features of media products and classifies them according to their most salient material, spatiotemporal, sensorial, and semiotic properties. For instance, we think in terms of still images (most often understood as tangible, flat, static, visual, and mainly iconic media products). This is what I call a basic medium (a basic type of media product) and it is relatively stable. However, such a basic classification is sometimes not enough to capture more specific media properties of interest. Therefore, one qualifies the definition of the media type in question and adds criteria that lie beyond the basic media modalities. One also includes all kinds of aspects of how the media products are produced, used, and evaluated in the world, and how they are situated in geography, history, and culture. One may wish to delimit the focus to still images that are, say, handmade by very young people; that is, children’s drawings. This is what I call a qualified medium (a qualified type of media product) and it is more fluid than the basic medium of still image simply because the added criteria are optional and more variable than those captured by the media modalities. For instance, it may be difficult to agree on what a handmade drawing actually is: should drawings made on computers or scribble on the wall be included? And when does a child become a young adult rather than a child? The notion of childhood varies significantly among cultures and also changes over time, not to mention the individual differences in maturity. Thus, the limits of qualified media types are bound to be ambivalent, debated, and changed much more than the limits of basic media types.

Basic media include classes like still images (solid, flat, static, visual, and mainly iconic media products), written verbal texts (solid, flat, static, visual, and mainly symbolic media products), moving images (solid, flat, temporal, visual, and mainly iconic media products), and spoken verbal texts (non-solid, temporal, auditory, and mainly symbolic media products). There are many basic media types that we have no proper names for in everyday language. Qualified media include classes such as political speech, music, instruction manuals, sculpture, television programs, emails, and news articles. As qualified media types may be qualified in many different ways, and as they are often requalified as time passes, they not only overlap in intricate ways but may also emerge, change, and fade away.

The distinction between basic and qualified media helps us realize that the concept of transmedial narratology is not as straightforward as one might think. Early in this treatise, I described the concept of transmedial narration, in its most general sense, as the idea that a multitude of different media types share traits that render them narrative capacities. Although still valid, this notion turns out to be more complex than expected. Investigating narrative capacities of dissimilar media types must include at least two stages, for the simple reason that there are different kinds of media types. Consequently, the distinction between basic and qualified media allows for a more methodical approach to transmedial narration.

This is what I suggest: Instead of immediately comparing a broad variety of different kinds of media types, such as the narrative potential of comics, written texts, computer games, literature, music, images, speech and gestures, and so forth (comparisons that tend to become rather specific), one should begin by comparing the basic media traits: what is the role for narration, if any, of the material, spatiotemporal, sensorial, and semiotic modes of media modalities? Such comparisons can be expected to result in a more fundamental and wide-ranging understanding of similarities and differences in narrative capacities among media types in general. This initial query, framed by the notion of basic media types, will be pursued in Part II of the treatise, where the core characteristics of narration are scrutinized. After such an investigation of those basic media traits, which brings together all media types onto a common conceptual platform, investigations and comparisons of qualified media types can be made. As qualified media types are much more restricted than basic media types, such comparisons are likely to result in a narrower, but also more detailed, understanding of similarities and differences in narrative capacities among media. This will be tried out in Part III of the treatise. Needless to say, only a very limited number of exemplifying comparisons can be made there, although the instances are chosen to illustrate transmedial narration in a broad spectrum of qualified media types.

The Overall Relevance of Media Modalities for Narration

Before finishing this last chapter of Part I, I will provide an initial overview of the role of media modalities for narration, as preparation for the more specific investigations in Part II. Although differences in modality modes are largely responsible for differences in kind and degree of narration in various media forms, examining them does not offer a convenient shortcut to full understanding. Consequently, this section will not provide any easy answers to the questions that are raised by transmedial narration. Thinking in terms of media modalities is not a quick fix. The basic presemiotic and semiotic traits are always embedded in complex surroundings, so they generally need to be analyzed in their interactions with each other and with additional factors. Nevertheless, modeling narration in terms of media modalities facilitates a methodical approach to the issue of transmediality. Having different material, spatiotemporal, and sensorial modes implies having partly dissimilar capacities for narration and, by the same token, the use of different sign types has consequences for narration.

The material modality is perhaps the least crucial category of media traits for determining narrative capacities. Solid media products such as written verbal texts, as well as non-solid media products such as spoken verbal texts, clearly have very high narrative capacity, as decades of intense research have demonstrated. Furthermore, organic media products such as moving human bodies, as well as inorganic media products such as dolls in motion, may form complex narratives.

The spatiotemporal modality is much more critical for narration. This is because the scaffolding core of narratives consists of represented events that are temporally interrelated. The key question then becomes the extent to which the representation of a temporal object requires a representamen with certain spatiotemporal qualities. There is not much to indicate that media products should have specific spatial traits in order to be able to narrate successfully. Moving human bodies and dolls in motion are three-dimensional and, indeed, very suitable for narration. Written verbal texts are two-dimensional, but also potentially superbly narrative media products. Spoken verbal texts emanating from a singular source are spatial only in a limited way, but are still well suited for narration.

However, there are some relevant differences between temporal and static media products. Moving images that are inherently temporal may effortlessly represent sequences of events and hence also elaborate narratives. This is not to say that the represented events are necessarily understood to be interrelated in precise accordance with the temporal unfolding of the media product. In contrast, still images are, by definition, static and are thus incapable of representing events that are inescapably perceived in a certain temporal order. This is not the same as being incapable of representing temporally interrelated events; it only means that the scope of possibly represented events is reduced (assuming that the size of the still image is not huge) and that the perception of possibly interrelated represented events is not strongly directed by the physical interface of the media product.

Nevertheless, the difference in spatiotemporal modes reduces the narrative potentiality of still images compared to moving images, at least if one considers media products constituted by single still images. However, it is possible to construe media products consisting of a whole set of still images. Whereas this does not in itself enhance the narrative capacity, it opens up the possibility of using a special kind of symbolic element, namely the convention of sequential decoding. Perceivers who have learnt to process parts of certain kinds of static media products in a regulated order may distinguish represented events in temporal sequences that are as stable as those produced by media products that are physically temporal.

This line of reasoning is also applicable to the difference between spoken verbal texts and written verbal texts: the distinction between temporal and static media products cuts through both images and verbal texts. Spoken verbal texts are temporal because the sensory configurations of such media products constantly change as time passes; written verbal texts are static because the sensory configurations of such media products remain the same from one moment to the other (unless, of course, the text is perceived while it is being written or is a part of a temporal, visual media product such as a film). This means that spoken verbal texts, just like moving images, readily represent sequences of events (given that a certain volume of temporal extension is allowed for) and may therefore produce intricate narratives. In contrast, written verbal texts are normally static, and if we think of written verbal texts in rough analogy with solitary still images (namely, as consisting of single entities such as one letter or one word), written verbal texts are equally handicapped when it comes to representing events that are inevitably perceived in a certain temporal order. In the case of language, however, the convention of sequential decoding is so strong that written verbal texts are normally understood to consist of large sets of subordinate symbols that are bound to be decoded in a manner that is highly regular. As in the case of sequential decoding of still images, this may lead to the discernment of represented events that are temporally interrelated in a manner that is as stable as those formed by physically temporal media products. This is why so many researchers claim (misleadingly, I would argue) that written verbal texts are temporal. Such a conception obscures the differences among the physical appearance of the representamen (the traits of the media product), the process of perceiving the physical appearance of the representamen, and the virtual appearance of the object (the traits of the virtual sphere).

Thus, the fact that all kinds of media are perceived in time has some bearing on the capacity of representing temporally interrelated events: conventionalized orders of decoding may strongly enhance the narrative capacity of static media types. However, this does not erase the substantial differences between inherently temporal and static media.

The sensorial modality also plays a role for the narrative capacity of media products. This is mostly because the senses (understood here as the external senses) are not developed cognitively to the same degree. Sight and hearing are our two most advanced senses, in that they are strongly connected to complex cognitive functions such as knowledge, attention, memory, and reasoning. This means that sight and hearing are both well suited for narration. It is no coincidence that virtually all examples of narration in this treatise have so far included either the visual or the auditory sensory mode.

However, this does not exclude the other senses. The faculty of touch may be used for reading braille, for instance, or sensing the forms of reliefs and three-dimensional figures forming narratives. It is also fully possible to consider series of interpersonal touches that form casual, narrative media products. Children playing and adults having sex may well communicate elementary narratives by way of sequences of touches that are performed and located differently.

I presume that it would also be possible, in principle, to construe language systems mediated by taste or smell. In practice, however, they would probably be rather inefficient as a speedy decoding of symbols requires quickly performed sensory discriminations. However, taste and smell can no doubt be used to create at least rudimentary narratives. A well-planned meal with several courses served in a certain order may be construed as narrative to the extent that tastes and taste combinations may be developed, changed, and contrasted in such a manner that gives a sense of meaningfully interrelated events. A series of scents may be presented in such a way that represents, say, a journey from the city through the woods and to the sea, including encounters with people and animals with smells that reveal certain activities.

The three main modes of the semiotic modality are iconicity (based on similarity), indexicality (based on contiguity), and symbolicity (based on habits). All of these semiotic modes are immensely important for the realization of narration. Among the more widely acknowledged basic media types that are reasonably well defined and have accepted names in ordinary language, a majority are saliently dominated by iconicity or symbolicity. Most of the recent examples of potentially narrative media types can clearly be characterized by a semiotic hallmark. Verbal texts, whether they are visual, auditory, or tactile, rely heavily (although certainly not exclusively) on symbolicity: the conventional meaning of letters, sounds, words, and so forth. Moving and still images, whether they are visual, auditory, or tactile, are understood to signify primarily through iconicity, based on perceived similarities between representamens and objects. Although series of touches, tastes, and scents are hardly acknowledged as media types in common parlance, a case could be made for recognizing them as basic media types dominated by indexicality: real connections between the perceived sensory configurations and what they stand for.

Furthermore, indexicality is an especially important semiotic mode for narration because it creates both internal coherence and external truthfulness (see Chaps. 8 and 9). Early on, Roland Barthes used the notion of index to frame some features of narration, but within a conceptual framework that differs fundamentally from mine (1977 [1966]).

For the sake of clarity, I have tried to isolate the possible contributions of various media modes to narration. By highlighting modal differences, it is possible to discern media traits that contribute to the gradability of narration. However, media products are normally more or less multimodal—in very different ways—which makes the above generalizations fuzzier, the differences among media types more subtle, and the issue of transmedial narration more multifaceted. What the model of media modalities can offer is not so much a lexicon of transmedial narrative capacities as a methodical approach to examining narration in a wealth of dissimilar media products and media types. In each specific media product and media type, the present modes of the modalities add, in profound interaction, to the forming of virtual spheres and possibly narratives. In a certain media product, the various presemiotic modes all contribute to forming certain sensory configurations: a cluster of physical representamens that together come to represent—iconically, indexically, or symbolically—a certain cluster of objects that possibly forms a narrative.

Therefore, I support Karin Kukkonen’s conclusion that “[i]f, with Ryan, we understand narrative as a cognitive construct, different modes in multimodal media work together to provide the reader with clues to fill gaps and formulate hypotheses” (Kukkonen 2011: 40). Importantly, however, I go beyond the rather coarse notion of mode used by Kukkonen and in so-called social semiotics in general: modes understood as text, image, gesture, and so forth. In the present treatise, multimodality is a more fine-grained concept that can be more precisely circumscribed as four kinds of multimodality: multimateriality, multispatiotemporality, multisensoriality, and multisemioticity. As already stated, it is more the rule than the exception that actual media products and media types have many modes of one and the same modality. For instance, media products that consist of both organic and inorganic materiality are multimaterial. Media products that are both spatial and temporal are multispatiotemporal. Audiovisual media products are multisensorial. Furthermore, many media are multimodal in several ways simultaneously.

Finally, most media products are multisemiotic to the extent that sign types typically work in collaboration. In an early article advocating the value of applying Peircean semiotics to the study of narratives, Robert Scholes suggested that “we cannot understand verbal narrative unless we are aware of the iconic and indexical dimensions of language” (1981: 205), and this is certainly true. Even though symbolic signs are clearly the most salient ones, verbal language does not work solely through symbolicity. In visual language, for instance, lineation, letter size, letter form, and empty spaces may create iconic meaning; in auditory language iconicity is often produced by certain sound qualities, intonations, rhythms, and pauses. By the same token, most media types signify through iconicity, indexicality, and symbolicity in combination, although they are typically dominated by certain kinds of sign functions. However, one can find instances of communication and narration characterized by such extreme multimodality that virtually all kinds of modality modes, both presemiotic and semiotic, are included.