1 Framing the Field

In the analysis of humanitarian discourse(s), I use ‘discourse’ in a Foucauldian sense as a system of representation of knowledge and meanings situated in a particular time and space (Foucault 1971, 1972, 1980). According to the philosopher, the concept of discourse is strictly interrelated with the production of truth and relations of power: “What I mean is this: in a society such as ours, but basically in any society, there are manifold relations of power which permeate, characterise and constitute the social body, and these relations of power cannot themselves be established, consolidated nor implemented without the production, accumulation, circulation and functioning of a discourse. There can be no possible exercise of power without a certain economy of discourses of truth which operates through and on the basis of this association. We are subjected to the production of truth through power and we cannot exercise power except through the production of truth” (Foucault 1980, 93).

Critical Discourse Analysis (CDA) is a natural starting point for framing the theoretical field of my methodological approach. Building on the Critical Linguistics scholarship that since the 1970s has been concerned with the relationship between language and power (Blommaert and Bulcaen 2000), CDA is “fundamentally concerned with analysing opaque as well as transparent structural relationships of dominance, discrimination, power and control as manifested in language. In other words, CDA aims to investigate critically social inequality as it is expressed, signalled, constituted, legitimized and so on by language use (or in discourse)” (Wodak and Meyer 2009, 2).

Three main approaches have dominated CDA research. The first, elaborated by Fairclough (1992), considers language as discursive practice. The second (Wodak 2001) has put the emphasis on the historical dimension, while van Dijk (2015) has focused on the social cognitive aspect of discourse. What the three approaches share is a critical perspective that differentiates CDA from classical discourse analysis. The locus of critique has to be found in the problematization of power relations and the impact of ideology on discourse patterns (Blommaert and Bulcaen 2000).

Since its origin, CDA has primarily looked at discourse through the lens of text, overlooking other modalities of expression and particularly the visual dimension (Wang 2014). Starting from the mid-1990s, a growing group of scholars (Slembrouck et al. 1995; Kress and Van Leeuwen 1996; Rose 2001) have stressed the importance of including visual material in analysis and started focusing on visual methodologies. The importance accorded to the visual dimension in academic research became crucial not only because of the massive presence of images of all kinds (such as photography, television, art, or advertisements) in our contemporary visual landscape, but also because of the acknowledgement of the pivotal role of visuality in the process of meaning production and exchange, particularly in Western societies (Rose 2001). The term ‘visuality’ refers to the “ways we see, how we are able, allowed, or made to see, and how we see this seeing and the unseeing therein” (Foster 1988, ix). Since the world can be seen in different ways and different ways of seeing have different social impacts, the analysis of images becomes crucial to grasp the effects of hegemonic visualities in reinforcing dynamics of power and social difference (Haraway 1991).

With the same interest in the question of visual representation, and a specific focus on International Political Theory, Roland Bleiker (2001) has contributed to the debate with a seminal article on the Aesthetic Turn. Starting from the observation of the increasingly wide diffusion of images representing international political events, and “their highly arbitrary nature” (Bleiker 2001, 509), the author emphasised the importance of locating politics in the differences between what is being represented and its representation. Following Jacques Derrida (1967), this approach sees representation as an interpretation of the truth. Therefore, a political event should never be investigated per se; rather, its representation should be at the centre of the analysis so as to unveil the “sets of true statements” beyond it (Bleiker 2001, 512). In fact, argues the author, although the human tendency is to trust the resemblance of what is represented with reality – part of the human “desire to order the world” (Bleiker 2001, 515) – we should acknowledge that representation is power.

Over the last decades, several authors have focused on visuality in International Politics (see among others Robinson 1999; Boltanski 1999; Shapiro 1999; Bleiker and Kay 2007; Campbell 2007). In particular, an emerging body of literature in IR and security studies has highlighted the relationship between visuality and security, focusing on different topics, including the political implications of representations (Campbell 2003); cartography (Shapiro 2007); the politics of security and surveillance (Andersen and Möller 2013); borders (Andersson 2012); political cartoons (Hansen 2011); science fiction (Weldes 2006); images of (post) 9/11 (Möller 2007; Weber 2006), and iconology (Heck and Schlag 2013).

Among these authors, Lene Hansen on the one hand and Heck and Schlag on the other have also offered some theoretical insights for the specific study of visuality and securitization that are particularly relevant for this book. Drawing on Buzan’s concept of securitization, Hansen has proposed an “intertextual framework” (Hansen 2011, 55) for the study of visual securitization which is able to investigate the ways in which visuality interrelates with other images (inter-visuality) and with words (intertextuality). According to the author, the intertextual framework is fundamental in order to explore the role of images in creating or participating in security discourse. She proposes four components for analysis: the image per se, the immediate intertext, the larger policy discourse, and the textual element. Hansen’s model is based on the specificity of images and the distinctive way they securitize an issue. It is important to consider not only the particular features of images (such as immediacy, ambiguity and circulability), but also the various strategies of security depiction and the different genres of visual representation (including cartoons and other drawings, photography, and video). There are three aspects in Hansen’s approach to visual securitization that I find particularly relevant for the purpose of this study. The first is the implication of the circulability of images, which makes it possible to envisage the existence of non-elite securitizing actors. The second is the emphasis placed on the diverse “epistemic-political claims” (Hansen 2011, 53) of the different visual genres, which helps in the problematization of photography in particular. The last, but not least, is her attention to inter-visuality, intertextuality and the wider policy discourse as fundamental elements to contextualize the different meanings of images in time and space.

Drawing upon Hansen’s seminal article, this study seeks to expand this method. Not only will it investigate images, but it will also include in the analysis photo captions, interviews with image producers, NGOs’ communication strategies, and relief organizations’ humanitarian and advocacy positions. There are two small but significant respects in which my study differs from Hansen’s framework. The first relates to the methodological tools utilized to carry out the analysis of the images, and specifically my selection of a combination of visual social semiotics and iconology methodologies. The second is a more theoretical point. In her account, images have a limited securitizing potential. As they are unable to speak for themselves, images always need an actor, able to speak, to activate their securitization potential. I intend, instead, to explore humanitarian NGOs’ photographic accounts of Syrian displaced people assuming that images have an intrinsic securitizing potential. In this sense, the approach proposed by Heck and Schlag (2013), which looks at securitization through iconology, seems very useful to complement my analysis.

By focusing on the performative power of visuality, Heck and Schlag (2013) draw on the iconological approach to theorize “the image as an iconic act understood as an act of showing and seeing” (Heck and Schlag 2013, 891). According to their method, images should be interpreted as images per se, with their social context in mind. The method proposed by Heck and Schlag (2013) to unveil securitization processes is based on three stages. They describe these as “the pre-iconic description”, “the iconographic analysis”, and the “iconological interpretation” of visual representation. The attention Heck and Schlag give to the potential of images to securitize through myth creation and narratives of justification is particularly apposite when investigating the book’s main question and unpacking humanitarian discourse(s) on the Syrian refugee crisis and NGOs’ role within global governance and global security.

2 A Semiotic Analysis of Images

The considerations and the different approaches outlined above are important to situate the analysis within a theoretical framework that considers that images and their study can unveil different humanitarian narratives, have securitization potential and drive the dynamics of constitution and dissemination of humanitarian discourses. However, the disentanglement of these mechanisms of knowledge production, and the power of the humanitarian discourse, require a certain level of operationalization. Semiotics is the perfect starting point to introduce the methodology selected for this study.

Semiotics is an area of research interested in the study of signs. With its origin in the ancient Greek world, semiotics is today applied in a wide range of different disciplines such as linguistics, religious studies, media and cartography (Nöth 2011). In semiotics, the sign (either an imagined or material sign) has to be understood in relation to both its referent object and the mental image or idea evoked (Peirce 1931, vol. 2). Its visual branch, visual semiotics, emerged in the 1960s with specific attention to visual language.

According to one of its founding fathers, Roland Barthes, there are two levels of meaning that need to be addressed in the semiotic analysis of images: denotation and connotation (Barthes 1972). The first step of analysis focuses on the identification of what van Leeuwen calls the “literal message” (Van Leeuwen 2001, 94) – the Barthesian “denotation” – and answers the question of what is depicted in the image. The second analytic stage is connotation and refers to the ideas, values and concepts that are represented in the image. This level of analysis aims at identifying the cultural interpretations linked to specific aspects of images. Van Leeuwen argues that “such connotative meanings – in Mythologies (1972) Barthes called them ‘myths’ – are first of all very broad and diffuse concepts which condense everything associated with the represented people, places or things into a single entity (…). Secondly, they are ideological meanings, serving to legitimate the status quo and the interests of those whose power is invested in it” (Van Leeuwen 2001, 97).

Despite visual semiotics’ crucial importance in answering questions about what is represented in the image and what the representation means, there are two aspects of Barthes’ perspective that limit the potential of the analysis (Van Leeuwen 2001). The first has to do with the non-problematization of the concepts of denotation and connotation. Barthes considers the first level of meaning as if what is represented corresponds to reality without the interference of any encoding mechanism, without ambiguity, and without the possibility of different interpretations. Something similar happens with regard to the concept of connotation. The problem with this term is that, although its exploration is able to shed light on the process of condensation of values associated with the subject in a single image (while at the same time legitimizing its representation), it considers the underlying meaning as universally understood by different people in different times and places. These shortcomings result in a narrow focus in visual semiotics on the visual text, the lexis of the image, and an overlooking of the context, the visual syntax. In a visual analysis which takes into account intertextuality and the importance of the wider discourse around the images, attention to the context is, on the contrary, crucial.

In this sense, social semiotics, with its emphasis on social dimensions, seems better able to grasp the social implications of visual material. In fact, this discipline is concerned with “the social dimensions of meaning in any media of communication, its production, interpretation and circulation, and its implications in social processes, as cause or effect” (Semiotics Encyclopedia Online 2018). With particular attention to the study of images in their social context, visual social semiotics adds two levels to the representational level of analysis that I have outlined above: the interactional and the compositional. The first refers to the way what is represented interacts with the viewer. The second is concerned with the way images are included in the wider visual syntax.

In an article devoted to social semiotics in visual communication, Carey Jewitt and Rumiko Oyama (2001) situate the main difference between the structuralist school of semiotics and social semiotics in the notion of “semiotic resources”. The authors define resources as “at once the products of cultural histories and the cognitive resources we use to create meaning in the production and interpretation of visual and other messages” (Jewitt and Oyama 2001, 36). Unlike the concept of code used in semiotics to connect the sign to the meaning, resources enable us to explore and make sense of the different ways signs can be interpreted and assigned different meanings. Semiotic resources (such as the point of view of an image or the depth of focus in photography) are determined both by the specific context in which they were created and by the cognitive resources used to interpret images and their meanings. For this reason, attention to semiotic resources implies attention to the ways the various ‘rules’ of interpretation came into being in a given cultural context, and to the possibility of change in them.

Before moving on to present visual social semiotics, a couple of considerations regarding semiotic resources are important so as to use them appropriately as methodological tools of visual analysis. First, semiotic resources do not create meaning per se, but ‘meaning potential’: they make it “possible to describe the kinds of symbolic relations between image producers/viewers and the people, place and things in the images” (Jewitt and Oyama 2001, 135). These meaning potentials are activated by the producers and the viewers of the images and do not, of course, convey a fixed meaning. However, they refer to a limited spectrum of meanings. Second, it is important to keep in mind that the symbolic relations are indeed symbolic and very different from ‘real’ relations, in the sense that their representation can purposely subvert real relations.

3 Visual Social Semiotics

Visual social semiotics is based on Michael Halliday’s conceptualization of the three metafunctions of semiotic work: ideational, inter-personal and textual (Jewitt and Oyama 2001). The first has to do with the creation of representation, the second with the relation between the producer and the receiver of the text, and the last with how these two functions work within their specific communication genre. Kress and Van Leeuwen (1996) have adapted Halliday’s framework to the study of images and classified the three tasks of visual semiotics as representational, interactive and compositional. It is worth presenting these three levels of analysis in detail at this point because they will constitute the backbone of my analytical grid.

4 The Representational Meaning

The representational meta-function looks at the participants of the image, i.e. the people, objects and places represented and, most importantly, at the visual syntactic patterns that put the participants of the image in relation to each other. The structural dimension is important because it creates “meaningful propositions by means of visual syntax” (Kress and Van Leeuwen 1996, 47). The authors identify two kinds of representation: the narrative and the conceptual. The choice between the two patterns is significant, because the choice to depict something in a narrative or conceptual way offers a “key to understanding the discourses which mediate their representation” (Van Leeuwen and Jewitt 2001, 141). In fact, visual structures do not simply mirror the structures of ‘reality’. On the contrary, they create images of reality that are linked with the interests of the social institutions in which the images are created, disseminated, and used. “They are ideological. Visual structures are never merely formal: they have a deeply important semantic dimension” (Kress and Van Leeuwen 1996, 47).

4.1 Narrative Structure

The narrative structure refers to the way the different elements of the image relate to one another. The elements depicted are the represented participants, whether human or non-human, and are distinguished from the interactive participants, namely the producer and the viewer of the images. The relation among the represented participants can be of three types: transactional (characterized by the presence of a vector); locative (the contraposition between foreground and background given by the overlapping of shapes, the color saturation or the depth of focus), and instrumental (represented through the gesture of holding something). The main feature of narrative representation is the vector, a line that connects the various participants of the image. It can be represented by the position of a body, a hand pointing toward something, objects connecting represented participants (such as a weapon, camera, or toy) or eyelines.

Because narrative structures describe an action in its unfolding, the function of the vector in guiding the viewer through the narrative pattern is crucial and distinguishes narrative representations from conceptual ones, which depict participants in their abstract meaning, in their essence. In photography, there are two kinds of represented processes: action and reaction. In action processes, the participants can be actors (from whom the vector, the action, originates), or goals (to whom the vector, the action, is directed). Whereas actors are always present in a narrative pattern, goals can be absent. According to the presence or absence of a goal, we will talk, respectively, of transactive or non-transactive action. When the vector is represented by an eyeline, the process is one of reaction. In this case the represented participants constitute the “reacters” and the object of their gaze “the phenomenon”.

Another important aspect of the narrative structure is the different ways in which participants can be put in relationship to each other in the image. Visual social semiotics identifies three types: conjoint (when participants are put in connection by a vector); compounded (when they are combined together but retain distinct identities), and fused (when participants are fused together and their separate identities disappear). As Kress and van Leeuwen have pointed out: “each successive step further obscures the act of predication, the explicit act of bringing the two participants together, until the structure is no longer ‘analytical’, no longer analysed or analysable. We make the point at some length because of the (ideological) significance of this semiotic resource in configuring the represented world” (Kress and Van Leeuwen 1996, 53).

The analysis of the narrative structure includes the description of the settings, appearance of the represented participants, the props and the symbols present in the image.

4.2 Conceptual Structure

The conceptual structure represents the participants according to their general characteristics: “in terms of their more generalized and more or less stable and timeless essence, in terms of class, or structure or meaning” (Kress and Van Leeuwen 1996, 57). The authors identify three main kinds of conceptual representation: the classification, the analytical and the symbolic processes. The classification process refers to the representation of participants in a particular form of relationship to each other: that of a taxonomy (which can be overt or covert according to the degree of explicitness of the overarching category), a flowchart or a network. The analytical process represents the relationship between the various parts and their whole structure: the parts are called possessive attributes and the whole the carrier. The analytical process is defined by the absence of a vector, classification or symbolic process, and has a wide range of different structures such as temporal, topological or topographical, unstructured, exhaustive and inclusive. Finally, there are symbolic processes: the structures that represent the meaning of the participants.

These structures can be attributive (when the meaning of one participant, the ‘carrier’, is established through the meaning of the symbolic attributes) or suggestive (when the ‘carrier’ represents the meaning in itself). Symbolic attributes are identified through their marked salience, their out-of-place position, participants’ gestures pointing at them, or their conventional social value. In suggestive symbolic structures, instead, the participant represents the meaning and differs from the analytical representation because of the de-emphasis of details and the use of modalities (see further on in this chapter) that maximize its generic quality and its timeless essence. Jewitt and Oyama (2001) have pointed out how this part of Kress and van Leeuwen’s analysis draws from iconography and, as we will see, iconography can complement visual social semiotics and be particularly helpful in identifying symbolic attributes and other visual motifs.

5 The Interactive Meaning

The interactive meaning concerns the relationship between the producer of the image and the viewer. Although their interaction can be direct and immediate (such as in the case of people taking pictures of each other as souvenirs), Kress and van Leeuwen note how the context of production and the context of reception are often disjoined. Disjunction aside, however, the producer and the viewer still share the image and “a knowledge of the communicative resources that allow its articulation and understanding, a knowledge of the way social interactions and social relations can be encoded in images” (Kress and Van Leeuwen 1996, 115). In visual communication, not only social relations but also the relations between the producer and the viewer are represented rather than enacted. This representation is created through different types of resources.

5.1 Contact

Some images establish a clear contact with the viewer. This is done through a vector (eyeline or gesture) connecting the represented participants to the viewer. These kinds of images perform two key tasks: they directly address the viewer and they constitute an “image act” (Kress and Van Leeuwen 1996, 117). Kress and van Leeuwen base the notion of “image act” on Halliday’s concept of “speech functions”, which identifies four core speech acts and two possible responses (expected and discretionary) for each: offer of information (social response: agreement or contradiction); offer of goods and services (social response: acceptance or rejection); demand of information (social response: answer or not answer), and demand of goods and services (social response: respond to the request or not respond). When the represented participant looks directly at the viewer, the producer is using the image to ask something of the viewer: an action, the establishment of a relationship, or the creation of an emotional bond. What kind of reaction the images invoke depends on the details of the kind of look (perhaps probing, friendly or submissive) or gesture (perhaps inviting, defensive, or vexing).

On the contrary, when there is no eye contact, the images put the viewer in a voyeuristic position as an unseen spectator. Following Halliday’s classification, these images, which do not directly address the viewer, are called “offer images” in contraposition to the images discussed above that belong to the “demand” category. In this case, the represented participants are offered to the viewer as “items of information, objects of contemplation, impersonally, as though they were specimens in a display case” (Kress and Van Leeuwen 1996, 119). As the authors make clear, beyond these core types of images, a variety of sub-types and variations are possible along the contact resource spectrum. The function performed by the contact resource is, therefore, extremely important inasmuch as it indicates a specific kind of relationship between the viewer and the represented participant, suggesting with whom ‘we’, the viewers, should relate and whom ‘we’ should just observe, and consequently who is the ‘other’.

5.2 Distance

Distance is another way in which visual material depicts the relation between the viewer and the represented participants. Like contact, distance refers to a continuum, in this case of the size of frame, which runs from what is technically called a close-up to a very long shot. Drawing on the work of Edward Hall, Kress and van Leeuwen point out how, at the visual level, social distance is represented through the size of frame. A close shot corresponds to a close (or even intimate) social relation, whereas a very long shot corresponds to social distance. Visually this is represented along a continuum that goes from the depiction of only the head of a person to the portrayal of the full body (or bodies), including some headroom. In other words, the shorter the distance, the stronger the connection, the social intimacy, with the represented participants, and vice versa. In this sense, the authors’ quotation of a painter, Grosser, is significant. The passage describes how the viewer is forced to observe the ‘soul’ of the person portrayed at a distance of less than 90 cm (about 3 feet), while “at a distance of more than 13 feet (4m), people are seen ‘as having little connection with ourselves’, and hence ‘the painter can look at his model as if he were a tree in a landscape or an apple in a still life’” (Kress and Van Leeuwen 1996, 125).

As in the case of contact, distance is a powerful dimension of the interactive meaning. It creates, through a certain kind of representation, an imaginary relationship between the viewer and the represented participant, helping to define the people to whom we are socially close and those from whom we are socially distant, and who are thus strangers to us. Although, as underlined above, representation is always about imagined relationships, not enacted or real ones, it is still extremely important to acknowledge its potential in creating a more or less strong social connection with the represented participants. As Kress and van Leeuwen have noted, besides social distance, other important meanings (such as respect, objectivity, or authority) can be suggested by the conventional use of distance patterns: medium close-up with captions for the representation of experts speaking about an issue, close-up for people telling their stories, and diagrams for objective information.

5.3 Perspective

Perspective is another important dimension of the interactive meaning highlighted by Kress and van Leeuwen. This technique was first introduced in pictorial art during the Renaissance and used to represent depth and space on a two-dimensional surface. It provided the illusion of a stronger connection between reality and its representation and, at the same time, naturalized a point of view that was, on the contrary, socially determined. Connected with perspective and the concept of vanishing points (the points where parallel lines seem to converge in a perspective image) is the notion of point of view, an important semiotic resource. The point of view indicates the position of the image producer toward the represented participants and the relationship among them thereby represented. It may have different angles, each of which represents power, involvement or detachment.

As with many of the other semiotic resources analysed so far, the angle of the image should be understood as a continuum covering the whole range of possible points of view. Schematically, at the horizontal level, the image can have a frontal or an oblique angle. At the vertical level, a high angle represents a relationship of power of the viewer over the represented participant, a low angle signifies the opposite, and an eye-level angle a relationship of equality. Obviously, a wider range of nuanced meanings can be produced through all the intermediate points of view between these extremes.
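
For readers who wish to operationalize these three interactive resources in a coding sheet, the following is a minimal sketch in Python, offered only as an illustration and not as the analytical grid used in this study. All class and field names are hypothetical, and the continua described above are deliberately collapsed into a few discrete values, which is a simplifying assumption. The readings attached to each value restate the meaning potentials summarized in this section; they are potentials, not fixed meanings.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Contact(Enum):
    """Hypothetical coding of the contact resource (demand vs. offer images)."""
    DEMAND = "demand"  # gaze or gesture addresses the viewer directly
    OFFER = "offer"    # no eye contact with the viewer


class Distance(Enum):
    """Hypothetical coding of size of frame; the continuum is collapsed (assumption)."""
    CLOSE = "close shot"
    INTERMEDIATE = "intermediate shot"
    VERY_LONG = "very long shot"


class VerticalAngle(Enum):
    """Hypothetical coding of the vertical point of view."""
    HIGH = "high angle"
    EYE_LEVEL = "eye-level angle"
    LOW = "low angle"


# Meaning potentials as summarized in the text above (not fixed meanings).
CONTACT_READING = {
    Contact.DEMAND: "image asks something of the viewer",
    Contact.OFFER: "viewer is positioned as an unseen spectator",
}
DISTANCE_READING = {
    Distance.CLOSE: "close or intimate social relation",
    Distance.INTERMEDIATE: "intermediate point on the social distance continuum",
    Distance.VERY_LONG: "social distance",
}
ANGLE_READING = {
    VerticalAngle.HIGH: "viewer has power over the represented participant",
    VerticalAngle.EYE_LEVEL: "relationship of equality",
    VerticalAngle.LOW: "represented participant has power over the viewer",
}


@dataclass(frozen=True)
class InteractiveCoding:
    """One row of a hypothetical coding sheet for a single image."""
    contact: Contact
    distance: Distance
    vertical_angle: VerticalAngle

    def meaning_potential(self) -> List[str]:
        """Return the three readings in the order contact, distance, angle."""
        return [
            CONTACT_READING[self.contact],
            DISTANCE_READING[self.distance],
            ANGLE_READING[self.vertical_angle],
        ]


# Example: a long shot taken from above, with no eye contact.
example = InteractiveCoding(Contact.OFFER, Distance.VERY_LONG, VerticalAngle.HIGH)
for reading in example.meaning_potential():
    print(reading)
```

Such a record only registers which resources are present in an image; interpreting them still requires attention to the context of production and reception discussed above.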

6 The Compositional Meaning

The compositional meaning refers to the way the representational and interactive meanings relate to each other and “the way they are integrated into a meaningful whole” (Kress and Van Leeuwen 1996, 176). Compositional meaning acquires, if possible, even more value in multimodal texts (texts that combine different semiotic modes such as written text and images), which comprise most of the data analysed for this book. Kress and van Leeuwen consider three key elements of composition in their 1996 book and treat modality in a separate chapter. However, later on, Jewitt and Oyama (2001) include modality in this layer of analysis. Since all images considered in this study share the same modality (they are all photographic images), this aspect will require more attention. Although briefly introduced here, it will be the subject of a separate section (see Sect. 6.3 below).

6.1 Position and Information Value

The first element of composition is the position and the different information values Contained within the elements through their position in relation to the other elements. This dimension has to do with the represented participants’ respective positions within the image, the respective position of two or more images, or the position that an image has with respect to text in a page (can be a newspaper page, or in the case of this study, a website page). The information value of different images is encoded into their left-right, top-bottom or center-margin positions and the three compositions can be found combined together. When represented participants or pictures are composed through a horizontal axis, the one positioned on the left will refer to what is ‘given’ – something unproblematic, agreed upon, self-evident – while the element of the right will refer to what is ‘new’, what is or should be at the center of attention, what is not known or agreed yet. Kress and van Leeuwen point out that although the statement made by this specific composition may be contested, or even denied, by the viewer, its ideological value lies in presenting the information in a particular way, conferring it a given new meaning. The second coding orientation confers different meaning according to the position at the top or the bottom of the composition. It usually presents less visual connection, if not even contrast, among the two elements. The image on the top represent the “ideal” – the promised situation, what might be – while the element on the bottom refers to the “real” – the situation how it is, empirics, sometimes even “directions for action” (Kress and Van Leeuwen 1996, 186). Finally, regarding the center-margin composition, putting an element at the center emphasises its core role and its predominance toward the element positioned around it. 
Of course, as the authors make very clear, those “coding orientations” are culturally determined and vary according to the directionalities of different cultures. Thus, for example, in languages written right to left, like Arabic, the reading of images will follow the right-to-left orientation.

6.2 Salience and Framing

Another important element of the composition is salience, that is, the relative importance of the elements of the image. The more salient elements are those that, by means of technical expedients, draw the attention of the viewer. Salience, as Kress and van Leeuwen explain, “is not objectively measurable, but results from complex interaction, a complex trading-off relationship between a number of factors: size, sharpness of focus, tonal contrast (e.g. high contrast black and white images), colour contrasts (for instance, the contrast between strongly saturated and ‘soft’ colours, or the contrast between red and blue), placement in the visual field (elements not only become ‘heavier’ as they are moved towards the top, but also appear ‘heavier’ the further they are moved towards the left, due to an asymmetry in the visual field), perspective (foreground objects are more salient than background objects, and elements that overlap other elements are more salient than the elements they overlap), and also quite specific cultural factors, such as the appearance of a human figure or a potent cultural symbol” (Kress and Van Leeuwen 1996, 202).

The third element of the composition – framing – has to do with the degree to which the represented participants are connected to, disjoined or separated from each other. A specific framing may express connectedness, discontinuity, or anything in between, and is obtained through the use of colors, contrasts, white spaces, and vectors.

6.3 Modality

The last dimension of the compositional meaning is modality. This is defined and measured as the credibility or truth value of the image. It does not imply the actual correspondence between representation and reality. Rather, it shows whether a visual element is represented as if it were true or not. The different levels of modality are obtained through so-called modality markers, or visual clues, that indicate how much we should trust the image. An extremely important point raised by social semiotics is the social construction of such modality markers. In other words, scholars have underlined how these visual clues “have arisen out of the interest of social groups who interact within the structures of power that define social life, and also interact across the systems produced by various groups within a society” (Kress and Van Leeuwen 1996, 155). The fact that what a social group considers real is culturally determined does not preclude the idea of realisms per se that will be, in turn, culturally determined. In this sense, the concept of realism has nothing to do with a factual correspondence between what is represented and the world. Rather, it is connected with the technological aspects of image production and hegemonic visual conventions. In our society, for example, the authors point out how photorealism is the “dominant standard” (Kress and Van Leeuwen 1996, 158) of realism. In photography, aspects such as color saturation, depth of field and amount of detail contribute to the low or high modality of an image.

7 Iconography

Originally elaborated in the sixteenth century for the study of art, iconography was later developed and systematized in a three-level methodology for visual analysis by Erwin Panofsky (Müller 2011). The identification of visual motifs and interpretation of the meaning of visual products take place through a three-step process: pre-iconographical description (or representational meaning, according to the terminology used by van Leeuwen (2001)); iconographical analysis (or iconographical symbolism); and iconological interpretation (or iconological symbolism). After a ‘neutral’ description of the represented elements, the second step is meant to identify typologies of images that share the same features. This categorization of images allows the researcher to recognize variances and resemblances that will – in the final step – be interpreted according to the wider social context. For the purpose of this study, the first and the second steps of analysis are particularly relevant in identifying visual motifs in the humanitarian discourse on the Syrian emergency and related migration crisis.

The first level, as with the denotation of visual semiotics and the representational meaning of visual social semiotics, refers to the description of the elements of the image. Following Hermeren, van Leeuwen lists five ways to identify what is depicted: title or caption of the image; personal experience; background research; intertextuality; and verbal description. At the second level of analysis the represented participants – to continue with the terminology used by Kress and van Leeuwen (1996) – do not only denote the depicted individual/object, “but also the ideas or concepts attached to it” (Van Leeuwen 2001, 100). The attribute of iconicity refers to the resemblance of the image to the object that the image represents. In order to fully grasp the iconographical meaning of images it can be useful to keep in mind the distinction made by C.S. Peirce, one of the founders of semiotics, between icon, index and symbol (Peirce 1991). The first term refers to the similarity between the iconic sign and the object represented. An index is a sign clearly identifying this signified object. Symbols are images that conventionally (and therefore culturally specifically) establish a relationship between the representation and the object.

Although Panofsky initially elaborated the iconographical method in relation to art history, he recognized that the same pictorial conventions that connect concepts to artistic themes work in contemporary art. In the iconographical symbolism “there arose, identifiable by standardised appearance behaviour and attributes, the well-remembered types of the Vamp and the Straight Girl (perhaps the most convincing modern equivalents of the Medieval personifications of the Vices and Virtues), the Family Man and the Villain, the latter marked by a black moustache and a walking stick” (Panofsky quoted in Van Leeuwen 2001, 101).

Among the diverse images produced by humanitarian NGOs, iconography will allow the identification of a certain set of visual typifications. The term, used by Kurasawa in an article on the iconography of humanitarian visuality, refers to a semiotic structure of images consisting of a relatively limited “system of formal relations between situational and compositional symbols serving to establish the roles of various actors (victims, perpetrators, aid workers, etc.) who are part of the visual composition of a scene of emergency or mass suffering” (Kurasawa 2015, 8). According to the author, the range of representations that are legitimate in a particular cultural, historical and socio-political context is limited, and its reiteration produces an “iconographic repertoire” of humanitarian images. The importance of visual conventions and repertoires lies in their being representative of a culturally and socio-historically situated system of thought, a way of representing the world that is shared by the practitioners who produce the visual material and works “as tacit referential or indexical social knowledge” (Kurasawa 2015, 20). The repertoires play a pivotal role in the construction of the public discourse, setting the boundaries of how the people, situations and relations represented can be thought of and interpreted. Finally, the iconographical approach is even more interesting if we connect it with the argument of Heck and Schlag on the performativity of the image in constituting an iconic act and how this understanding “directs our attention to the securitizing power of visual (re)presentations” (Heck and Schlag 2013, 896).

In a contribution on the political iconographic approach in the volume edited by Eric Margolis and Luc Pauwels (2011), M.G. Müller suggests using previous literature on the research topic to identify typologies of visual motifs. For its specific attention to this aspect, iconography will be extremely useful to start identifying recurrent photographic patterns and attempt a first classification accordingly. Because of the massive diffusion of humanitarian images and their “iconic power” (Alexander et al. 2012; Kurasawa 2015) in the contemporary visual landscape, the literature is quite rich and provides numerous studies focusing on, inter alia, the iconography of suffering (Boltanski 1999; Chouliaraki 2013; Fehrenbach and Rodogno 2015); passivity (Nissinen 2015); personification, massification, rescue and care (Kurasawa 2015); piety (Shapiro 1988); emergency (Musarò 2017); humanitarian crisis (Campbell 2007); victimization (Friese 2017), or its opposite: resilient victim, and positive imaginary (Nissinen 2015), to cite but a few. In the field of humanitarian visual communication, iconography is also extremely useful to identify and describe the different iconological styles used by relief organizations. In the contemporary traditional and social media landscape, NGOs compete at the visual level in order to distinguish themselves, their brand and their way of representing humanitarian issues (Kurasawa 2015). The identification of different iconographical approaches will therefore be extremely helpful in identifying the different organizational humanitarian narratives.

8 Photography, Power and ‘Claims of Truth’

In discussing the visual approach of this study, it is also important to briefly consider the relevance of the specific features of different visual genres and particularly the epistemic-political claims of photography, the visual genre that is the object of the analysis. Probably nobody has expressed the importance of the genre of communication more effectively than the sociologist Marshall McLuhan when he affirmed that “the medium is the message” (McLuhan and Fiore 1967). In the world of visual art, most people consider photography as a very specific medium, often opposed to other popular visual genres such as paintings or movies. Victor Bürgin (1982) has noted how the public usually receives pictorial art and films as objects that need to be experienced in a critical way, whereas photography presents itself as part of the environment. Similarly, Susan Sontag (1973) has shown how photography is commonly perceived as a transparent method showing a piece of reality, while writing and paintings are instead associated with interpretation.

The importance of the visual genre in creating the message is even more striking if we think in terms of what Hansen (2011) calls “epistemic-political constitution”. Two aspects of this concept are particularly relevant for this study. The first has to do with the claims that different genres make regarding their relationship with reality. In this sense, documentaries and photography are the two visual genres that derive their authority from their epistemic statement of truthfulness. The second has to do with the degree to which different visual genres are expected to offer explicit political claims. It is not about an ontological political nature, but, rather, the expectation of the audience. In this respect, Hansen cites photojournalism and cartoons as clearly political genres. However, the political claims of photography are probably more ambiguous.

It is important to discuss the complicated relation that this particular visual genre has with power and ideology. One of the most important aspects of photography is its relationship with reality. If all other forms of visual representation (such as pictorial art, sculpture or movies) represent, each in its own peculiar way, an interpretation of the real world, photography is often thought of as the most objective way to catch reality (Sontag 1973). Sontag clearly shows how photographic images have come to represent a miniature of reality, the testimony of a hidden truth, an instrument of knowledge, or a proof of reality (as shown by the fact that they have to be attached to some documents to make them valid). Similarly, Roland Barthes in Camera Lucida affirms that photography’s “power of authentication exceeds the power of representation” (Barthes 1981, 89). According to Claude Levi-Strauss, this is particularly true for news photographs that “function as indexical illustration of the stories that accompany” (Levi-Strauss quoted in Campbell 2007, 379).

Despite the ineluctability of aesthetics in any representation, pictures are commonly perceived as able to achieve an objective correspondence between the image and the referent object (Campbell 2007). It is exactly this assumption of photographs as ‘unmediated simulacrum’ that confers on them so much authority in the field of “knowledge and truth” (Shapiro 1988, 124). Similarly, Annette Kuhn has pointed out how photographs imply authenticity and truth especially when what is represented through the lens seems to be a credible surrogate of what we usually see. She notes that “the truth/authenticity potential of photography is tied in with the idea that seeing is believing. Photography draws on an ideology of the visible as evidence” (Kuhn 1985, 27).

This assumption of transparency confers authority and power on photography. Since its inception, there have been two main, and yet opposite, perspectives regarding the relationship between photography and power. On the one hand, photography has been praised for its ability to unveil and clarify. On the other hand – and this is the perspective that this study uses – it has been criticized for “its tendency to reproduce and reinforce the already-in-place ideological discourse vindicating entrenched systems of power and authority” (Shapiro 1988, 126). As Sontag (1973) contends in her seminal On Photography, to take a picture of something is not only about appropriating what is represented, but also about locating the image producer in a certain position toward the subject/object photographed, a position of knowledge and therefore power. Indeed, as she points out, photography has an inherently patronizing attitude toward reality resulting from its ambition to grasp the outside world and capture it through its lenses. The power relationship between the photographer and the subject can be looked at through different lenses: socio-economic class differences (Sontag 1973), neo-colonial approaches (Campbell 2007), or a gender perspective (Perna 2013). What is important here is that all these accounts share an acknowledgement that the photographic gaze presents itself as an objective eye, “as if its perspective is universal” (ibid., 42).

For the same reasons, photography is also strictly linked with ideology. As Bürgin has argued: “The structure of representation – point of view and frame – is intimately implicated in the reproduction of ideology (the “frame of mind” of our “point-of-view”). More than any other textual system, the photograph presents itself as an offer you can’t refuse” (Bürgin 1982, 146). According to Shapiro, it is exactly the assumed truthfulness of photos that makes photography as a genre quintessentially ideological, obscuring the fact that “the real is forged over a period of time by the social, administrative, political, and other processes through which various interpretative practices become canonical, customary, and so thoroughly entangled with the very act of viewing they cease to be recognized as practices” (Shapiro 1988, 185). Against this background, critical security studies have explored the difficulties linked to the seemingly opposite potentialities of visuality to repress and emancipate (Andersen et al. 2014). Scholars have shown how different visual genres have been associated either with the reproduction of domination and repression – such as in the case of mainstream film – or with some form of resistance, as in the case of artistic photography. They warned that, at the theoretical level, capacities for emancipation, repression or critique cannot be attributed a priori to a visual genre. Agreeing with the polysemic nature of images already underlined by Barthes (1981), they concluded that questions around their liberation/oppression potentialities can only be answered at the empirical level. The advantage of this approach is to remain open-minded and admit that photography could sometimes contribute to the reproduction of hegemonic discourses, while at other times problematizing accepted analysis (Shapiro 1988).

The point is indeed to unveil how photographic enactment can reproduce power relations (Campbell 2007). Following Peirce’s conceptualization of icon, index and symbol mentioned above, Campbell (2007) therefore suggests abandoning the understanding of documentary photographs as icons and indexes, so as to fully acknowledge them as symbols. Or, following Jean Baudrillard (1988), we could consider them as simulacra, that is to say the image’s simulations of reality. Rather than a flattened and miniaturized version of reality, photographic images tell us something about the image producer, who unveils him/herself “through the camera’s cropping of reality” (Sontag 1973, 95).

Drawing on this literature, the visual analysis of images examined in this book is inspired by a conceptualization of photography as a visual code, what Sontag defines as a “grammar and, even more importantly, an ethics of seeing” (Sontag 1973, 1). Photography is thus understood as “one signifying system among others in society which produces the ideological subject in the same movement in which they communicate their ostensible content” (Bürgin 1982, 153). To conclude, the point is therefore not to unveil the unfaithfulness of reality’s representation, but, rather, to focus on the ways in which people and situations are enacted through their photographic depictions.

9 Polysemy and the Possibility of Different Readings

Before concluding the discussion of the visual approach, it is important to address a crucial feature of visual images: their polysemic value. Barthes has reminded us that “all images are polysemous; they imply, underlying their signifiers, a ‘floating chain’ of signifieds, the reader able to choose some and ignore others” (Barthes 1977a, 38–39). There is little doubt that when looking at the same image two people could be struck by different aspects of the representation. In Barthes’s example of a French advertisement for Panzani pasta, the level of denotation is clear and its reading as a quality food item in a shopping bag is quite straightforward to its audience. But at the level of connotation, the same image lends itself to multiple meanings: a sign of freshness (just returned from the market); domestic preparation of food; Italianicity; the “idea of a total culinary service”, since everything needed for a meal seems to be there; and the evocation of pictorial still life (Barthes 1977b, 270–71). Polysemy should therefore be taken seriously into account in any visual analysis.

However, the various readings are to some extent circumscribed, as “in every society various techniques are developed intended to fix the floating chain of signifieds in such a way as to counter the terror of uncertain signs” (Barthes 1977a, 39). In the interpretation of images, therefore, the meanings circulating within a situated cultural milieu assume particular importance in limiting the various reading possibilities. In this sense, as Mitchell has maintained, “whatever the pictorial turn is, then, it should be clear that it is not a return to naïve mimesis, copy or correspondence theories of representation, or a renewed metaphysics of pictorial “presence”; it is rather a postlinguistic, postsemiotic rediscovery of the picture as a complex interplay between visuality, apparatus, institutions, discourse, bodies, and figurality” (Mitchell 1995, 4–5).

In choosing to interpret images through the specific methodology of visual social semiotics, I am also aware that this is just one of the ways through which an image can be read and its meaning unpacked. Far from implying that this is the only or the right way to analyse a visual artefact, I am suggesting that this particular perspective is worthy of exploration for two main reasons. First, beyond its attention to the wider cultural and social context, visual social semiotics’ interpretation of an image is based on the complex agglomeration of the multiple semiotic resources at play and the interplay of the different layers of meaning (i.e., representational, interactive, compositional). In each picture, this infinite possibility of combination works to reinforce or, on the contrary, weaken a particular reading. Consequently, an image is analysed in its entirety and, in each case, the various layers of meaning and semiotic resources considered together can help point towards one specific reading. A reading, it goes without saying, that is situated in a geographically and historically specific cultural milieu. This is precisely the second and most important aspect. The analysis is based on a specific situatedness that is linked with my positionality as part of the Western contemporary audience, which is exactly the audience to which the images examined in this study are directed. Since any reading is situated in a particular cultural milieu, as Barthes has noticed, cultural and social expectations are brought to the image. Being part of the same cultural milieu, which is the primary audience of the image, is therefore crucial to unpacking the various meanings that are possible in that specific culturally, geographically and historically situated public. Moreover, the visual analysis has been complemented by multi-sited fieldwork, direct engagement with and investigation of the images’ producers – the transnational humanitarian NGOs.
In so doing, images have been analysed keeping in mind a much wider set of contextual information. Even acknowledging that this interpretation is just one among multiple possible ones, it is important because it unveils one of the meanings that a picture assumes in a certain cultural milieu at a given moment in time and space. Highlighting that specific interpretation does not mean providing a unilateral and deterministic assignment of meaning, but, rather, unpacking a certain reading and stimulating discussion of the relevance that this reading has for a specific public.