Introduction

Everyday experience attests that by simply looking at a stranger, human perceivers can gather a wealth of information about the individual’s facial morphology and expression, direction of eye gaze, body shape and posture, way and direction of movement, and/or style of attire. On the basis of these visual features, rapid and far-reaching social impressions concerning a person’s aims and intentions, emotional states, and/or personality traits are habitually formed (see de Gelder 2006; Frischen et al. 2007; Hall and Bernieri 2001; Henderson et al. 2016; Johnson et al. 2014; Johnson and Shiffrar 2013; Macrae and Quadflieg 2010; Uleman and Saribay 2012). Though these impressions may not always be accurate, they are often consensual (Bonnefon et al. 2015; Kenny and Albright 1987; Todorov et al. 2015). In other words, different perceivers tend to agree in their spontaneous social impressions of strangers (Kenny et al. 1992).

Due to this intriguing consensus at zero acquaintance, even erroneous impressions can be highly influential and, ultimately, affect people’s dating opportunities (Shepperd and Strathman 1989), employment and accommodation prospects (Cavico et al. 2012; Houkamau and Sibley 2015), electoral successes (Todorov et al. 2005; Weaver 2012), and/or criminal convictions (Eberhardt et al. 2006). But how are rapid social impressions formed, and what consequences do they have, when perceivers witness groups of strangers (cf. Alt et al. 2017; Haberman and Whitney 2007)? Are strangers seen in dyads, triads, or even larger social gatherings evaluated differently from strangers seen by themselves?

In support of this possibility, accumulating evidence suggests that perceivers of encounters involving multiple individuals tend to process them as unified perceptual and social events (e.g., Ding et al. 2017; Hafri et al. 2017; Papeo et al. 2017; Quadflieg et al. 2015; Walbrin et al. 2018). Accordingly, perceivers’ impressions of these events entail inferences not only about each individual’s aims, states, or traits (e.g., Fiedler and Schenk 2001; Hess et al. 2016), but also about the relations and obligations between individuals (e.g., Burgoon et al. 1984; Mason et al. 2014). In other words, rapid impressions of interpersonal encounters frequently go beyond individual-based impressions by concerning the type and quality of people’s relationships (Bernieri et al. 1994; Quadflieg and Penton-Voak 2017).

Interestingly, so-called encounter-based impressions have been the focus of empirical investigations for almost 50 years. As early as 1971, for example, the zoologist Desmond Morris set out to catalog common nonverbal actions between individuals that could convey the nature of their relationship to others (Morris 1971). Within the same year, the sociologist Erving Goffman introduced the notion of ‘withness cues’ to describe nonverbal behavior between romantic partners that would signal their status as being ‘with’ one another to attentive social perceivers (Goffman 1971). In 1989, finally, the psychologists Mark Costanzo and Dane Archer developed the first person perception task that probed perceivers’ impressions of other people’s social relations in a standardized manner. Based on these and similar studies, the perception, interpretation, and evaluation of other people’s encounters have become an active topic of inquiry that continues to attract scientific attention (e.g., Ding et al. 2017; Hafri et al. 2018; Papeo et al. 2017; Walbrin et al. 2018).

Alas, to the best of our knowledge, this well-established line of research has not yet prompted much systematic debate, reflection, or theorizing. Research on encounter-based impressions has rarely been summarized, for instance, in the form of review articles or book chapters, even though both types of treatise are frequently used to discuss research on individual-based impressions (e.g., Macrae and Quadflieg 2010; Johnson and Shiffrar 2013; Yovel and O’Toole 2016; Zebrowitz et al. 2013). Relatedly, studies on encounter-based impressions have seldom been linked to dedicated theoretical frameworks, even though studies on individual-based impressions regularly rely on them (e.g., Biesanz and Human 2010; Gifford 1994; Walker and Vetter 2016). Given these conceptual disparities in the impression formation literature, it may not be surprising to learn that the causes and consequences of encounter-based impressions still remain poorly understood.

Thus, in order to stimulate a more structured way of approaching the topic, the current paper proposes a new impression formation model. This model, termed the Integrative Model of Relational Impression Formation (IMRIF), deliberately focuses on the experience of individuals who witness other people’s encounters from a third-person perspective. Although it does not account for impressions influenced by eavesdropping, hearsay, or direct communication with others (but see Bacha-Trams et al. 2017; Grahe and Bernieri 1999), the model identifies critical elements that shape the nature and course of encounter-based impressions as prompted by watching other people. In doing so, the model draws heavily on prior impression formation theories that have examined the link between basic processes of person perception and complex person inferences as briefly outlined below.

Traditional Impression Formation Theories

Systematic theorizing about the formation of rapid social impressions based on the visual analysis of other people is often traced back to the cognitive psychologist Brunswik (1956). Brunswik originally introduced the so-called Lens Model (LM) of environmental perception to explain how humans can accurately assess their physical environment. Social psychologists soon revised his model in order to study people’s assessments of their social environment (e.g., Heider 1958; Scherer 1982). According to this revised model, it is now widely assumed that accurate social impressions can arise whenever perceivers witness visual features of others that are truthful indicators of their aims, states, or traits. A stranger’s true level of social dominance, for instance, can be successfully inferred based on her or his style of gesturing (Gifford 1994).

Subsequent impression formation models, however, have often gone beyond the question of accuracy in social impressions. The Weighted-Average Model of Interpersonal Perception (WAM; Kenny 1991, 2004), for example, has looked at impression consensus, trying to understand why different perceivers may or may not agree in their impressions of others. In addition, the Ecological Theory of Social Perception (ETSP; McArthur and Baron 1983) has explored impression functionality, postulating that many impressions provide affordances on how to act towards others (e.g., impressions of cuteness tend to invite nurturing behavior; Zebrowitz 1997). Together, these complementary frameworks have highlighted that rapid social impressions tend to be characterized by (at least) three fascinating psychological properties, namely impression accuracy, impression consensus, and impression functionality.

Inspired by this realization, contemporary impression formation models have examined each of these properties in further detail. The Realistic Accuracy Model (RAM; Funder 1995, 2012), for instance, has established that impression accuracy requires a person to display an informative visual cue (cue validity) during a relevant person perception episode (cue availability) that is noticed by a perceiver (cue detection) and then adequately interpreted by her or him (cue utilization). Moreover, so-called Stage Models of Dispositional Inferences (SMDIs; Lieberman et al. 2002; Trope and Higgins 1993) have explained that impression consensus depends on perceivers’ iterative reasoning during cue utilization. Two perceivers may, for example, agree that another person is acting aggressively (i.e., they show consensus at the first stage of cue utilization), but attribute this behavior to different causes (i.e., they lack consensus at the second stage of cue utilization). The Theory of Interpersonal Sensitivity (TIS; Bernieri 2001), finally, has revisited the notion of impression functionality by exploring the link between people’s propensity to form accurate impressions of others and their ability to adequately interact with them (Bernieri 2001).

Yet, while doing so, the TIS has made an important discovery: Perceivers who tend to be accurate at judging other people’s traits are not necessarily accurate at judging other people’s emotions or interpersonal relations (cf. Bernieri 2001; Hall 2001; Hall et al. 2016). In other words, different types of social impressions do not seem to form a uniform psychological construct. Accordingly, the TIS has called on impression formation researchers to “make finer discriminations within the content domains of interpersonal perception” in order to facilitate better theorizing about the impression formation process (Bernieri 2001, p. 8). In response to this call, the current paper differentiates between individual-based impressions and encounter-based impressions and proposes a dedicated impression formation model for the latter.

We believe that such a model is timely for two main reasons: First, most impression formation theories as summarized above (with the exception of the TIS) were originally developed to capture individual-based impressions, primarily trait impressions. As such, they never actually tried to address the formation of encounter-based impressions. Second, even though many of the existing theories continue to impress in terms of their conceptual breadth, they lack explanatory depth. The ETSP, for example, once suggested that some impressions may be more prevalent when encountering strangers than others, but ultimately failed to specify which impressions these would be. In consequence, the utility of traditional impression formation theories is surprisingly limited when it comes to understanding or predicting the formation of encounter-based impressions. To overcome this state of affairs, this paper proposes a new model that focuses primarily on defining and describing encounter-based impressions.

Defining Encounter-Based Impressions

Encounter-based impressions signify an impressive psychological feat as well as leap: Based on mere appearances and overt behaviors, perceivers draw far-reaching conclusions about other people’s social relations or obligations without directly getting to know them. In line with this broad definition, several different categories of encounter-based impressions have previously been discussed in the literature. The first category is prototypical for the domain and entails impressions of social attributes that typically arise at the level of the dyad. These impressions concern, for instance, other people’s type of acquaintance (e.g., whether two people are strangers, colleagues, friends, or lovers etc.; Barnes and Sternberg 1989; Costanzo and Archer 1989; Floyd 1999), the purpose of their exchange (e.g., whether they are primarily bonding or solving a task together; Arioli et al. 2017; Canessa et al. 2012), or the quality of their involvement (e.g., people’s degree of coordination, collaboration, commitment, rapport, and/or intimacy; Bernieri et al. 1996; Bodie and Villaume 2008; Fawcett and Gredebäck 2013; Hall et al. 2009; Michael et al. 2016). In short, impressions within this category tend to capture various facets of a dyad’s overall cohesiveness or entitativity (see Lickel et al. 2000).

Impressions belonging to the second category, by contrast, typically concern the similarities and/or differences of those who constitute a dyad. The most prominent examples in this category are impressions about other people’s power relations (e.g., as symmetric or asymmetric) that involve speculations of who holds more or less power (Mason et al. 2014; Schmid Mast and Hall 2004; Sternberg and Smith 1985). Further examples in this category are impressions about other people’s moral relations (involving speculations about who may count as a victim or a perpetrator; Gray et al. 2014) and/or agency relations (involving speculations about who tends to initiate or receive social actions; Hafri et al. 2018). Just like impressions in the first category, however, impressions in the second category rely on the simultaneous consideration of a particular combination of individuals. In other words, both types of impressions are inherently relational. This quality sets them apart from impressions belonging to the third category.

Impressions in this final category have in common that they generally refer to social characteristics that reside in individuals. Accordingly, these impressions are best thought of as individual-based impressions in disguise. They may be prompted by other people’s social encounters, but they do not strictly depend on them. A person’s level of neuroticism, for example, can certainly be judged upon seeing her/him around others (Carney et al. 2007), but it can also be judged based on the person’s facial appearance, style of attire, or non-social behavior (e.g., Penton-Voak et al. 2006). Because impressions of this kind are ultimately about individuals, they are generally less affected by whom exactly someone is seen with. Perceivers may realize, for example, that Jack is neurotic, regardless of whether he is seen around Lucy or John. By contrast, inferring that Jack and Lucy are siblings or that Jack is John’s boss (i.e., forming true encounter-based impressions) requires the observation of a specific person dyad.

To sum up, what makes encounter-based impressions a distinct domain of social cognition is the fact that these impressions go beyond speculations about the aims, states, or traits of mere individuals and result in the assessment of interpersonal relations or obligations. Unfortunately, at this point, it remains poorly understood under which conditions such impressions are most likely to arise. But the observation of other people’s social interactions seems particularly well-suited to elicit them, regardless of whether these interactions contain conversational exchanges (also known as focused interactions) between people or mere behavioral adjustments (unfocused interactions; Goffman 1963). Yet inherently relational impressions can even be formed in the absence of (overt) social interactions between people. Similarities in people’s appearances (e.g., family resemblance, racial origin, dress etc.) may suffice to trigger speculations about their relations or obligations (e.g., Rhodes and Chalik 2013). In consequence, encounter-based impressions can even arise in response to mere individuals if they prompt memories of other people. Perceivers of Jack, for example, may quickly infer that he must be related to their friend Lucy due to an uncanny facial resemblance between them.

The Psychological Properties of Encounter-Based Impressions

Although encounter-based impressions have been a topic of investigation since the 1970s, their psychological properties still await systematic investigation. Initial evidence indicates, however, that the accuracy, consensus, and functionality of encounter-based impressions can vary substantially. Impressions about other people’s types of relationships tend to be more accurate, for instance, than impressions about the quality of these relationships (Gray 2008). Specifically, most perceivers seem to be quite skilled at deciding whether two people know each other, are romantically interested in one another, or differ in terms of their status (Barnes and Sternberg 1989; Latif et al. 2014; Place et al. 2009; Schmid Mast and Hall 2004). But many struggle to correctly infer other people’s degree of rapport, liking, or love (e.g., Aloni and Bernieri 2004; Bernieri and Gillis 1995; Bernieri et al. 1996; Floyd and Erbert 2003). A seminal study by Bernieri et al. (1996) illustrates the latter claim: In this study, dyads of strangers were filmed while they engaged in a brief discussion. The discussants were then asked to rate their level of rapport during this discussion before the recorded videos were shown (without sound) to an independent group of perceivers to collect further rapport ratings. This approach revealed little overlap between both sets of ratings, indicating that perceivers’ impressions did not match the discussants’ self-reports.

Besides these demonstrations of inaccuracy in encounter-based impressions, there is also a growing body of work highlighting their susceptibility to systematic bias. Encounters between people of similar physical appearance, for instance, tend to be judged more favorably than encounters between dissimilar counterparts. Romantic partners of similar physical attractiveness, in particular, are widely seen as having better relationships (e.g., deeper, happier, more balanced and cooperative) than dissimilar partners (Forgas 1993, 1995). Likewise, romantic partners of similar racial appearance elicit more positive evaluations than dissimilar partners (Skinner and Hudac 2017; Skinner and Rae 2018). Even the same interpersonal behavior (e.g., an ambiguous shove) tends to be evaluated more favorably (e.g., more playful, less aggressive) when it unfolds between two people who look racially alike rather than different (Duncan 1976). These findings suggest that social stereotypes and prejudice do not only affect individual-based impressions, but also impressions about other people’s social relations and obligations (cf. Field et al. 2013; Lalonde et al. 2007; Lewandowski and Jackson 2001).

Although the exact consequences of such bias in encounter-based impressions require further study, initial work highlights their potential legal and societal impact. Allegations of sexual harassment and domestic violence, for instance, tend to be considered more truthful and deserving of harsher punishment when they concern interracial rather than intraracial couples (Locke and Richman 1999; Maeder et al. 2012; Wuensch et al. 2002). Furthermore, couples coming across as less committed or in love at first glance (either due to their racial composition or other visual markers; Kleinke et al. 1974) may also be less likely to receive a mortgage offer or be considered suitable for adopting a child (Dalmage 2000; Lind and Lindgren 2016).

Research on common biases in encounter-based impressions also indicates that even inaccurate encounter-based impressions can be consensual. Moreover, consensual, yet inaccurate encounter-based impressions (such as judgments of rapport) have been observed to affect perceivers from different cultural backgrounds in a similar manner (e.g., Bernieri and Gillis 1995). Accordingly, it has been proposed that evolutionary pressures may have uniformly shaped the human tendency to form encounter-based impressions (Bryant et al. 2016; Pietraszewski et al. 2014). This idea has received additional support from studies showing that many encounter-based impressions affect and guide perceivers’ own social intentions and behavior in a functional manner.

Encounter-based impressions can, for instance, systematically influence people’s walking trajectories. Specifically, pedestrians tend to avoid walking into the physical space between people whom they consider to be a close social unit (Knowles 1973; Lyman and Scott 1967). In other words, encounter-based impressions seem to stop perceivers from penetrating territories that other people need to connect and communicate (Knowles 2015). Encounter-based impressions can also motivate selective social approach and avoidance behavior (Milinski 2016). White Americans, for instance, are less likely to reject a smiling Black stranger (and also fear his rejection less) when they encounter him in the company of a White friend rather than alone or in the company of a Black friend (Shapiro et al. 2011). Thus, by providing (alleged) reputational insights in terms of who shows positive (i.e., caring, cooperative, and/or protective) or negative (i.e., volatile, dangerous, unfair, and/or dismissive) social behavior towards whom, encounter-based impressions can shape perceivers’ own readiness to interact with those they observe.

In addition, alleged reputational insights based on encounter-based impressions can inform how threatening other people’s alliances are perceived to be with regard to one’s own social standing (Pietraszewski et al. 2014; Schmid Mast and Hall 2004). Open hostility and/or violence against interracial couples, for instance, is frequently motivated by a person’s proclaimed need to defend the ‘purity’ or ‘cohesion’ of their own racial group (Perry and Sutton 2008). Of course, perceivers may not necessarily respond with violence to a presumed social threat. Alternatively, they can also distance themselves from those whom they believe to cultivate the ‘wrong’ kind of social relations. Black Americans, for example, who assume that a same-race stranger has close White friends, tend to express less empathy for this stranger in an emergency situation than for an otherwise equivalent stranger with exclusively Black friends (Johnson and Ashburn-Nardo 2014).

Last but not least, encounter-based impressions seem to provide a pivotal opportunity for social learning. Young children, above all, spontaneously imitate actions they see in other people’s encounters (Shimpi et al. 2013). They also act more prosocially upon observing intimate and caring social moments in others (Over and Carpenter 2009) and use other people’s encounters to hone their own social preferences (Skinner et al. 2016). Upon witnessing that a person is treated either positively or negatively (e.g., smiled or scowled at) by another, for instance, preschoolers tend to adjust their own attitude towards that person as well as his/her alleged friends (Skinner et al. 2016). Similar learning effects have been reported in adult perceivers (e.g., Castelli et al. 2012; Willard et al. 2015), inspiring the idea that witnessing positive interracial encounters from a third-person perspective may succeed at reducing racist attitudes (e.g., Brown and Paterson 2016; Lemmer and Wagner 2015; Vezzali et al. 2014).

Though it remains unclear at this point under which processing conditions perceivers actually learn from other people’s encounters rather than withdraw from or aggress against them, the available data clearly indicate that encounter-based impressions are not just a diverting pastime. Instead, just like individual-based impressions, they are often highly consensual inferences that can directly influence perceivers’ own intentions and behavior towards others based on a wide range of accurate or inaccurate judgments about them. Given this realization, it seems particularly surprising that encounter-based impressions have not yet attracted the same level of scientific scrutiny as individual-based impressions. For example, to the best of our knowledge, there exists no theoretical framework to date that can describe, explain, and predict their varying psychological properties (i.e., their accuracy, consensus, and functionality) in a systematic manner. To overcome this lacuna, we decided to propose a new impression formation model.

The Integrative Model of Relational Impression Formation

The Integrative Model of Relational Impression Formation (IMRIF) aims to understand and predict the psychological properties of encounter-based impressions that concern the social relations and obligations between multiple individuals. It rests on the assumption that witnessing social encounters from a third-person perspective can prompt numerous impressions that are inherently relational in nature and, thus, go beyond those elicited by witnessing people in isolation (Fiske and Haslam 1996). Acknowledging the exceptional importance of dyadic relations in human sociability (cf. Balliet et al. 2017; James 1953; Kelley et al. 2003), the model focuses in particular on understanding encounter-based impressions that concern other people’s dyadic social relations and obligations. With this focus in mind, the IMRIF aims to identify pivotal psychological attributes that co-determine the formation of encounter-based impressions, including their accuracy (cf. LM, RAM, TIS; see Sternberg and Smith 1985), consensus (cf. WAM, SMDIs; see Bernieri and Gillis 1995), and functionality (cf. ETSP, TIS; see Castelli et al. 2012).

Relying heavily on the TIS (Bernieri 2001), the IMRIF postulates specifically that variance in the accuracy of encounter-based impressions can be linked to four types of psychological attributes known as content attributes (i.e., attributes related to what an impression is about), target attributes (i.e., attributes related to whom an impression is about), perceiver attributes (i.e., attributes related to who forms an impression), and context attributes (i.e., attributes related to when and where an impression is formed). In going beyond the TIS, however, the IMRIF provides specific, evidence-based examples that highlight the impact of these attributes on the formation of encounter-based impressions and argues that these attributes do not only determine impression accuracy, but also impression consensus and functionality.

The Role of Content Attributes in Forming Encounter-Based Impressions

Encounter-based impressions, just like individual-based impressions, can entail a wide variety of social inferences. But whereas research on individual-based impressions has tried to identify and label important sub-domains of impression formation (such as trait impressions and emotion impressions; cf. Bernieri 2001; Ickes 1993), research on encounter-based impressions has been less systematic in its approach. Despite this oversight, it has been acknowledged that impressions concerning the type of other people’s relationships tend to be more accurate than impressions concerning the quality of these relationships (as discussed above). This acknowledgment confirms that variance in the psychological properties of encounter-based impressions (such as their accuracy) depends on so-called content attributes.

Content attributes are attributes of the impression formation process that are related to what an impression is about. Unfortunately, to date, the full scope and variety of encounter-based impressions remain to be discovered. Relatedly, it remains unclear how different kinds of encounter-based impressions vary in terms of their consensus and social functionality. Based on the WAM (Kenny 1991), however, encounter-based impressions which rely on widespread norms of social conduct (e.g., impressions of relationship fairness) would be expected to be more consensual than impressions that are less norm-based (e.g., impressions of intimacy). Furthermore, based on the ETSP (McArthur and Baron 1983), encounter-based impressions which provide (alleged) insights into important social threats or opportunities (e.g., impressions of relationship purpose) should convey stronger social affordances than impressions that do not (Wojciszke et al. 2015).

There is also good reason to believe that variance in terms of impression functionality may not only be determined by what an impression is about, but also by what it feels like. Numerous studies indicate that spontaneous encounter-based impressions are often accompanied by rapid affective responses, ranging from anxiety, disgust, and eeriness (Neumeister et al. 2017; Quadflieg et al. 2016; Skinner and Hudac 2017; Vrtička et al. 2012) to admiration, enjoyment, and warmth (Hamilton and Meston 2017; Seibt et al. 2018). These affective responses seem to guide perceivers’ own social intentions and/or actions particularly strongly. Encounters eliciting admiration, for example, appear to be exceptionally well-suited to invite observational social learning (Fiske et al. 2017; Stellar et al. 2017).

In this context, it is important to note that perceivers’ affective responses towards other people’s encounters do not have to mirror their cognitive insights into these encounters’ affective qualities (as experienced by those involved in them). In other words, perceivers may be fully aware that they are witnessing an inherently positive interpersonal event (such as a wedding) and can still experience a negative affective response (such as disgust) upon noticing that the couple in question entails people of different racial appearances (Skinner and Hudac 2017). Therefore, the IMRIF argues that distinguishing between perceivers’ cognitive and affective responses towards other people’s encounters may be of particular relevance when it comes to predicting their functionality. In support of this claim, it has recently been suggested that improvements in perceivers’ own racial attitudes upon witnessing other people’s positive interracial encounters may only occur when perceivers evaluate these encounters subjectively as positive (Mazziotta et al. 2011).

In summary, plenty of evidence indicates that the psychological properties of encounter-based impressions (i.e., their accuracy, consensus, and functionality) are directly affected by what these impressions are about. Nevertheless, a basic taxonomy that outlines the scope and variety of encounter-based impressions remains missing (cf. Quadflieg and Penton-Voak 2017). Though there have been some taxonomic proposals in the past (e.g., Burgoon and Hale 1981; Wish et al. 1976), they have primarily been based on assumptions of impression prevalence. According to Burgoon and Hale (1981; see also Burgoon et al. 1984), for instance, encounter-based impressions related to domination, intimacy, formality, and composure should be of major theoretical significance due to their popularity in daily life. It remains to be determined, however, whether prevalence-based taxonomies of encounter-based impressions can also advance our understanding of the variability of encounter-based impressions in terms of their psychological properties. The IMRIF, therefore, calls on contemporary researchers to revive efforts of establishing an evidence-based taxonomy that can describe encounter-based impressions with regard to their accuracy, consensus, and functionality.

The Role of Target Attributes in Forming Encounter-Based Impressions

Decades of research on individual-based impressions indicate that the visual information provided by some target individuals (e.g., their facial appearance or nonverbal behavior) prompts reliably more accurate, consensual, and functional impressions than the visual information provided by others (Funder 2012; Human and Biesanz 2013; Zebrowitz and Collins 1997). But does this observation generalize to encounter-based impressions? Partial evidence in favor of this assumption comes from work showing that unstructured social encounters (e.g., two people solving a puzzle together) invite more accurate encounter-based impressions than structured social encounters (e.g., two people introducing themselves to each other; Puccinelli et al. 2004). Yet why this may be the case, and how these two types of encounters tend to differ in terms of the visual information they provide, remains to be determined.

What seems uncontroversial at this point, however, is the fact that social encounters (just like individual targets) can vary substantially in terms of their visual target attributes (see Fig. 1). There is also reason to believe that some of these attributes are particularly influential when it comes to forming encounter-based impressions, including attributes related to people’s physical setting (e.g., whether their encounter takes place at a bus stop or in a kitchen; Plötner et al. 2016), their degree of interpersonal similarity (e.g., in terms of attractiveness, race etc.; Kernis and Wheeler 1981; Pryor et al. 2012), their level of direct communication (e.g., in the form of communicative gestures or speech-related facial movements; Leube et al. 2012; Manera et al. 2011), and their extent of nonverbal involvement (e.g., Burgoon et al. 1984; Thayer and Schiff 1974).

Fig. 1

A photograph that illustrates how visual attributes of other people’s encounters can refer to their physical setting (i.e., a beach), their level of interpersonal similarity (e.g., both individuals seem to be of similar age and race, but of a different sex), their extent of direct communication (e.g., the man seems to show speech-related facial movements directed at the woman), and their degree of nonverbal involvement (e.g., both individuals are in close proximity to each other and display joint eye gaze). The photograph was downloaded from www.shutterstock.com and is reproduced in adherence with the company’s standard license terms of service (see http://www.shutterstock.com/licensing.mhtml)

Visual target attributes related to people’s nonverbal involvement are particularly manifold and can entail people’s physical proximity, their degree of eye contact, interpersonal touch, postural alignment (e.g., whether two people face and/or lean towards each other), facial alignment (e.g., whether two people show reciprocal or complementary expressions), motion synchrony, action coordination, and turn-taking (e.g., Bernieri et al. 1996; Burgoon 1991; Gallotti et al. 2017; Kimura and Daibo 2006; Kleinke et al. 1974; Lakens and Stel 2011; Latif et al. 2014; Lloyd and Morrison 2008; Michael et al. 2016; Neri et al. 2006; Schirmer et al. 2015; Thayer and Schiff 1974; Tiedens and Fragale 2003; Trout and Rosenfeld 1980). Alas, how this abundance of visual information is ultimately used by perceivers to form encounter-based impressions, and which information is particularly likely to trigger consensual, accurate, and/or functional impressions, is still uncertain.

To illustrate this question, Desmond Morris once wrote: “Take, for instance, the case of the fragile old lady being helped across the road by a young man […]. How can we tell whether the old lady is a complete stranger who solicited the young man’s aid, or whether she is his favourite (sic) aunt” (2002, p. 124)? In his attempt to address the question himself, Morris watched hundreds of people in public places and categorized their visual attributes into so-called contact tie signs (e.g., embraces and kisses), no-contact tie signs (e.g., shared gaze and postural echo), and symbolic tie-signs (e.g., wedding bands). Yet even Morris ultimately acknowledged that it is not only the sheer presence of these signs that shapes perceivers’ impressions, but also how exactly they are executed and intertwined (see also Afifi and Johnson 1999; Bodie and Villaume 2008; Floyd 1999). With regard to the old lady he explained: “If the young man is a stranger he will probably take her (sic) arm, supporting it by grasping it under the elbow, and he will walk her across the street with a slight separation between his trunk and hers. If she is his aged aunt, she will probably take his (sic) arm, linking her hand through the crook of his elbow, and they will cross the road with close side-to-side contact” (p. 127).

Given these important subtleties in people’s interpersonal behaviors, the IMRIF argues that the psychological properties of encounter-based impressions depend strongly on the visual attributes that trigger them. Accordingly, the model urges researchers to study the visual attributes of human encounters primarily in terms of their psychological consequences (see Park 2014). It encourages, in particular, additional research that can help to identify visual attributes of human encounters that ‘just are’ (i.e., features; Bernieri et al. 1996), visual attributes that co-vary with specific social relations or obligations (i.e., correlates; e.g., Latif et al. 2014), visual attributes that are widely used by perceivers to form accurate impressions (i.e., cues; e.g., Barnes and Sternberg 1989), visual attributes that prompt uniform adaptive social actions (i.e., signals; e.g., Knowles 1973), visual attributes that invite consensual, yet inaccurate impressions (i.e., distractors; e.g., Bernieri et al. 1996) and, if applicable, visual attributes that prompt uniform, but non-adaptive social actions (i.e., decoys).

The Role of Perceiver Attributes in Forming Encounter-Based Impressions

There remains little doubt that even perceivers trying to form the exact same type of impression (e.g., a judgment of rapport) in response to the exact same encounter can come to rather different conclusions. Some perceivers, for example, seem to be reliably more accurate at deciphering other people’s social relations (e.g., Barnes and Sternberg 1989; Bernieri 2001; Costanzo and Archer 1989). This processing advantage has inspired numerous attempts to study the influence of so-called perceiver attributes on the formation of encounter-based impressions. The resulting body of work indicates that perceivers’ own interpersonal experiences and expectations play an influential role in determining the outcomes of encounter-based impressions.

Perceivers with attachment representations characterized by high avoidance (i.e., by a reduced desire to seek support from others), for instance, seem to evaluate other people’s positive encounters less favorably than those with low avoidance representations (Vrtička et al. 2012). The presumed link between perceivers’ attachment styles and their encounter-based impressions has even motivated the development of the so-called Adult Attachment Projective Picture System (AAP; George and West 2012). The AAP probes different attachment styles in adults using black-and-white drawings of ambiguous human encounters (such as a mother and a child sitting in bed, see Fig. 2). Perceivers’ verbal descriptions of these drawings are subsequently coded in order to detect attachment difficulties. As such, the AAP is one of the first tests that tries to use perceivers’ lack of consensus in encounter-based impressions to diagnose socio-cognitive deficits (George and Buchheim 2014; George and West 2011).

Fig. 2

Example of a dyadic attachment picture from the Adult Attachment Projective Picture System (reprinted with permission from Buchheim et al. 2008)

Inspired by this line of research, numerous researchers have begun to scrutinize the effects of perceivers’ mental or physical health on encounter-based impressions by testing patients diagnosed with well-known psychiatric disorders (e.g., post-traumatic stress disorder, psychopathy; Decety et al. 2015; Moser et al. 2015; Neumeister et al. 2017), neurodevelopmental disorders (e.g., Fragile X syndrome, Williams syndrome; Riby and Hancock 2008; Williams et al. 2013), or neurodegenerative disorders (e.g., amyotrophic lateral sclerosis, dementia; Cavallo et al. 2011a, b). Although many of these studies have been explorative in nature and require systematic replication, some have provided converging evidence that encounter-based impressions are less accurate and consensual in two psychiatric disorders: schizophrenia and autism spectrum disorder (ASD).

Perceivers with schizophrenia, specifically, seem less attentive than healthy controls to other people’s faces or body orientation when observing human encounters (Nikolaides et al. 2016; van’t Wout et al. 2009). In consequence, they appear to be less skilled at distinguishing whether two people engage in dependent or independent actions (Bakasch et al. 2013; Okruszek et al. 2015) and at grasping people’s interpersonal intentions (Andreau et al. 2015; Green et al. 2008). Similarly, research with ASD patients suggests that they are impaired at extracting visual markers of nonverbal involvement (such as contingent eye gaze, coordinated movements, or communicative gestures; Centelles et al. 2013; Klin et al. 2002; Riby and Hancock 2008; von der Lühe et al. 2016) and experience difficulties in understanding the social motives that guide other people’s interactions (Byrge et al. 2015).

Interestingly, these disorder-related deficits in encounter-based impressions seem to arise early in people’s lives (Centelles et al. 2013; O’Nions et al. 2014): When asked to describe human encounters, even children with ASD are more hesitant in using descriptions that focus on social relationships (such as ‘they are friends’) than typically developing children (Bauminger et al. 2004). Alas, at this point, such findings are hard to contextualize. Too little is known about the developmental trajectories of encounter-based impressions in typically developing children in the first place. It seems certain, however, that even very young children regularly monitor other people’s encounters. Before their first birthday, for example, healthy infants attend preferentially towards person dyads characterized by high rather than low nonverbal involvement (i.e., two people facing each other versus standing back-to-back; Augusti et al. 2010; Beier and Spelke 2012). By contrast, it is rather unclear at which age children form adult-like encounter-based impressions or undergo maturational changes in their ability to form encounter-based impressions (Over and Carpenter 2015).

Based on these and similar data, the IMRIF highlights the important role of perceiver attributes in forming encounter-based impressions. It highlights specifically that numerous perceiver attributes seem to affect both individual- and encounter-based impressions, including perceivers’ age and mental health (see above), but also perceivers’ gender, personality, social attitudes, and emotional state (cf. Bernieri and Gillis 1995; Costanzo and Archer 1989; Derlega et al. 1989; de Oliveira Laux et al. 2015; Forgas 1993, 1995; Hansen and Hansen 1988; Kammrath and Scholer 2011; Katsumi et al. 2017). In consequence, the model calls on contemporary researchers to establish which perceiver attributes (if any) affect encounter-based impressions more strongly than individual-based impressions or interact with content or target attributes of encounter-based impressions in a unique manner. Recent evidence indicates, for instance, that victims of interpersonal violence are particularly attentive towards nonverbal signs of aggression in other people’s encounters (cf. Neumeister et al. 2017).

The Role of Context Attributes in Forming Encounter-Based Impressions

By far the least studied attributes known to influence the psychological properties of encounter-based impressions are so-called context attributes. Context attributes capture the situational circumstances under which social impressions are formed. With regard to encounter-based impressions, three context attributes have received particular scientific attention in the past, namely exposure method, exposure duration, and vantage point. Exposure method refers to the medium through which perceivers happen to witness other people’s encounters (if any; see Fig. 3). In fact, besides direct, unmediated observations of human encounters (e.g., Sigall and Landy 1973), observations may also occur via one-way mirrors (e.g., Wright et al. 1997), video clips (e.g., Iacoboni et al. 2004), photographs (e.g., Quadflieg et al. 2015), drawings (e.g., Schirmer et al. 2015), paintings (e.g., Villani et al. 2015), digital renderings (e.g., Katsumi et al. 2017), or point-light displays (e.g., Manera et al. 2011).

Fig. 3

Examples of human encounters as used in recent impression formation studies, including (a) a standardized photograph (reprinted based on a CC-BY license from Wang and Quadflieg 2015), (b) a black-and-white drawing (reprinted with permission from Krämer et al. 2010), and (c) a point-light display (for illustrative purposes frames of points are shown superimposed on corresponding silhouettes; reprinted based on a CC-BY license from Manera et al. 2011)

This diversity of experiences has raised the question of how much the different methods of exposure may inadvertently dictate or confine the psychological nature of perceivers’ encounter-based impressions. Initial evidence indicates, after all, that perceivers’ spontaneous response to the different methods differs, even if their visual content is carefully matched (cf. Gillis et al. 1995; Grahe and Bernieri 1999). Perceivers are far more distressed, for instance, when they witness the exact same racist act (involving two people) on video rather than in person (Kawakami et al. 2009). Considering these data, further research is needed to learn whether and how various exposure methods affect perceivers’ abilities to form accurate, consensual, or functional encounter-based impressions.

Equally unclear is the effect of exposure duration on the impression formation process. So far, initial research suggests that perceivers can accurately detect up to three conspecifics in natural photographs within 50 ms (Railo et al. 2016), but require slightly more time (about 100 ms) to decide whether two people are facing each other (Dobel et al. 2007; Glanemann et al. 2016) and/or whether one person is acting upon another (Hafri et al. 2013). Even more time (approx. 200 ms) seems to be required in order to accurately read basic interpersonal intentions from natural photographs (e.g., to judge whether one person is accidentally or intentionally harming another; Hesse et al. 2016). Although these timings are likely to differ in absolute terms when people witness social encounters in an unmediated manner (rather than via photographs), they clearly suggest that different kinds of encounter-based impressions unfold on different time scales. Accordingly, some encounter-based impressions may only prove to be accurate, consensual, or functional if perceivers have sufficient time to form them.

Finally, research on the effects of vantage point has highlighted that the spontaneous formation of encounter-based impressions may also depend on perceivers’ visual perspective on other people’s encounters (cf. Cohn and Paczynski 2013; Cohn et al. 2017). Specifically, it has been argued that encounters in which the person initiating (rather than receiving) an action appears on the left side of a dyad may be easier to monitor than encounters in which this person appears on the right side of the dyad (Dobel et al. 2007). Thus, even though the exact same visual information would be accessible in both perspectives, one perspective could facilitate the formation of accurate, consensual, and/or functional encounter-based impressions due to fitting widespread prototypical event scripts (Chatterjee et al. 1999).

In short, based on the findings above, the IMRIF suggests that the important role of context attributes in forming encounter-based impressions requires further investigation, including how these attributes interact with the model’s remaining three attributes. The psychological effects of context attributes seem to be intricately linked, for example, to the effects of content attributes (e.g., with different exposure durations affecting different kinds of encounter-based impressions in a differential manner). Relatedly, the psychological effects of visual target attributes are often directly constrained by the effects of context attributes (e.g., with static methods of exposure limiting the availability of dynamic target attributes). In consequence, the interplay between the model’s various attributes deserves future scientific attention. With regard to this interplay, however, the IMRIF predicts at least one systematic pattern (see Fig. 4): Whereas visual target attributes are expected to serve as the starting point for the impression formation process, their effects on impression accuracy, consensus, and/or functionality are predicted to be modulated (i.e., moderated or mediated) by content, perceiver, and context attributes.

Fig. 4

A graphic summary of the Integrative Model of Relational Impression Formation (IMRIF)

Limitations of the Proposed Model

It is a widely shared assumption that good psychological models must be evidence-based, progressive, abstract, and applicable (Van Lange 2013). Guided by these evaluative standards, the IMRIF has tried (a) to base its claims on existing research, (b) to address gaps in the existing impression formation literature, (c) to propose four sets of abstract attributes as determinants of the psychological properties of encounter-based impressions, and (d) to explain how an advanced understanding of encounter-based impressions could inspire important real-world applications, such as the creation of bespoke socio-cognitive assessments (e.g., George and West 2012) or the development of effective learning interventions (e.g., Mazziotta et al. 2011).

Nevertheless, the current model has numerous limitations. First and foremost, it is not (yet) a detailed process model. Even though the model identifies four major attributes that determine the impression formation process, it does not actually elucidate how exactly these attributes unfold their influence. To illustrate this point, consider our section on visual target attributes. Though we have highlighted and explained the importance of these attributes, we have not actually addressed one central question: How is the abundance of visual information that characterizes human encounters possibly integrated into a unified percept by uninvolved bystanders? Do perceivers, for example, compare visual input against stored templates of typical human encounters (cf. Dittrich 1993; Neri 2009; Papeo et al. 2017)? Similarly, we have not addressed whether and how automatic and deliberate cognitive processes may shape the formation of different types of encounter-based impressions (cf. Cracco et al. 2015).

Please note that our hesitation to do so does not reflect any doubts that understanding these processes is of pivotal importance for establishing an informative theoretical framework about encounter-based impressions. Rather, upon reviewing the available data, we realized that investigations regarding the perceptual, cognitive, and motivational processes of encounter-based impressions are particularly rare. Accordingly, we decided to refrain from covering certain aspects of psychological relevance at this point in order to present a largely evidence-based model. It goes without saying, however, that this model awaits further data-driven refinements and extensions.

We would further like to mention that the IMRIF has focused exclusively on the formation of encounter-based impressions in humans so far. But this is not to say that forming encounter-based impressions is a uniquely human ability. In fact, recent evidence indicates that even some animals, including rhesus monkeys (Machado et al. 2011; McFarland et al. 2013; Sliwa and Freiwald 2017), baboons (Cheney et al. 2016; Kummer et al. 1974), and deer (Jennings et al. 2011), show an impressive ability to read their conspecifics’ encounters. As such, further research on the phenomenon’s phylogenetic origins promises to provide additional insights into its underlying psychological mechanisms and potential evolutionary advantages. In short, being the first model of its kind, the IMRIF leaves plenty of room for improvements.

Conclusion

To conclude, although encounter-based impressions have been a topic of inquiry for decades, the resulting body of research has rarely been considered at large. Due to this oversight, the causes, consequences, and psychological properties of encounter-based impressions still remain poorly understood. Nevertheless, there is an undiminished scientific interest in exploring the perception (e.g., Ding et al. 2017; Papeo et al. 2017), interpretation (e.g., Hafri et al. 2018; Walbrin et al. 2018), and evaluation (e.g., Hamilton and Meston 2017; Wang and Quadflieg 2015) of other people’s encounters. In light of these ongoing scientific efforts, the current article has developed a new conceptual framework based on existing empirical findings. Referred to as the IMRIF, this framework postulates that four main attributes (namely content, target, perceiver, and context attributes) jointly determine the accuracy, consensus, and social functionality of encounter-based impressions.

Despite its limitations, the IMRIF offers a comprehensive and integrative approach for the scientific study of encounter-based impressions. Above all, the model highlights the complex nature of encounter-based impressions and differentiates them from individual-based impressions. In addition, the IMRIF provides a unifying theoretical framework that outlines several claims suitable for hypothesis-testing. It postulates, for instance, that the impact of visual target attributes on the impression formation process is regularly modulated by content, perceiver, and context attributes. Furthermore, the model allows researchers to embed their work on encounter-based impressions into a larger context and, thus, to probe the IMRIF’s utility for scientific inquiry beyond traditional impression formation theories.

We have little doubt, however, that the model will soon require further revisions based on new studies on the perceptual, cognitive, and motivational processes of encounter-based impressions. Such work may, for instance, elucidate whether the same visual target attributes are processed differently depending on which encounter-based impressions perceivers are trying to form (cf. Bernieri et al. 1996; Burgoon 1991), whether the formation of encounter-based impressions relies on the active simulation of multiple individuals (cf. Cracco et al. 2015), or whether computer vision could be used to increase accuracy in encounter-based impressions (e.g., Grammer et al. 1999; Sefidgar et al. 2015). As such, the IMRIF provides an exciting opportunity to re-consider what is, and what is not yet, known about watching and judging other people’s encounters.