1 Introduction

Information and communication technology (ICT) has revolutionized how people interact with each other irrespective of time and place. Over the last decade, computer-mediated communication (CMC) has exploded with various social media and online communication applications, which has allowed a variety of new forms of mediated social interplay with remote others. Virtual encounters between remote people are now commonplace (e.g., Bolton et al. 2013), ranging from plain text-based communication to those with rich multimedia content (Macskassy 2012) and even virtual reality simulations of face-to-face interaction (Schroeder 2002).

At the same time, various ICTs and personal technologies (Footnote 1) are often used collaboratively by several collocated people during social encounters. Considering the groupware taxonomy by Ellis et al. (1991) and its two dimensions of time and place, collocated social interaction focuses on scenarios of ‘same time, same place’, that is, a synchronous interaction between individuals in close proximity. In the big picture of technology development, this area has attracted less interest than technologies for remote connectedness, and hence remains less explored and characterized. Only over the last two decades have researchers and product developers begun to consider how technology could also support multi-user scenarios. Multi-player gaming consoles and collaborative touch displays are just some prominent examples of this trend (e.g., Falk and Björk 1999; Memarovic et al. 2015). In particular, several academic workshops have recently convened to discuss the directions for technology that would better support collocated interaction (e.g., Fischer et al. 2016; Jarusriboonchai et al. 2014b; Memarovic et al. 2012a).

1.1 Need for better technologies for collocated social interaction

Current technology can be argued to be suboptimal with respect to collocated social interaction. To motivate the need for reconsidering current technologies, we identify two broad wicked problems: (1) the use of current technology disrupting ongoing social situations, and (2) lack of social interaction in collocated situations where it would be desirable.

To elaborate the first problem, people tend to interact with various personal technologies in almost any social gathering. Frequent interaction with mobile devices has been noted to cause harmful social effects in situations where particularly familiar people are collocated (Turkle 2011). With a smartphone in hand, Turkle argues, we are only getting ‘sips’ of connection, not real communication. Notifications from mobile devices have been criticized for disrupting interactions in close relationships (Oduor et al. 2016). Ironically, it is often the interactions with remote others that disrupt those with collocated others. Furthermore, while such socially detrimental use of technology is often unintentional, the habit of snubbing someone in favor of a personal device results from intentional user behavior where interaction with technology takes precedence over that with people. Such behavior is often considered socially unacceptable, and preventing it has received attention in various public debates and campaigns, such as stop phubbing (Footnote 2).

Taking a critical communication scientific perspective, the remote and asynchronous means of mediated communication can be considered as rudimentary simulations of the multi-modal and nuanced face-to-face encounters. CMC systems struggle to convey complex emotions as well as to satisfy important aspects of relatedness and belonging (Ryan and Deci 2000), for example intimacy (Hassenzahl et al. 2012). Computer-mediated communication and face-to-face conversations have been contrasted for decades, both in conceptual work (e.g., Baym 2015) and in empirical research (e.g., Bordia 1997; Kiesler et al. 1984). Early experimental research hinted at differences in, e.g., the quality of communication performance, attitude change, and the evaluation of the communication partner (Bordia 1997). Recent experiments hint that face-to-face interactions increase positive mood and satisfy social belongingness more effectively than computer-mediated interactions (Sacco and Ismail 2014). Misra et al. (2014) showed that people who had conversations without mobile devices reported higher levels of connectedness and empathy than those who simultaneously used mobile devices. Moreover, Caplan (2003) argues that ‘online social interaction’ (i.e., CMC) can create feelings of isolation and emotional disconnection. Overall, while technologies for remote connectedness have enabled previously unimaginable social possibilities, their use seems to have introduced new social issues as side effects.

As for the second problem, there are numerous situations in everyday life where social interaction would be beneficial, emotionally pleasing or otherwise desirable, at the same time as non-existent or insufficient social interaction would be problematic. For example, in work places and schools, collaboration is desirable from productivity or educational perspectives (e.g., Alavi and Dillenbourg 2012). Management scientists argue that the success of innovation activities requires productive co-creation and propinquity between actors (Ramaswamy and Gouillart 2010). Within families or marital relationships, there is a need to maintain and strengthen emotional bonds (e.g., Epstein et al. 1993). Interaction with strangers can also provide emotional satisfaction and increase the sense of community in neighborhoods or within a state; however, while in some cultures or communities there are strong norms to engage in small talk or mingling, in some others people might lack the cultural acceptance or practices to do this.

From a societal viewpoint, lack of social interaction and social encounters can contribute to general trends of disengagement from various communities and non-participation that, for example, Robert Putnam provocatively discusses in Bowling Alone (Putnam 2000). Similarly, Turkle (2011) and Cacioppo (2009) underline phenomena related to increasing loneliness and a decreasing sense of community. Goffman (1963) problematizes civil inattention in people’s behavior in public places, which partly contributes to feelings of loneliness. To counteract such phenomena, earlier research has looked into facilitating chance encounters of strangers in public spaces to allow social serendipity and increase general social awareness (Rubin et al. 2011). Also, various community projects have been established to facilitate encounters and build bridges between different communities or cliques. For example, the human library (Footnote 3) concept offers chances to meet with representatives of various minorities, hence building a positive framework to challenge stereotypes and prejudices. We consider the notion of non-interaction as a motivation to also stimulate collocated social interaction with technological solutions.

These two broad conundrums are only a few of the issues that motivate the design of better technologies for social situations where direct human-to-human interaction takes place. It remains an open challenge for the CSCW and HCI communities to understand how to design technologies that provide opportunities to bring people together, counteract some of the negative consequences of current technologies, or otherwise better cater for collocated social interaction.

1.2 Towards enhancement of collocated social interaction

The early research on collocated interaction seems to focus on enabling multi-device and multi-user interaction in group settings (Hinckley 2003) and supporting collaboration with interpersonal awareness devices (Holmquist et al. 1999). More recently, scholars have looked into, for example, interaction techniques that are suitable for group-based interactions with technology (Lucero et al. 2011). Overall, the fields of HCI and CSCW have a significant history of developing systems that involve multiple users interacting with shared interfaces and interactive spaces.

Interestingly, some of the prior research seems to adopt a stance on what we chose to term enhancement of collocated social interaction. While the role of technology is conventionally restricted to passively mediating or enabling social interaction, a growing body of publications explores design concepts that take on a new role of somehow improving the quality or extent of social interaction between collocated people. Prior research has especially contributed various creative ideas and system prototypes. An example of seminal work is by Churchill et al. (2004), who deployed a conference information system with an intention to facilitate interaction among researchers and practitioners at the venue. Examples of more recent relevant prototypes include a wearable matchmaking device to initiate conversations between conference attendees by Chen and Abouzied (2016) and a multi-player mobile game for facilitating an icebreaking activity in small groups by Jarusriboonchai et al. (2016a). Choi et al. (2011) present a prototype for enriching face-to-face communication with a new channel of self-presentation, which seems to be another common vein of research. Also, ethnographic insights on people’s behavior in typical contexts for collocated interaction have been reported (e.g., Fosh et al. 2016; Porcheron et al. 2016).

However, to date, there are neither extensive overviews of relevant empirical work nor proper conceptualizations of this emergent topic. While a new research topic seems to be emerging, there is little understanding of its characteristics and intellectual enigmas. What are the key research and design problems, what kind of approaches are taken in the proposed solutions, how are the various contributions connected, and what research gaps exist? While the previous subsections highlight some relevant but general problems, the more specific problems that such technologies try—or should try—to solve remain uncharted. Without a proper overview of the history, it is challenging to build on the existing knowledge and design other appropriate yet novel contributions.

Furthermore, aside from general theories and concepts in social sciences, theoretical contributions that are specific to this topic seem to be rare. Lundgren et al. (2015) are an exception, introducing a framework for helping to redesign technology to better suit collocated users. Consequently, the design explorations are often based on a rather thin understanding of specific problems or characteristics related to social interaction, as we will demonstrate throughout this review. The lack of specific theorizations is also evident in the relatively obscure use of terminology across the prior work. While theoretical foundations can indeed be drawn from social sciences and communication sciences (e.g., Goffman 1963; Sacks 1992), the topic seems to lack an established vocabulary for more specific phenomena and concepts. Even the fundamental term collocated interaction is used rather broadly. Some papers focus on several collocated people primarily interacting with a technological artifact, for example the proactive system by Ju et al. (2008). In such examples ‘collocated interaction’ might refer to human-computer interactions rather than human-human interactions. Others focus on collocated people interacting with an artifact as well as with each other, such as the TellTable by Cao et al. (2010). The direct human-human interactions and mediated human-human interactions often intermix with human-computer interactions. Similarly, the role of technology in collocated social interaction lacks consistency and clarity. Terms like ‘encourage’, ‘support’ or ‘trigger’ interaction are used to refer to the intended social influence of various systems. Some papers state encouraging social interaction as the aim but, in reality, the concrete objectives or the presented design contributions are more modest; what is termed as encouragement may in fact be about providing a technological platform to interact. To clarify the terminological tapestry in this topic, Section 2 provides further conceptual background and the definitions based on which we initiated our review.

1.3 Goals and contributions of the paper

The primary goal of this work is to provide an account of proposed design solutions and prototypes in the emergent research topic of enhancing collocated social interaction with technology.

We argue that HCI, as a design-centric discipline, benefits from retrospective design critique. This helps to understand what the focus of past work was and where the design explorations failed or succeeded. We aim to identify general trends and research gaps in this topic and understand which problems have been addressed and what kind of perspectives and approaches have been taken in constructive design research. Secondly, we aim to provide an overview of the most central theoretical and conceptual work related to this inter-disciplinary topic. This helps us to unpack the concept of enhancement in this context and provide a more fine-grained vocabulary of related design objectives and approaches.

To concretize the primary goal, the objectives of our analysis were to understand and categorize:

  1. the intended context of use and user groups of the proposed prototypes (for what and for whom to design);
  2. the social design objectives or goals of the proposed prototypes (why design);
  3. the design approaches to address the goals (how to design); and
  4. the approaches to evaluate the proposed prototypes (how to study).

These objectives are based on our observations regarding a lack of clarity of the research landscape before starting the review; however, they are also viewpoints that HCI and CSCW as fields are generally interested in. Our iterative analysis process led to focusing on the questions of why and how to design as they proved to be particularly fruitful for analysis and follow-up theorization.

As for the secondary goal of outlining the theoretical landscape, Section 2 provides an overview of theories and concepts that we consider relevant for conceptualizing this topic and initiating a systematic review. However, the treatment of theory is purposefully limited by scope. The broader theoretical basis of such an interdisciplinary topic can become immensely extensive when covering social and behavioral sciences, pedagogy, management sciences, and so forth. Creating a comprehensive theoretical overview calls for more extensive writings after the boundaries of the topic have been outlined in this empirically oriented review.

This paper contributes a fresh perspective on the research area of collocated interaction, by reviewing designs that take an active role related to enhancing social interaction. As an empirical contribution, we provide an in-depth analysis of the focus areas, design objectives and approaches in the designs and follow-up evaluation studies of 92 publications. As theoretical contributions, first we synthesize a broad range of literature to conceptualize and theorize the topic, and second, we propose a hierarchical categorization and conceptualization of the various roles and design approaches related to enhancement. We believe the review helps researchers to analyze, describe and position relevant prior research, identify gaps in scientific knowledge, and design more appropriate technologies for collocated social interaction.

2 Definitions and perspectives

To provide a theoretical basis and tentative conceptualization of the research topic, we will next discuss the three aspects that we consider to primarily characterize it: collocation, social interaction, and enhancement. These serve as perspectives to help identify relevant research and analyze the gathered corpus. Furthermore, in Section 2.3 we provide an overview of earlier work that focuses on enabling rather than enhancing collocated social interaction.

2.1 Collocation and proxemics

The concept of collocation is defined by physical proximity between people. The research topic of collocated interaction thus addresses the synchronous and direct interaction between people who are in close proximity. Edward Hall’s concept of proxemics (Hall 1963) defines different levels of this issue. The public distance (approx. 4 m or more) is used for public speaking, the social distance (approx. 1–4 m) for interactions among acquaintances, the personal distance (approx. 0.5–1 m) for interactions among good friends or family, and the intimate distance (approx. 0–0.5 m) for embracing, touching or whispering. Hall further studied the culturally dependent use of space and how physical measures like distance, orientation, and posture mediate interpersonal interactions and the way they are understood. Ballendat et al. (2010) extend this theoretical space by further discussing the concept of proxemics, which refers to spatial relationships in general, and the concept of proxemic interaction, which refers to devices with fine-grained knowledge of nearby people and other devices. The authors consider proximity not only from the viewpoint of position or distance but also of orientation, movement and identity. All in all, these frameworks help design technology that can identify which devices (or people) are collocated and to which degree.

For the purposes of this review, we base our understanding of collocation on the abovementioned definitions. In addition to covering the given distances and dimensions of proxemics, we argue that collocated interaction could consider one additional type of distance, the nearby distance of approx. 10–100 m. Technology could also play a role in motivating nearby people (for example not in the line of sight of each other but nevertheless in the vicinity) to get physically closer and, possibly, initiate interaction with each other. This is motivated by the fact that our corpus contains several so-called social proximity applications that explored this distance type with Bluetooth- and Wi-Fi-based connectivity (e.g., Persson et al. 2005). Furthermore, so-called People Nearby Applications like Tinder and Happn have become popular for romantic partnering purposes, aiming to bring people together (e.g., Hsiao and Dillahunt 2017). Many such systems build on the seminal proximity matchmaking device Lovegety, introduced in Japan as early as the end of the 1990s (Iwatani 1998).
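As an illustration only, the distance bands discussed above can be sketched as a simple lookup. The thresholds are the approximate values quoted in the text; the function name `proxemic_zone`, the half-open boundary handling, the cut between public and nearby distance at 10 m, and the label ‘beyond’ for anything past 100 m are assumptions of this sketch rather than parts of Hall’s framework or of our analysis.

```python
import math
from bisect import bisect_right

# Approximate upper bounds (in metres) of each zone, as quoted in the text.
# Assumption of this sketch: the "public" zone is cut off at 10 m, where the
# additional "nearby" distance (approx. 10-100 m) is taken to begin.
_UPPER_BOUNDS_M = [0.5, 1.0, 4.0, 10.0, 100.0]
_ZONES = ["intimate", "personal", "social", "public", "nearby", "beyond"]


def proxemic_zone(distance_m: float) -> str:
    """Map an interpersonal distance in metres to a zone label.

    Zones are treated as half-open intervals, so a distance exactly on a
    boundary falls into the farther zone (e.g., 4 m counts as "public").
    """
    if math.isnan(distance_m) or distance_m < 0:
        raise ValueError("distance must be a non-negative number")
    # bisect_right counts how many upper bounds are <= distance_m,
    # which is exactly the index of the matching zone label.
    return _ZONES[bisect_right(_UPPER_BOUNDS_M, distance_m)]
```

A proximity-aware prototype could, for instance, call `proxemic_zone(2.0)` on an estimated distance to decide whether two users are within social distance; real systems would also need to account for orientation, movement and identity, as noted by Ballendat et al. (2010).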

From a language viewpoint, it is worth mentioning that there is a fair amount of confusion about the spelling of the word ‘collocated’ (Footnote 4). While this dilemma is best left to language theorists, in this paper we consider the various versions as synonyms; we use ‘collocated’ simply for consistency and do not argue for or against any of the versions.

2.2 Facets of social interaction

Our initial observations of publications on the topic hinted that the research contributions cover a variety of different types of interpersonal interaction that can be seen to come under collocated social interaction. Unfortunately, the terminology seems to be rather eclectic also in this respect: publications in the surveyed area are bristling with terms like communication, dialogue, face-to-face interaction, collaboration, cooperation, and co-creation, but typically leave their exact meanings undefined. While there are various definitions for these terms (e.g., Dillenbourg 1999), they are rarely used in the papers. We argue that understanding the richness of related concepts and phenomena with respect to social interaction is crucial when designing technology that aims to intervene in it.

Social sciences research presents frameworks that help to conceptualize collocated social interaction for CSCW and HCI. Face-to-face interaction, according to Goffman (1959), refers to the reciprocal influence of individuals upon one another’s actions when in one another’s immediate physical presence. This covers, for example, intensive conversations between people familiar with each other, small talk and mingling with strangers, and joint embodied activities that reach beyond speaking, such as physical cooperation and affectionate actions. In other words, engaging in conversation can be argued to constitute a large portion of collocated social interactions but represents only a part of the spectrum. In the context of public spaces, Ludvigsen (2005) provides a conceptualization of what could be seen as levels of social interaction: from ‘distributed focus’ to ‘shared attention’, ‘dialogue’ and finally ‘collective action’ (for example, from multiple individuals engaging in their own activities to brief face-to-face encounters like a greeting, dyadic or group-based conversations, and joint play, respectively). Such interaction can take place between various numbers of actors: e.g., in dyads (one-to-one), in groups of various size, or as one-to-many broadcasting like in public speaking.

In Behavior in Public Places, Goffman (1963) drew a line between unfocused vs. focused gatherings. In focused gatherings ‘the participants are organized so that they are maintaining among themselves a jointly sustained focus of attention’ (e.g., a conversation), while in unfocused gatherings ‘no such focus can be discerned and the various participants are pursuing separate lines of concern’ (e.g., pedestrians on a city street). Following this thread, Ludvigsen (2005) also discusses layers of rules that define the interaction and its timespan from the viewpoint of technology. According to Ludvigsen, when applying Goffman’s concepts, the occasion is considered as the social construct of an event; what we already know or should know about the conduct at a given event (e.g., behavioral norms at a rock concert). The situation then is the specific manifestation of the occasion, and everyone entering the situation is accessible to the other participants in the situation. The encounter or the face-to-face engagement is the smallest unit of social interaction that consists of a few people currently present in front of each other, focusing on a shared object or activity and defining situational norms that shape the interaction.

Overall, the breadth of social settings and the factors affecting social interaction become very extensive when considering various perspectives in social sciences. Social behavior in any of the aforementioned settings is shaped by various situational settings and practices, collective and perceived social norms (Lapinski and Rimal 2005), subtle nuances in non-verbal communication (Argyle 1972), and various structures and roles and the role strain (Goode 1960) they inflict in group situations. Technology might appropriately support some settings (e.g., giving a public speech to a large audience) but be a nuisance in others (e.g., in deep conversations in close relationships). To ground these various perspectives to HCI vocabulary, the factors constitute a key part of what is referred to as social context (Mantovani 1996; Jumisko-Pyykkö and Vainio 2012).

We initiated the review by considering ‘collocated social interaction’ as an umbrella term that refers to all kinds of purposeful interpersonal communication that takes place in close physical proximity between two or more persons, and we planned to elaborate the various facets of focused interactions and situations with a bottom-up categorization. We agree with Stromer-Galley (2004) who proposes that the interactivity between people and the interactivity between people and technology should be conceptually distinguished. Therefore, we use the shorter term ‘collocated interaction’ to refer to a broader spectrum of interactions in close proximity, both between people and technologies.

2.3 Enhancement: beyond technologies that enable

We have not found a proper definition for any specific term that covers the kind of influences that we aim to unearth in this paper. For initiating the review, we considered enhancement as an umbrella term that refers to the idea of:

technology not only enabling social interaction but taking an active role in deliberately attempting to improve its quality, value or extent.

Grudin states in his CSCW overview (2010) that, ‘digital technology is no longer confined to a support role; it is integral to many activities’. Being integral, we argue, affords an expansion of the roles and positions that technology can take, also in the context of interpersonal interaction. Benford et al. (2000) present a relevant framework by discussing different approaches to the design of shared interfaces. In their paper about children exploring the possibilities of collaborating, the authors distinguish between approaches of enabling collaboration, subtle encouragement, and enforcement of collaboration (e.g., demanding that two children synchronize their actions in order to succeed). The notions of encouragement and enforcement relate to how we considered the concept of enhancement when starting the review: with technology not only enabling or allowing interaction to take place but taking an active role in enhancing it.

To further clarify the boundaries between enabling vs. enhancing, the following provides an overview of some traditional veins of research in HCI and CSCW that we consider as technologies for enabling interaction. Many conventional technologies provide a shared focus (or ‘shared attention’ as termed by Ludvigsen (2005)), which makes it possible for people to also engage in social interaction. This is a common approach in research on the collaborative use of interactive devices, such as interactive tabletops and surfaces. For example, Hinrichs and Carpendale (2011) present a public tabletop display with multi-touch interaction to be used simultaneously by several passersby; also other form factors have been explored (e.g., interactive floors by Graves Petersen et al. (2005)). Terrenghi et al. (2009) present an overview of such systems and contribute a taxonomy to consider when designing for multi-user shared displays. The shared output could provide relevant public information to collocated users and, as a result, the information could lead to users also interacting with each other. In addition, many primarily personal devices permit shared experiences, especially for leisure purposes like gaming (e.g., Szentgyorgyi et al. 2008). Pearson et al. (2015) explore the use of smartwatches as public displays, i.e., the social use of traditionally personal devices. Chong et al. (2014) present an extensive survey of establishing ad hoc virtual connections between several personal devices, which corroborates the existence of a vast body of prior research on such enabling technologies. Overall, while social interaction might result as a side effect of several users interacting with the same device, such technology does not explicitly and intentionally aim to enhance social interaction. Various interactive technologies allow for joint experiences and the development of local communities, as envisioned by Struppek (2006) in their review.

Another relevant line of work focuses on augmenting traditionally personal user experiences with social elements like cooperation or co-play. Here, social interplay is regarded as a tool for augmenting the experience, not necessarily the main goal. For example, Esbjörnsson et al. (2004) present a prototype for making the rather solitary activity of motorcycling more social by creating opportunities for encounters and providing subtle forms of interaction with other nearby motorcyclists. The solution supports people in meeting casually by providing a shared focus on an interesting novel technology. In the paper by Szymanski et al. (2008) enrichment of experience is not only for experiential purposes but also for a pragmatic purpose of enhancing learning. Finally, so-called locative media, which mixes realities and blurs the barrier between the physical and the virtual world, is often explored in order to augment people’s experiences in real places through relevant geo-tagged information (Bilandzic and Foth 2012). Pokemon Go (Footnote 5) and Ingress (Footnote 6) are examples of publicly known manifestations of this, and recent research has indicated that such games can effectively inspire social encounters in public spaces (Paasovaara et al. 2017).

Now, moving towards enhancement, designs that we consider to represent this idea share the premise that technology could deliberately, e.g., increase, intensify, encourage, trigger, or enrich collocated social interaction in such a way that it desirably affects the interaction setting or the behavior of the people involved. Different authors seem to use different terms to refer to such active stances of technology, depending on the specific intention or context of interaction. Our intention was not to survey the use of terms from a linguistics perspective but to analyze the different kinds of roles or positions related to social enhancement that have been proposed for technology in this context. For example, the social interaction setting could be amended from strongly technology-mediated human-human interaction to more direct human-human interaction where technology plays only a minor role in the background (e.g., Soute et al. 2010).

The aims of enhancement could be said to resemble the general idea of persuasive technology (Fogg 2002), which, broadly defined, refers to technology that is designed to change what people think and do. While some similarities between designing for persuasion and enhancing collocated interaction can be identified (e.g., the use of triggers), one cannot simply reduce enhancement of social interaction to persuading to talk. In this context, persuasion might take place temporarily and in a short-term perspective; for example, technology could provide a cue as to what to talk about (e.g., the wearable display by Falk and Björk (1999)) or suggest relevant social matches between collocated strangers and hence nudge the users to spark off a new encounter. However, the intended effect might not be long-term, which is the usual aim of persuasive technology. We regard enhancement of social interaction as a different design goal than turning one into a more extroverted person.

Finally, it is worth noting that non-technical solutions and human-based actions, rather than technological ones, might seem better suited for positively intervening in many practices of social interaction. Non-technical solutions could involve, for example, the definition of social policies or regulations, the introduction of spaces better suited for social interaction, consultancy and training, or human facilitation of group activities (for example, facilitated icebreaking games). This review focuses on the technology-based prototypes that researchers from technologically oriented disciplines envision to yield positive social effects. Whether technology can meaningfully take such a role or not was a key question driving this research.

3 Review methodology

The review was an iterative process of identification, filtering, and analysis of the publications of interest. This section describes our approach to the systematic review, the inclusion criteria, and the main phases of the process.

3.1 Systematic review approach

The intended scope of publications was approached with two review methods: (1) systematically identifying relevant papers while browsing all the issues in the Journal of Computer-Supported Cooperative Work (CSCW) and all the conference proceedings of CSCW and European CSCW; (2) extending the resulting set of papers with other relevant papers that the authors were already aware of or found in other publication forums and by performing forward and backward citation analysis of key publications. Considering our primary goal, CSCW conferences and journals can be considered as central fora for constructive design research about information technology in social interaction. While communities in, e.g., social and behavioral sciences present relevant theoretical and ethnographical contributions, such disciplines rarely propose technological solutions or construct prototypes. Furthermore, as this paper is intended particularly for the CSCW community, we considered it important to focus on design solutions that match the quality criteria and methodological traditions of the community. This called for systematically going through the publications from the past three decades in the selected fora. However, due to the multi-disciplinarity and diversity in publishing practices, many relevant publications were known to be published in various other outlets (e.g., other HCI related conferences and journals), which implied extending the scope of our search.

It is noteworthy that we initially attempted keyword-based search as an alternative approach. However, if the vocabulary related to the target research topic is not well established, defining a valid list of search terms becomes very challenging. After failing to identify a comprehensive yet manageable list of keywords from 30 relevant publications familiar to the authors prior to the review, we concluded that a keyword-based search would not only omit a number of relevant publications but also produce an unmanageable amount of noise in the literature corpus. Moreover, earlier reviews have implied that keyword-based searches in multidisciplinary fields like HCI and CSCW are methodologically very challenging (e.g., Väänänen-Vainio-Mattila et al. 2015). In contrast, we argue that the review process that we employed produced a sufficiently extensive list of relevant publications to permit the analysis we were aiming at.

3.2 Review process

The whole review process was conducted in close collaboration by the six authors. We utilized researcher triangulation throughout the process: all the papers included in the final corpus were read through by at least two, often three, of us. This increased the accuracy of the inclusion/exclusion process as well as helped us avoid situations where someone would judge the relevance of their own papers. All the papers were tabulated in a shared spreadsheet, which helped with the accumulation of details and analysis throughout the process and allowed transparency regarding which papers were to be included or excluded.
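As an illustration only, the triangulation rule described above can be expressed as a simple check over rows of the shared spreadsheet. The sketch below is not the tooling used in the review: the record fields, the function name, and the treatment of "at least two readers who are not authors of the paper" as a hard rule are assumptions made for the example (Python 3.9 or later).

```python
from dataclasses import dataclass, field


@dataclass
class PaperRecord:
    """One row of the shared spreadsheet (field names are hypothetical)."""
    title: str
    authors: set[str]
    readers: set[str] = field(default_factory=set)


def triangulation_issues(record: PaperRecord, min_readers: int = 2) -> list[str]:
    """Return the reasons why a record does not yet satisfy the assumed rule."""
    issues = []
    # Readers who are also authors of the paper are not counted as independent.
    independent = record.readers - record.authors
    if len(independent) < min_readers:
        missing = min_readers - len(independent)
        issues.append(f"needs {missing} more independent reader(s)")
    # Flag any reviewer who judged the relevance of their own paper.
    conflicted = record.readers & record.authors
    if conflicted:
        issues.append("read by own author(s): " + ", ".join(sorted(conflicted)))
    return issues


# Hypothetical records; names are placeholders, not the actual reviewers.
ok_record = PaperRecord("Paper A", authors={"X"}, readers={"R1", "R2"})
bad_record = PaperRecord("Paper B", authors={"R1"}, readers={"R1", "R2"})
```

An empty list means the record passes; any returned message marks a row that would need another reader before an inclusion decision.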

3.2.1 Inclusion criteria

We defined the following criteria for the target scope of the publication corpus, based on the conceptualization presented in Section 2:

  1. The paper was to be peer-reviewed, covering journal papers, full or short papers in conferences, and books or book chapters (excluding, e.g., contributions to workshops and poster papers).

  2. The focus was to be on collocated social interaction. We included papers about bringing nearby people together into a face-to-face encounter. In other words, the work could not address primarily remote interaction.

  3. The type of technology or intervention was to be about information technology. This implies that physical artifacts like board games and furniture, as well as non-technological services or art, were excluded. The form factor, technical features or interaction techniques could be of any kind.

  4. The paper was to propose one or several solutions: design concepts, mockups, prototypes or fully functional systems. Rare examples of ethnographical and theoretical work on this topic are covered in Sections 1 and 2.

  5. Enhancing social interaction was required to be a focus of the paper. It was to incorporate a deliberate intention to enhance collocated interaction, regardless of the terminology used. Following our rationale in Section 2.2, all kinds of social interactions and encounters were considered.

The first four requirements could be relatively easily assessed by reading the papers. The last one left room for subjective interpretation, which we addressed through iterative discussion. As for the various facets of enhancement, we included all papers that we agreed matched our initial definition of enhancing collocated interaction.

3.2.2 Accumulation of the corpus

The selection process consisted of three main phases, after which a detailed analysis of the resulting papers was conducted, as described in Figure 1. The initial inclusion was carried out by browsing the titles, abstracts and keywords of the conference proceedings and journal issues of:

  • The Journal of CSCW (until Vol. 26, Issue 3, June 2017): Approx. 460 publications

  • CSCW (1986–2017): Approx. 1450 publications

  • ECSCW (1989–2017): Approx. 330 publications

Fig. 1 Overview of the selection process of the corpus, aiming to identify papers that present a solution (prototype) for enhancing collocated social interaction. The numbers in the early phases of the process are rounded off to the closest ten.

We reviewed the title and abstract to define the initially perceived relevance on four levels (very relevant, relevant, somewhat relevant, irrelevant). This phase was carried out by four researchers in October 2015, and the more recent proceedings and issues were reviewed in June 2016 and in December 2017 while revising the manuscript, resulting in approximately 110 papers. In the first exclusion round, each paper was read in more detail by another researcher to reassess its relevance. We included 39% of the listed publications for further analysis. The relatively low inclusion rate shows the importance of not judging papers solely on the basis of the title, abstract and keywords. The majority of the excluded papers were, in contrast to our first impression, either about remote interaction, or the content of the paper did not address social interaction but, for example, technology exploration. For example, Kelly et al. (2017) conducted an ethnography about the types of effort people appreciate from significant others in direct face-to-face communication. While the ethnography could have inspired artefacts supporting collocated interaction, their design guidance and technological visions focus on remote communication.

The resulting list of papers was limited, and many relevant papers that we already knew of were missing because they had been published outside the CSCW community. To enrich the corpus, we continued by opportunistically listing relevant papers we were already familiar with (e.g., having cited them, authored them, or otherwise encountered them during previous research activities and when planning for the review). Additionally, for approximately ten papers that were considered highly relevant, we performed an opportunistic backward and forward citation analysis (i.e., browsing the papers they had cited and those that had cited the paper in question). However, this did not prove to be particularly fruitful because most of the papers in the corpus are fairly recent and had thus not been cited much, and because many papers cited largely the same prior work. All in all, the collective opportunistic phase of the review effort resulted in 120 additional papers. This phase was carried out in February–June 2016 by six researchers and completed in December 2017 while preparing the final manuscript.
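The backward and forward citation analysis mentioned above can be sketched in a few lines. The example assumes that citation data is available as a simple mapping from each paper to the papers it cites; the function names, the identifiers and the toy graph are hypothetical and do not reflect how the browsing was actually carried out in the review.

```python
def snowball(seed: str, cites: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return backward and forward citation candidates for one seed paper.

    `cites` maps a paper identifier to the identifiers of the papers it cites.
    """
    # Backward: papers that the seed itself cites.
    backward = set(cites.get(seed, set()))
    # Forward: papers whose reference lists contain the seed.
    forward = {paper for paper, refs in cites.items() if seed in refs}
    return {"backward": backward, "forward": forward}


def new_candidates(seeds: list[str], cites: dict[str, set[str]],
                   known: set[str]) -> set[str]:
    """Collect candidates from all seeds, dropping seeds and already known papers."""
    found: set[str] = set()
    for seed in seeds:
        result = snowball(seed, cites)
        found |= result["backward"] | result["forward"]
    return found - known - set(seeds)


# Toy citation graph with invented identifiers.
toy_cites = {
    "seed": {"a", "b"},
    "c": {"seed", "a"},
    "d": {"b"},
}
candidates = new_candidates(["seed"], toy_cites, known={"a"})
```

When several seeds cite largely the same prior work, as was observed here, the set difference removes the duplicates, so each additional seed contributes few new candidates.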

3.2.3 Analysis of the corpus

Finally, to analyze the resulting list of papers with respect to our research questions (second exclusion round), the papers were read thoroughly to reassess their relevance. Approximately 10% of the papers proved not to be relevant after all, after which the corpus consisted of 92 papers altogether. Furthermore, while reading the papers, the analysis scheme was refined. We started by defining a tentative analysis scheme based on the research questions and insights gathered so far. To test and refine the scheme and reconsider which aspects can actually be identified from the papers, we first analyzed a subset of 35 randomly selected papers. This was followed by a systematic analysis of all the remaining papers, which took place in April–December 2016 and was slightly revised based on the reviewers’ comments in December 2017.

The theories and concepts presented in Section 2 (e.g., Hall’s proximity distances or Ludvigsen’s levels of interaction) were originally considered to provide relevant frameworks for top-down analysis with predefined categories. However, the aforementioned analysis of 35 randomly selected papers indicated that such aspects could not be reliably inferred from most of the papers. Therefore, the analysis scheme as well as the categories in the analysis were formed based on bottom-up identification of common themes relevant to our research questions. The perspectives in the final analysis scheme include focus areas (context of use and user group), types of utilized technology, social design objectives, design approaches, and evaluation approaches. Considering for example the design objectives, the papers were not categorized according to what kind of design goals or objectives had been defined by the authors but according to a set of themes, ‘a vocabulary’, that emerged in the process. While no particular theories were used in forming the categories, our thinking was naturally affected by the extensive literature outlined in the previous sections. For all the perspectives, each paper was allocated to one or several categories, and the resulting labels were quantified across the publication corpus. In some cases, we could identify a paper to fit into several categories; for example, the target user group of a paper could be students on the one hand and event participants on the other.
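The quantification step described above, in which a paper may carry one or several labels within a perspective, amounts to a multi-label tally. The following is a minimal sketch under that reading; the paper identifiers, the label values and the function name are invented for illustration, and the actual analysis was done in a shared spreadsheet rather than with this code.

```python
from collections import Counter


def tally_labels(labels_by_paper: dict[str, list[str]]) -> Counter:
    """Count, per category, the number of papers carrying that label."""
    counts: Counter = Counter()
    for labels in labels_by_paper.values():
        # A paper counts at most once per category, even if a label is repeated.
        counts.update(set(labels))
    return counts


# Hypothetical mini-corpus for the "target user group" perspective.
user_groups = {
    "paper_a": ["students", "event participants"],
    "paper_b": ["anyone"],
    "paper_c": ["students"],
}

counts = tally_labels(user_groups)
total_labels = sum(counts.values())
for category, n in counts.most_common():
    print(f"{category}: {n}/{len(user_groups)}")
```

Because a paper can fall into several categories, the per-category counts can add up to more than the number of papers, so the shares reported for a perspective need not sum to 100%.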

4 Overview of the corpus and focus areas

Altogether 92 papers constitute the publication corpus analyzed in this review. The corpus is published as an open Mendeley library (see footnote 7).

4.1 Bibliographical overview

The corpus portrays a broad spectrum of contributions over the last two decades. Looking closer at the temporal distribution of the works (see Figure 2), it can be observed that our review includes a few works from the early days of the HCI field, a considerable number of papers from the years 2005–2013 and a large amount of recent work. This confirms our observation of the recent increased interest in collocated interaction in general and particularly in the aspect of enhancement.

Fig. 2 Timeline of the publication years of the papers in the corpus. Note that 2016 and 2017 are not fully covered by the review methodology, as reported in Section 3.

The relatively high number of publications indicates that structuring the knowledge in the field is indeed required. First, we investigate in which publication venues the corpus papers were published.

Table 1 presents the distribution of publication venues in the corpus. The CSCW conference is most prominently featured in our data set (partly due to the review methodology) with the CHI conference not far behind. This shows that collocated interaction is receiving attention in the core fora of the HCI area. At the same time, the distribution implies that there are no clearly preferred outlets for publishing work on collocated interaction. Many papers were spread among the proceedings of various specialized conferences (e.g., ACM GROUP; Tangible, Embedded, and Embodied Interaction (TEI); Mobile and Ubiquitous Multimedia (MUM); Ubicomp), meaning that the topic is approached by many disciplines and communities. Interestingly, the share of MobileHCI papers is rather high considering the comparably low publication volume of the conference series. However, we also note that the authors of the review are active in that community, which most likely produced a bias.

Table 1 The distribution of the publication venues of papers in the corpus.

Considering the types of contribution that the papers offer, by design of the review, the majority of the papers present one or several prototypes and their evaluation. However, the corpus varies greatly with respect to the level of detail of the description and the level of fidelity of the presented solution. The presented prototypes span from fully-fledged systems (e.g., Bluetooth-based systems like DigiDress (Persson et al. 2005)) to futuristic ideas on a conceptual level (e.g., Mireya Silva et al. 2015). For simplicity, in the following we refer to all of these as prototypes.

Interestingly, only a few papers included some sort of ethnography or study of current interaction practices or user needs in collocated social settings; most solutions are hence not guided by the authors’ own empirical research. We argue this is well in line with the share of different research contributions generally in HCI. Examples of insightful ethnographies that focus on this topic and do not rush to propose specific solutions are the work by Mayer et al. (2015) and Kytö and McGookin (2017) (not included in the analyzed corpus). The former explores how contextual information available on today’s mobile phones could be used to identify opportunities for people to make valuable new connections. The latter studied how people would create so-called Digital Selfs and present them through augmented reality interfaces in multi-party interactions. In addition to the detailed analysis of people’s perceptions and expectations, they present guidance for the design of future multi-party digital augmentations. However, while such studies would direct towards more targeted and well-justified designs, listing all relevant ethnographical research is not the focus of this paper.

Additionally, although collocated interaction could be seen as a wicked problem, we identified only a few papers that explicitly employ design-based research, which aims to produce knowledge through working on complex design problems in non-reductive ways (Bardzell et al. 2015). Notably, Taylor et al. (2007) conducted co-design workshops with families to understand the design of family photo displays. Reflecting on their process, they were able to unpack the design constraints involved in designing situated artefacts for the home. Another observation is that a number of papers provide only technical descriptions of systems or unimplemented concepts of future interaction. For example, Paay and Kjeldskov (2008) conducted extensive experience studies, but their design contribution consisted only of initial paper prototypes. These two observations imply that developing technical artefacts that aim to affect social interaction is challenging both in terms of technological enablers and as a design challenge.

4.2 Focus areas

This section reports the focus of the work presented in the papers, covering contexts of use and target user groups. We describe the trends in the corpus and offer examples of papers representative of our findings.

4.2.1 Contexts of use

The papers were categorized according to the context of use (or application area, as termed by some papers) of the proposed prototypes. It is noteworthy that most papers in the corpus do not specify a detailed focus context but, rather, many prototypes are intended as generic services to cater for a variety of physical, social and activity contexts. In fact, in more than a quarter (28/92) of the cases there was no specific context defined for the use of the prototype, i.e., it could be used in any context where encounters between people are feasible (e.g., Choi et al. 2011; Ko et al. 2016).

Public space (18/92), i.e., anywhere outdoors or indoors where people have free access, was the second most common category. Public spaces may enable different levels of sociability, ranging from passive to active engagement. This can be done, for example, using public displays, as in the work of Memarovic et al. (2012b), or interactive installations (e.g., Balestrini et al. 2016). As part of social context, such systems attempt to attract people to act publicly, in which case the sociability becomes visible. On the other hand, mobile applications used in public spaces may allow less explicit social activities such as sharing and augmenting places in a city (Paay and Kjeldskov 2008). Even though people are often uninterested in social activity in public spaces, these contexts can be rich with people, and researchers have seen opportunities to explore new forms of collocated social interaction.

The next two most frequent contexts of use were workplace (12) and classroom or university (14). Prototypes for the work context often relate to knowledge sharing (e.g., Mencarini et al. 2012) or collaboration (e.g., Bødker and Christiansen 2006). Educational contexts may be bound to a shared display in a physical classroom (e.g., Dickey-Kurdziolek et al. 2010) or they may be used together with a mobile device, allowing a more ubiquitous support for collocated interactions in education (e.g., Kreitmayer et al. 2013). Furthermore, within these 26 examples of corporate or educational contexts there were six prototypes meant for organized events; academic conferences in particular have often been used as a target context (e.g., Chen and Abouzied 2016).

It is noteworthy that educational and working contexts are likely to generate opportune moments for social interactions but they are also quite familiar for researchers, which may increase their prevalence in the corpus. Nevertheless, for educational and corporate contexts, there is often a systemic interest to enhance interaction; while the pupils/employees might not be concerned, the teachers/managers might consider enhanced social interaction as an organizational goal that leads to better results.

In addition to the larger categories, there is a long tail of various other target contexts. These include leisure events such as music festivals and parties (e.g., Jarusriboonchai et al. 2015a; Seeburger et al. 2012); exhibitions and installations (e.g., Aoki et al. 2002); the home (e.g., Ballagas et al. 2013); specific task or activity contexts such as racing (e.g., Woźniak et al. 2015) or photography (Durrant et al. 2011); and specific types of outdoor places such as playgrounds (e.g., Soute et al. 2010). This variety illustrates the extent of the possible foci and design spaces in this research topic: collocated social interaction is truly ubiquitous.

4.2.2 User groups

Following the trend of generic solutions, in half of the works in the corpus (46/92) the target user group could be considered as anyone. While many of these papers did not explicate the target user group, it could be presumed from the prototype description that the usage was open to many types of end users (e.g., Nguyen et al. 2015; Lucero et al. 2011). In cases where there was no specified target group, it may have been that the researchers believed that the system could be used by anyone, or that they had not thought of the target group as part of their design considerations.

The second most mentioned user group (14/92) was event participants or visitors. Such events were most often conferences (Borovoy et al. 1998), but also other events around specific activities such as sports, e.g., running (Mauriello et al. 2014). This relatively large number of papers addressing events may be attributed to the fact that when people go to an event, they are connected by similar interests and may be open for new contacts or increased social interaction.

Students and office workers were mentioned 13 and 10 times, respectively, as the target user group for the developed solutions, which is in line with the frequency of workplaces and classrooms in the analysis of contexts of use. For example, Alavi and Dillenbourg (2012) developed a prototype for students and their tutors to improve their teamwork by making activity information visible to each other. Grasso and Meunier (2002) propose that office workers start talking about mutual interests after meeting each other around printers and seeing each other’s print jobs. These target groups may have similar motivations to connect face-to-face as event visitors, that is, common interests and activities. Also, these target groups may be most familiar and accessible to researchers, which might partly explain their high presence in the corpus. Other, less frequent, target user groups include groups focusing on special activities (e.g., coffee drinkers), or demographic groups such as children, families and elderly people.

4.3 Types of technology

The most common type of technology utilized in the proposed prototypes is off-the-shelf mobile devices (45/92). For example, Pass-them-around is a mobile application for photo sharing in a collocated group, using a wireless personal area network for sending messages between devices, an accelerometer for detecting interactions of one user, and radio tracking to detect other users’ positions (Lucero et al. 2011). ‘Who’s next?’ is an ice-breaking game that connects collocated players’ mobile phones as a group with Wi-Fi Direct, one device acting as a server and the others as clients (Jarusriboonchai et al. 2016a). Before smartphones, personal digital assistants (PDAs) were used as a platform for prototypes. Pac-Man Must Die! is an early example of a collaborative game for PDAs. The players can freely join and leave a game session set up over a wireless ad hoc peer-to-peer network (Sanneblad and Holmquist 2004). MultiDraw explores the potential of a single tablet in creative collaboration for a family context (Yuill et al. 2013). All in all, the personal nature, ubiquity and device-to-device connectivity technologies, especially Bluetooth, have rendered off-the-shelf mobile devices an optimal platform for experimenting and prototyping.

Another large category identified was interactive installations (27/92), referring to physically large interaction areas or installations, particularly for semi-public or public spaces. FunSquare was an application installed on a city-wide array of public displays (Memarovic et al. 2012b). iFloor is an interactive floor installation for a public library aiming to create collaboration between collocated people. It projects a display on the floor with a ceiling-mounted projector, tracks people’s positions and movements from a web-camera feed to decide how the collaboratively controlled cursor should move, and further allows people to interact with others by sending SMS and e-mails to be projected on the floor (Krogh et al. 2004). Mood Squeezer consists of custom-made Squeeze Boxes, public input devices with six squeezable balls of different colors, digital floor displays and a web page. An aggregate output of all the squeezes by different users was mirrored on a custom-made floor display in the office building where Mood Squeezer was set up (Gallacher et al. 2015).

Third, wearables (11/92) include mostly custom-made devices with a focus on wearable displays. BubbleBadge is a wearable display created by detaching the display of a handheld game console, encasing it in a brooch-like frame and reconnecting it to the console (Falk and Björk 1999). CommonTies is a wristband where the display is simplified to a single LED that lights up when a Bluetooth-connected base station detects the presence of a matching profile within its operating range. In addition, commercial wearable devices such as smartwatches and Google Glass (Nguyen et al. 2015) have been utilized.

Fourth, desktop computers (9/92) have been utilized especially in solutions for work and education. StudioBRIDGE is a desktop application that extends an ordinary instant messaging application with information about the other students’ presence in the common study premises by locating them with the help of wireless signal strengths (Yee and Park 2005). Desktop computers have also served as essential parts of prototypes of novel collaborative input devices. Collective controllers is a set of two joysticks controlling the same computer application, where each user gets force feedback about the interactions of the other user (Graves Petersen et al. 2010).

Finally, 11 papers demonstrate strongly customized input and output devices. For example, Musical Embrace is a game controller in the form of a pillow designed to be hugged by two users simultaneously. Inside the pillow, there is a Wii Balance Board detecting the pressure created by the users hugging the controller collaboratively (Huggard et al. 2013). Jokebox is a set of two wirelessly interconnected installations that invite people to push a button simultaneously on both devices to hear a joke as a reward (Balestrini et al. 2016). Finally, WAKEY is a system for children and their parents, consisting of a tablet computer app and interactive toys and tags (Chan et al. 2017).

5 Social objectives and design and evaluation approaches

Many of the papers in the corpus could be considered as technology explorations rather than development of solutions to well-defined problems. While some papers explicitly state detailed socially-oriented issues to solve, some do not explicate which social problems are addressed, or the definitions remain vague, or the defined problems primarily relate to other than social aspects. In some papers, the underlying problems are communicated only implicitly, requiring subjective interpretation from the analyzers. Consequently, instead of trying to dissect the problem space in the corpus, the following analysis focuses on the design space from two perspectives: understanding the explicit or implicit design objectives regarding social enhancement, and understanding the design approaches used to address those objectives.

5.1 Social design objectives

Based on our bottom-up analysis, we found that the various solutions address a broad spectrum of socially oriented design objectives. For example, many solutions aim to increase awareness of other people in one’s surroundings (e.g., Falk and Björk 1999), which is fundamentally about influencing a social setting. To categorize each presented prototype, we interpreted them in relation to other solutions in the corpus, rather than labeling them based on the terms that the authors chose to use. We aimed to develop a categorization that presents concrete, specific, attainable and measurable design objectives. We believe that the analysis of objectives can inform and guide design activities in the future as well as shed light on what the very concept of enhancement may cover in this context.

Table 2 provides an overview of the identified design objective categories, the number of papers in each category, and examples of representative papers. The presentation order introduces a rough continuum (top-bottom) from moderate manifestations of enhancement towards more authoritative ones.

Table 2 Overview of the social design objectives drawn from the prototypes, ordered according to an approximate continuum from moderate manifestations of enhancement (top) towards more active and forceful forms of enhancement (bottom).

It is noteworthy that some papers fall into several categories; the 92 papers were labeled altogether 153 times. Some papers present several solutions, and some solutions can be interpreted as aiming at several objectives. For example, the solution by Reitmaier et al. (2013) uses Bluetooth to communicate profiles of co-workers in a large organization. This was envisioned to help (1) gain a better understanding of the organization and nearby colleagues, (2) identify common topics of discussion between unfamiliar colleagues, and (3) enhance ongoing conversations with new topics. When giving examples of the various solutions, we discuss them in relation to the primary objective we extracted from the paper.

5.1.1 Facilitating ongoing social situations

Facilitating ongoing social situations represents prototypes that support and nurture ongoing interactions, often focused on conversations. Much of such work is tied to professional collaboration and activities related to sharing tasks and knowledge (e.g., educational context, such as in Dickey-Kurdziolek et al. (2010)). In other words, this category partly represents a long tradition of CSCW research with an instrumental aim to ensure sufficient engagement or productivity in work contexts (e.g., Grudin and Poltrock 2003). However, the systems included in this category demonstrate a deliberate intention to improve ongoing encounters rather than merely making them possible. For example, Bergstrom and Karahalios (2007) focus on so-called social mirroring systems. Their ‘Conversation Clock’ aims to increase awareness of who is talking and when in a group, helping them to reflect on and balance the conversation dynamics. While this is also about increasing awareness (cf. subsection 5.1.5), the defined objectives underline facilitation of group conversations. Another paper with a similar system and design objective is by DiMicco et al. (2007).

Whereas the majority of the prototypes focus on professional or educational collaboration, leisure interactions and co-experiencing are also addressed. For example, Fjeld et al. (2015) propose concepts for future tabletop technology to better support deliberation in public encounters. They envision tabletops that could encourage collaboration and engage users in socially relevant activities, such as political deliberation and civic participation. WAKEY (Chan et al. 2017) focuses on collaboration in domestic contexts, aiming to enhance, rather than replace, the parent-child communication and their intimate relationship. The system is intended to help parents think about the words they use to communicate with their children, giving them the opportunity to adjust their attitude when teaching the children about rules and manners.

5.1.2 Enriching means of social interaction

Enriching means of social interaction refers to adding new elements or channels into collocated interaction. This is close to the previous category, yet conceptually different: we consider a difference between prototypes that facilitate ongoing interaction practices and use of existing channels, and those that augment the possibilities or means for more intensive interaction. For example, Harry et al. (2012) aim to promote diverse participation and increase engagement in educational contexts. Their relatively simple text-based tool introduces a digital communication channel that enriches the communication space with a computer-mediated channel in collocated learning environments. In fact, providing such hybrid spaces and digital backchannels among school pupils is a common target for enriching means of participation (e.g., Du et al. 2012; Nelimarkka et al. 2014). Ballagas et al. (2013) provide an example of supporting shared experiences and promoting joint media engagement by amplifying traditional educational TV with cooperative augmented reality. The authors aim to explore the possibilities of technology supporting joint media engagement. Piper et al. (2013) present an interactive photo album for collocated photo sharing activities using audio and a digital pen. The prototype is meant particularly for senior citizens and, consequently, it enriches reminiscing and exchange of memories between generations. Finally, while Gugenheimer et al. (2017) do not propose any specific solutions, they study aids for sensory limitations in order to enable conversations between differently abled individuals, such as deaf and hearing people.

5.1.3 Supporting sense of community

Supporting sense of community refers to prototypes that focus on communities rather than small groups or dyadic relationships, and aim to foster community spirit and cohesion. For example, McCarthy et al. (2004) introduce what the authors call ‘proactive displays’ to augment the social space of an academic conference. Their prototype aims to enhance the feeling of community and support social practices at coffee breaks. While the prototype can also be seen both as a facilitator of ongoing social activities and as something that enriches them (cf. 5.1.1 and 5.1.2), its main intention is to enhance the social experience on a community level. StudioBRIDGE (Yee and Park 2005) is an awareness system based on instant messaging and developed for students working in open studio spaces. While it fundamentally increases the students’ awareness of nearby people, groups and events, the defined design goals are strongly related to strengthening collaboration and a sense of community as well as initiating both online and offline interactions. Finally, Woźniak et al. (2015) explore the boundaries of the remote and the collocated by presenting a prototype for an ambient runner support system that enables on-site supporters to send three types of signals to runners during a race. Runners can send signals back to supporters. This way, the system extends the social support provided by the runners’ friends and families into the race day and supports community interaction on the site of a running race.

5.1.4 Breaking ice in new encounters

The term ‘icebreakers’ generally refers to tools that relieve tension, alleviate social awkwardness and support people in social skills, particularly in situations where new people gather together to start collaboration (West 1999). However, icebreaking can also be needed when socially oriented people meet for the first time, to help identify topics for discussion. The corpus features several examples of prototypes that take the role of an icebreaker. Jarusriboonchai et al. (2016a) report on Who’s Next, a mobile multiplayer quiz game that is particularly targeted to serve as an icebreaker. Based on user trials in authentic situations that would benefit from icebreakers, the authors conclude that technology can successfully facilitate the process of introductions in group gatherings and create a relaxed atmosphere between strangers. Kan et al. (2015) envision Social Textiles that reveal commonalities between two wearers of a similar shirt. While this kind of prototype also falls into other categories, the authors stress their aim to support community organizers’ work in facilitating encounters between individuals unfamiliar to each other. FishPong (Yoon et al. 2004) is an interactive tabletop enabling multi-user interaction; however, the authors argue that it was designed as an icebreaker system that would encourage spontaneous social interaction in a coffeehouse.

5.1.5 Increasing awareness

The largest group of papers (37/92) focuses on increasing awareness of other people, their interests, or their actions. Awareness is a fundamental concept in CSCW and refers to knowing who is in one’s proximity, what activities are occurring, and what characteristics nearby people have considering the social opportunities they might provide; for recent conceptualizations and reviews, see Vyas et al. (2015), Tenenberg et al. (2016) and Lopez and Guerrero (2017). Many papers argue that revealing details about another person and the increased awareness it provides can result in better understanding and appreciation between people as well as motivate people to initiate interaction with each other. In other words, the prototypes provide social opportunities and tickets to talk (Sacks 1992) for people to possibly pounce on. For example, the BubbleBadge (Falk and Björk 1999) is one of the first explorations of a wearable display that publicly presents information about the wearer or the other user. This was envisioned to not only provide a new medium for self-presentation but also invite people to approach the wearer. Break-Time Barometer (Kirkham et al. 2013) is an awareness system that focuses on bringing a distributed group closer together and increasing joint informal face-to-face breaks.

In this category, most solutions seem to focus on self-disclosure (providing information about a person), rather than inviting others to interact (providing information for another person). As an example of the latter, Jarusriboonchai et al. (2016b) start from the premise that the proliferation of personal mobile device UIs has resulted in a situation where those in our immediate surroundings have little information about our activities. This, in turn, has weakened the opportunities for shared experiences around digital activities. The presented Social Display uses a simple visual cue to automatically hint to collocated others what the user is doing with their mobile phone. This was envisioned as a way to invite others to ask about the user’s actions. Similarly, MugShots (Kao and Schmandt 2015) is a coffee mug with an OLED display for photos and other visual content. It serves as an intimate communication device and can switch between public and private social interaction modes. While the authors designed the device to function as a social catalyst (Karahalios and Donath 2004) to trigger conversation when used in public or semi-public areas like office spaces, it serves as a new medium to present oneself, and thus provides a social affordance for other people.

5.1.6 Avoiding cocooning in social silos

Closely related to the increasing awareness category, seven papers aimed to help people avoid cocooning in social silos, that is, becoming isolated from collocated people, as problematized in the introduction. For example, Lock n’ LoL (Ko et al. 2016) aims to improve the quality of collocated social interactions by encouraging people to limit their smartphone use. It is described as a ‘mobile app that helps users to collaboratively manage their smartphone usage by providing synchronous social awareness’. While the previously mentioned Social Display (Jarusriboonchai et al. 2016b) increases the collocated people’s awareness of what a mobile phone user is doing, it can also be seen as something that helps avoid situations where a person engaged with an activity on a mobile device creates a private ‘cocoon’ or ‘mobile bubble’ (Lundgren and Torgersson 2013) in the ongoing social situation. Similarly, Aoki et al. (2002) and Gallacher et al. (2015) define their design problem as people being isolated from others. Gallacher et al. (ibid.) therefore designed a prototype to support sharing one’s current mood with nearby co-workers, which is fundamentally about increasing awareness of collocated others. In summary, avoiding cocooning approaches the lack of interaction by addressing the problem of becoming isolated, while increasing awareness approaches the same issue from the perspective of providing an opportunity to socialize. The slight difference is in the mindset: solving a problem vs. providing new social opportunities.

5.1.7 Revealing common ground

Some solutions take increasing awareness to the next level by revealing mutual interests, connections or knowledge, or other shared characteristics, effectively referring to the concept of common ground by Clark and Brennan (1991). This is particularly important in encounters with strangers: people form an understanding of each other’s interests and profiles through conversation and possibly disclosure of various personal details. Quickly identifying common ground helps build trust and sustain the conversation. As one of the first contributions, Eagle and Pentland (2005) presented their visions of systems that utilize Bluetooth and a database of user profiles to cue informal interactions between nearby users who do not know each other. Similarity between the user profiles would be measured to increase serendipity. CueSense (Jarusriboonchai et al. 2015a) is a more recent example of the same: the social media content on the wearable displays of two users is compared and only commonalities are shown when the users encounter each other. The authors considered this to provide a more mutually relevant ticket-to-talk (e.g., opening lines) than those based on only the other user’s profile. Compared to the prototypes about increasing awareness, this category emphasizes the importance of matchmaking: identifying mutually interesting or relevant content or commonalities between users.

5.1.8 Engaging people in collective activity

This objective is about engaging people in a joint activity that requires collaboration by several actors to be successful. It is about reaching a common goal and being positively interdependent on each other in the activity. In many papers we could identify a premise that the collective activity would lead to positive social encounters and in-depth social interaction alongside the activity. Many games in particular are good representatives of this category. In fact, Sanneblad and Holmquist (2004) call such games ‘collaborative games’ since several players are required to collaborate in order to succeed in the game. They present an overview paper about several game concepts that are among the first examples of this approach: the need to collaborate is based on a limited number of output devices (mobile phones) or having several input/output devices to control simultaneously. Balestrini et al. (2016) and Yoon et al. (2004) consider play as collective activity but also as a potential icebreaker between unfamiliar people. JokeBox (Balestrini et al. 2016) encourages people to collaborate in a public place, requiring coordination and eye contact, with a joke as the reward. Krogh et al. (2004) present a floor display, iFloor, that reacts to the surrounding people and serves as a conversational icebreaker to stimulate discussion between the people around it. The content consists of questions and answers given by other visitors in the library where the display is situated. The design provides a new collocated medium for collective activity, and puts emphasis on its content and capability to actually trigger interactions.

5.1.9 Encouraging, incentivizing or triggering people to interact

This category is conceptually related to persuasive technologies (Fogg 2002) that aim to provide the user with motivation or incentive to socialize face-to-face. Such solutions provide reasons and motivations for interaction, rather than merely opportunities. Chen and Abouzied (2016) talk about gently nudging people to catalyze and sustain face-to-face interactions at conferences, rather than merely increasing awareness or providing user profiles. Their system, CommonTies, is a wearable accessory that does not expose any profile information but retains elements of ambiguity and mystery, which motivates users to explore through conversation the commonality that the system has identified. Soute et al. (2010) present another gentle way to encourage interaction by emphasizing the importance of natural and rich interaction. They discuss Heads-Up Games as a category of Pervasive Games that aim to encourage traditional face-to-face interaction rather than interaction mediated by information devices. Paasovaara et al. (2016) present Next2you, a proximity-based social mobile application that utilizes gamification, progressive disclosure of profile information and light-weight CMC interactions to encourage face-to-face interaction between familiar strangers (Milgram 1972). Unipad by Kreitmayer et al. (2013) aims to improve peer discussion and teacher involvement in the classroom. While the prototype is also about enriching classroom interaction (cf. 5.1.2), it also serves as an encouragement to interact in new and more engaging ways. Finally, Jarusriboonchai et al. (2014a) present a simulation study of triggering a conversation between strangers with mobile devices that seem to be able to talk and, hence, possibly draw the participants into the conversation. This is one of the most forceful approaches in the corpus: it attributes technology with strongly social and autonomous characteristics and tries to mimic human behavior.

5.2 Design approaches

The diversity of design objectives led us to investigate whether the approaches for addressing the objectives are equally diverse. This section analyzes the corpus with respect to the design approaches, design concepts, or ways of thinking identifiable in the papers. We contribute a categorization of high-level design approaches, related to which we present more concrete ones. Of the 92 papers, 70 were labeled in seven main categories. Thirteen papers represent approaches that are unique in this corpus, and thus were categorized as miscellaneous. Additionally, nine papers remain uncategorizable, mainly because of the presented prototype’s low level of maturity. Table 3 provides an overview of the categories, the number of papers fitting in each category, and a few examples of each. The following subsections present one or several concrete design approaches or concepts for each category based on the themes interpreted from the papers.

Table 3 Approaches in designing interactive technology for collocated social interaction, with examples of papers or prototypes that have adopted each approach.

5.2.1 Shared digital workspace

This category is represented by two approaches that share characteristics but are slightly differently implemented. First, some papers in this category present multi-user surfaces like whiteboards, interactive paper, or tabletops to serve as shared interactive displays that provide a digital workspace. A shared workspace provides a rich resource for collocated social interaction within a group, contributing to awareness of others’ actions as well as concurrent interaction (Tang 1991). Accordingly, a shared display is one of the most common forms to employ ICT in collocated social interactions. For example, Stewart et al. (1999) introduced the concept of Single Display Groupware (SDG), which allows users to interact on a shared computer with multiple, simultaneous and equivalent input channels. TellTable (Cao et al. 2010) and Flashlight Jigsaw (Cao et al. 2008) are examples of interactive shared displays based on using a large screen as an interactive workspace.

Mobile phones are also used to implement the concept of shared workspace, even though their small size naturally limits the capacity for collective use, both in terms of input and output. Therefore, the idea of each user having their own device is often employed to extend the interaction space and provide more equal participation possibilities for everybody in a shared activity. For example, Lucero et al. (2011) created a shared display from an array of multiple mobile devices to facilitate photo sharing in small groups. Alternatively, Cowan et al. (2011) suggest using a projector phone to create a shared display. They argue that a single mobile projector phone can facilitate spontaneous sharing within a group. As another example, PicoTales (Robinson et al. 2012) is a storytelling application that uses multiple mobile projector phones to project different users’ parts of a story on physical open space, which enables sharing and co-creation between users. Furthermore, mobile phones and a large display can be used in combination to create an even more flexible interaction space for people to use collaboratively (e.g., Lucero et al. 2012). Overall, while this approach can be seen as very common amongst systems enabling collocated interaction, the same approach is used also in systems that manifest some form of enhancement.

Second, some prototypes aim to enhance interactions in a shared activity by adopting the What-You-See-Is-What-I-See (WYSIWIS) principle. WYSIWIS allows all group members to see the same things on their personal devices (Stefik et al. 1987); it is about synchronizing multiple single-user devices. For example, MobiPhos (Clawson et al. 2008) is a mobile application for a small group of collocated users, sharing a photo automatically after it is taken within the group. This type of synchronization seems to be common in prototypes that are intentionally designed for a tightly coupled style of interaction. Alternatively, some prototypes support loosely coupled styles of interaction by allowing users to work on their own in parallel and only share summaries or overviews of their actions.

To summarize, as Yuill and Rogers (2012) have pointed out, important aspects for interaction and collaboration include common frame of reference, shared attention, awareness of others’ actions, and access to information. The concepts of shared display and WYSIWIS have been mainly utilized to create shared device configurations that provide exactly the aforementioned qualities in a shared collocated activity.

5.2.2 Disclosing information about others

Thirteen prototypes in the corpus focus on providing information about others in the surroundings, in order to make people more aware of each other. The prototypes in our corpus disclose a variety of types of information: identity, user profile, shared content, user activities, earlier encounters, etc. IPAD (Holmquist et al. 1999) is an early example of a mobile awareness system that notifies its users when other users are close by, sharing location information between users. DigiDress (Persson et al. 2005) is an example of a profile-based mobile application. Users can browse the profiles of other users who are within Bluetooth signal range. It was found to create curiosity between most users and also trigger interaction between some. Jabberwockies (Paulos and Goodman 2004) focuses on information about possible social ties by letting the users know which nearby strangers they have encountered before (i.e., familiar strangers, a term coined by Milgram (1972)). This awareness information is intended to trigger interaction between the two encountered users and increase the sense of community. Sotto Voce (Aoki et al. 2002) is an example of a museum audio guidebook that allows users to eavesdrop on the audio content of their partners whenever they want as they move around the museum together. The eavesdropping provides awareness information about another user’s activity, which is designed to encourage users to develop a conversation with each other during the museum visit.

While many other design approaches seem to be used to address different design objectives, this approach seems to address only the objectives of increasing awareness, avoiding cocooning in social silos and revealing common ground. However, it can also invite people to engage in collective activity. For example, Walky (Nazzi and Sokoler 2011) applies microblogging to a mundane walking activity and tells others in a community when someone is going out for a walk, indirectly inviting others to join. Furthermore, C3C display (McCarthy et al. 2008) is a public display installed in a workplace, presenting photos that users have in their online gallery and social media when they are close to the display. Findings from the C3C display study imply that the display not only increases awareness and interaction between colleagues but also helps improve relationships in the long term.

5.2.3 Introducing constraints

This category is about prototypes that utilize constraints in human-computer interaction in order to regulate interactions or guide users in how to act in an activity. The constraints are designed with an intention to foster social interaction alongside a collaborative activity. In fact, designs of digital games have been regulating users’ actions and enforcing collaboration for some time already (Salen and Zimmerman 2004). For example, a game might require collaboration by limiting the functions available for each player. Related to this, Björk and Holopainen (2005) describe the gameplay design pattern of asymmetry, that is, differences between the players in terms of how the game behaves. Asymmetry takes two main forms: asymmetric abilities and asymmetric information.

Asymmetric abilities refer to providing users with different abilities to interact with the game or other artefact. For example, Flashlight Jigsaw (Yuill and Rogers 2012; Cao et al. 2008) is a multiplayer jigsaw game on a large shared wall display. Every player has a controller that is used to reveal pieces. Some jigsaw pieces can only be revealed by a certain controller, and some become visible only when using two controllers together. This asymmetry forces players to interact and collaborate with each other in order to succeed in the game. Another example of the same design approach is Electric Agents (Ballagas et al. 2013). The players collect words and photos through a mobile augmented reality interface: one player collects words and the other pictures, and the words and pictures should match.
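The reveal rule described for asymmetric abilities can be illustrated with a minimal sketch. It is not taken from Flashlight Jigsaw itself: the names Piece, required_controllers and visible_pieces, the controller identifiers, and the idea that each controller illuminates one piece at a time are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Piece:
    # Hypothetical model: a piece lists the controllers that must all
    # point at it before it becomes visible.
    name: str
    required_controllers: frozenset


def visible_pieces(pieces, pointing):
    """Return the names of pieces that are currently revealed.

    `pointing` maps a controller id to the name of the piece it illuminates
    (an assumed simplification of the real interaction).
    """
    revealed = []
    for piece in pieces:
        # Controllers currently aimed at this piece.
        aimed = {c for c, target in pointing.items() if target == piece.name}
        # The piece appears only if every required controller is aimed at it.
        if piece.required_controllers <= aimed:
            revealed.append(piece.name)
    return revealed


pieces = [
    Piece("sky", frozenset({"red"})),           # only the red controller works
    Piece("tree", frozenset({"red", "blue"})),  # needs two controllers together
]
```

Under these assumptions, a single player pointing the red controller at "tree" reveals nothing, whereas two players aiming both controllers at it do, which is the kind of enforced cooperation the pattern relies on.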

Asymmetric information refers to users having different access to information. For example, Who’s Next (Jarusriboonchai et al. 2016a) adopts the concept of asymmetric information in its design. It is a multiplayer quiz game to be played among a newly met group of strangers. Personal information about the players’ backgrounds, hobbies, preferences etc. is used as the content of the quiz, thus relying on the inherent asymmetry between people who do not know each other. Players should guess who has given a specific answer to a question to earn points, and this is envisioned to break the ice in a group of strangers.

Collective interaction is another approach where interaction with technology is intentionally designed to be difficult when alone, hence encouraging users to cooperate (Krogh and Petersen 2010). For example, the previously presented iFloor (Graves Petersen et al. 2005) has a single cursor as the only interface that users can collectively use to navigate through the content on the floor. Movements of a user around the iFloor will draw the cursor toward him/her. The design requires multiple users to work together to move the cursor to a target area. Music Embrace (Huggard et al. 2013) is a digital game where an avatar is controlled with a pillow controller hanging from the ceiling by applying pressure to it. Again, two players are required to cooperate to navigate through the game. Finally, Yee et al. (2012) present a tangible gaming interface that expects co-attentive interactions among players: several players may freely attach their handheld game controllers, thereby creating a flexible collective and a transformable tangible interface. Overall, collective interaction is similar to the concept of asymmetric abilities but there are differences in how they are implemented. Asymmetric abilities put the emphasis on different users having different interaction abilities, while collective interaction focuses on multiple users cooperating with a common controller. In other words, collective interaction normally provides a single user interface while asymmetry provides several different ones.
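A single shared cursor that is drawn toward the people around it can be sketched as follows. The source only states that a user's movements draw the cursor toward them; the centroid rule, the pull factor, and the function name step_cursor are assumptions for illustration and do not describe the actual iFloor implementation.

```python
def step_cursor(cursor, users, pull=0.5):
    """Move a shared cursor one step toward the centroid of user positions.

    cursor: (x, y) position of the single shared cursor.
    users:  list of (x, y) positions of people around the display.
    pull:   assumed fraction of the remaining distance covered per step.
    """
    if not users:
        return cursor  # nobody nearby: the cursor stays where it is
    cx = sum(x for x, _ in users) / len(users)
    cy = sum(y for _, y in users) / len(users)
    return (cursor[0] + pull * (cx - cursor[0]),
            cursor[1] + pull * (cy - cursor[1]))
```

With this rule a lone user can only drag the cursor toward their own position, so reaching a target somewhere between people requires them to coordinate where they stand.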

5.2.4 Matchmaking

Matchmaking builds on the approach of disclosing information about others, particularly forms of digital profiles that exist about the users. While disclosed information about other people lets the receivers assess the information and possibly act on it, matchmaking systems aim to filter and process the information for the users and reveal only mutually relevant information. Consequently, this approach mainly addresses the goal of revealing common ground. For example, a prototype called Serendipity (Eagle and Pentland 2005) matches users in proximity based on their profiles on a central server, and users receive the profile information if there are similarities between the profiles. CommonTies (Chen and Abouzied 2016), as discussed earlier, is a matchmaking system in the form of a wearable device. Although a user profile is required, the prototype utilizes a single glowing LED on the wearable device as an ambiguous signal to notify its users that there is some kind of match between them. Interestingly, the user profiles or commonalities between them are not disclosed; instead, the idea is to encourage the users to explore the mutual profile elements when they meet. Here, matchmaking takes place only after the users have encountered each other. Another example of this is Social Textile (Kan et al. 2015), a wearable mobile device that reveals commonalities between two users after a social greeting through skin contact, such as a handshake.
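The difference between disclosing commonalities and merely signalling that a match exists can be shown with a small sketch. The profile structure (a mapping from field names to lists of entries) and the function names common_ground and has_match are hypothetical; none of the reviewed systems is claimed to work exactly this way.

```python
def common_ground(profile_a, profile_b):
    """Return only the entries that both profiles share, grouped by field."""
    shared = {}
    for field in profile_a.keys() & profile_b.keys():
        overlap = set(profile_a[field]) & set(profile_b[field])
        if overlap:
            shared[field] = sorted(overlap)
    return shared


def has_match(profile_a, profile_b):
    """Ambiguous signal: report that some match exists without disclosing it."""
    return bool(common_ground(profile_a, profile_b))


# Illustrative profiles (invented data).
alice = {"music": ["jazz", "folk"], "sports": ["running"]}
bob = {"music": ["folk", "metal"], "sports": ["chess"], "city": ["Tampere"]}
```

Here common_ground(alice, bob) would correspond to showing the shared interest directly, while has_match(alice, bob) corresponds to an undifferentiated cue, such as a single glowing LED, that leaves the users to discover the commonality in conversation.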

5.2.5 Open space for shared activity

Eight prototypes are about offering an open interactive space where people can freely participate in a shared activity: a space that does not belong to certain people or is meant for certain people, but allows everybody to easily opt in and participate. Opinionizer (Brignull and Rogers 2003) is an example of a public display placed in a social gathering. People can add their opinions on a certain topic, and the opinions are visible to others around the display. The system is designed so that people can observe others using the system and opt in to share their own opinions without feeling pressured. FunSquare (Memarovic et al. 2012b) is a public display installation presenting local fun facts and a quiz game related to the facts. FunSquare creates social triangulation, that is, external stimuli in physical space that motivate strangers to interact with each other (Whyte 2001).

A few of the prototypes in this category adopt a more proactive stance; they do not wait for people to opt in and participate unprompted but offer the interaction possibilities straight to the people. For example, FishPong (Yoon et al. 2004) is a tabletop multiplayer ball-and-paddle style video game with coffee mugs as tangible controllers. The idea of the game is similar to Pong, that is, to keep the game content from falling off the edge. The game starts automatically as a mug touches the table, and the game is controlled with the mugs. The game is designed with an ambient interface that only subtly attracts attention and thus leaves space for social interaction. While it provides a social opportunity directly by making the mug interactive, it does not force the mug users to play the game.

5.2.6 Self-expression

Another category is using technology as a form of identification and self-expression in social situations. For example, Meme Tag (Borovoy et al. 1998) is an early example of a digital nametag for professional conferences. The tag contains the name and affiliation of the user and their favorite quotes. The tag automatically exchanges the content between users if they come close to each other. These systems might not limit the content to a pre-defined form of user profile but also offer users the freedom to use it as a medium to express themselves. For example, BubbleBadge (Falk and Björk 1999) is a small wearable display attached to the user’s clothing like a badge. It displays dynamic information based on what the users want to show, thus providing a digital channel through which to present oneself to surrounding people. In MugShots (Kao and Schmandt 2015), users can select images to be displayed on the mug. It is considered a tool for communicating something about oneself to others encountered particularly in office environments.

5.2.7 Topic suggestions

A few prototypes offer conversation topics to facilitate and sustain ongoing face-to-face encounters and conversations. AgentSalon (Sumi and Mase 2001) is a system designed to facilitate the exchange of knowledge and experience between strangers who visit a museum. A shared display is used as a gathering point for knowledge exchange. It provides each user with animated agents on the display, which act as helpers to assist visitors in their conversation. The agents suggest topics and information to users based on the experiences the users have had during the museum visit, as collected from the users’ digital guides. As a more recent piece, Nguyen et al. (2015) present a system that generates real-time topic suggestions during a conversation between strangers via Google Glass. The algorithm behind the suggestions is based on analysis of mutual interests in the users’ personal information or LinkedIn profiles (i.e., also about matchmaking).

5.2.8 Miscellaneous

In addition to the categories with several representative designs, it is relevant to underline the fairly long tail in the distribution of other approaches. The following approaches are unique in this corpus. MultiDraw (Yuill et al. 2013) is a drawing application for a small group of users to draw pictures together. Unlike in collective use of devices, each MultiDraw user has their own device and users take turns drawing different parts of a picture. After a user finishes their part, the device is passed to the person next to them, and they start drawing another part. While users are not synchronously working on the same thing, the application offers an alternative approach for collaboration in creative work. Piper et al. (2013) present a digital pen to enhance a paper photo album with audio output. Tapping the pen on a certain photo or part of a photo album plays previously recorded audio. The digital pen was found to be engaging, especially for elderly users as a part of storytelling. Finally, in contrast to many other approaches, Park et al. (2017) address the problem of mobile notifications disrupting ongoing social interactions with a context-aware notification management system. Based on a preliminary experiment, the system was found to improve the quality of interactions by delivering notifications at appropriate breakpoints that naturally occur during social encounters.

5.3 Evaluation approaches

As the final perspective in the analysis, we categorized the prototype papers according to the user evaluation approaches reported in the papers. We wanted to understand if and how the presented prototypes and their impact had been assessed, especially in terms of the social aspects of the studied technologies. Firstly, the iterative nature of user-centered design is apparent in some of the papers. A paper might describe a series of user evaluations performed with the solution. For example, Mauriello et al. (2014) studied wearable E-textile displays to support group running first with an internal pilot study, then with a field study consisting of one-hour sessions with ten groups of runners, and finally a field study where four participants ran in a race wearing the prototype. For some prototypes, the consecutive user research activities are described in several papers. For example, Social Displays were first evaluated on a concept level in focus groups (Jarusriboonchai et al. 2015b) and then in a field trial where 13 users used the prototypes for 10 days (Jarusriboonchai et al. 2016b).

To start with the papers that feature less emphasis on evaluation, 20/92 papers present no evaluation. Many of these papers describe early concepts or technical details. Alternatively, they may briefly refer to a user study but the study itself is not extensively reported in the paper. For example, Cowan et al. (2010) outline the design space of social projector phone applications by describing and discussing a set of scenarios, with the aim of raising discussion and exposing opportunities for future research. In the case of CommonTies, the prototype was first described from a technical perspective (Abouzied and Chen 2014), and a follow-up paper covered an evaluation of the social effects of the prototype in a realistic context of use (Chen and Abouzied 2016).

The majority of prototypes (61/92) were evaluated with users, but the social aspects were not the main focus of the evaluation. In many such cases the evaluation did not concern social effects at all, but focused on other aspects, such as usability or task completion. For example, Pass-them-around (Lucero et al. 2011) was evaluated in five group sessions of four friends. The evaluation looked at the relevance of the prototype and the photo-sharing strategies, and the naturalness, ease of learning and use of its interaction techniques. While these are important viewpoints, the prototype’s effectiveness in addressing the goal of social enhancement was not studied. Similarly, FishPong (Yoon et al. 2004) was evaluated in a laboratory setting by inviting students to come and discuss the topic of social interaction in public places at the FishPong table. The paper reports positive reactions to such technology and participants’ assumptions on its potential for ice breaking. However, the positive reactions may be partly explained by social desirability bias and novelty effect. More importantly, the study did not measure the social effects of the prototype or its actual effectiveness in the role of icebreaking. The Bluetooth-based mobile profile matching application, Social Serendipity, by Eagle and Pentland (2005) was tested with users and iterated for almost a year in different settings, such as a conference and a university campus. However, the studies focused on usability issues, general user reactions and privacy concerns.

Finally, only a few (11/92) of the evaluation studies focus on understanding the social effects engendered by the prototypes. All of these papers involved either a longitudinal perspective or an in-the-wild study. Most of these are very recent, which implies that the evaluation methods and measured attributes have expanded over time. For example, Moment Machine (Memarovic et al. 2015) is an application for a public display with an integrated camera that was deployed in-the-wild for 12 weeks. The study analyzed 13 interviews, 3 weeks of observations and engagement logs, as well as the photos that different users took with the application. The focus of the research was on the social effects of large public displays, particularly on how they can stimulate community interaction. MoodSqueezer (Gallacher et al. 2015) is a public installation inside an office building. It was first deployed for 4 weeks, and then left in place for an additional 4 weeks. The analysis covered interaction logs, observations, informal interviews, an exit survey after the first 4 weeks and 25 interviews. McCarthy et al. (2004) evaluated two proactive display applications at an academic conference. The system, intended to increase the feeling of community, was set up in a three-day conference with about 500 event participants, out of which 40% became active users of the system, and others were likely to observe the information presented on the public displays. Their data collection included systematic observation, short informal interviews, and a follow-up survey that was answered by 94 conference attendees.

Despite the breadth of design objectives and approaches, the efforts to evaluate the social effectiveness or the overall user experience of the prototypes have been limited. Some of the studies have employed evaluation methods and research settings that would also allow analyzing the social effects. However, the evaluations focused on other aspects, such as privacy or acceptability, and the attempts at evaluating the social effects are mostly limited to using self-defined interview questions. Our review shows that the prototypes based on public displays or interactive installations have been most fruitful for evaluation purposes, probably due to their public nature and the social opportunities they provide. Overall, even though HCI puts emphasis on evaluation, the limited evaluation effort is understandable: as the very notion of enhancement and its various manifestations have not been properly conceptualized, operationalizing such aspects into evaluation measures and methods is challenging. The unfortunate consequence is that many of the interesting and potentially effective solutions remain design explorations with unexplored social effects. This empirical gap calls for theoretical work that provides actionable evaluation frameworks and measures, as well as empirical follow-up studies to deepen the understanding of what roles technology can optimally assume in collocated social interaction.

6 Discussion

This section first summarizes the key findings and research directions for the future, then discusses and further conceptualizes the aspect of enhancement, and finally reflects on the methodological validity of the literature review.

Enhancing collocated social interaction with technology is an emergent research topic that has gained increasing interest, especially since the early 2010s. Relevant contributions are numerous and demonstrate a broad variety of design concepts, which confirms the timeliness of providing a proper outlook over the research landscape. The review outlines a breadth of interesting designs that embody various target areas, design objectives, and design approaches. In addition to the design objectives, approaches and focus areas, the review shows various theoretical concepts and viewpoints that are critical to understand in order to advance the theoretical foundations and outline new design directions for this topic.

6.1 Directions for future work

Many studies seem to be motivated by the exploration of new technology (e.g., Bluetooth, wearable displays) rather than by a well-defined social issue, theorized problem, or user need. This means that our bottom-up review could not identify detailed social problems but, rather, a broad array of design objectives and design approaches. Consequently, we call for better articulation of the aims of the design. For example, what can be considered the success criterion of the design, and in what regard is a design intended to make a social situation more desirable?

Similarly, the theoretical foundations discussed in Section 2, such as the proximity levels by Hall (1963), have been used only to a limited degree to drive design work or evaluation studies. The various concepts and theories about interpersonal interaction are rarely utilized, even though we argue that many design explorations would benefit from more profound sociological and social-psychological analyses. More effort could be put into understanding the rules, practices, situational settings and other social factors that shape this hybrid space of social interactions and human-technology interactions. Design activities should also consider the existing ecology of technology used in the targeted context. Many of the analyzed designs do not explicate what kind of social interactions and human-technology interactions already take place in the targeted context.

Perhaps as a consequence of the rather limited theoretical basis, another common observation is that the design contributions are typically generic rather than specific. Many prototypes are intended for any user groups or contexts, or the focus areas are not explicated. While this is understandable from the viewpoint of design exploration and the development of technical enablers, we argue that future design endeavors would benefit from more deliberate choices of specific phenomena, social settings, target user groups, or type of interaction—particularly those that aim to actively enhance the quality of social interaction. We believe that future work should go beyond the classroom, corporate or event contexts that appeared in many of the papers.

Many of the reviewed papers state that the aim is to encourage social interaction but the actual objectives seem to be more modest and are often about inviting or supporting interaction. Considering Ludvigsen’s (2005) levels of interaction, technology seems to be most utilized to alter the situation from ‘distributed focus’ to ‘shared attention’. The prototypes that encourage, incentivize or even trigger interactions—i.e., aim at ‘dialogue’ or ‘collective action’ (ibid.)—could be considered to most strongly manifest enhancement, similar to those that engage people in collective activity. However, in this corpus, such prototypes are in the minority. The majority of the prototypes focus on more lightweight roles, such as increasing awareness or enriching the means of interaction (approximately half of the papers). These two categories could be explained by a latent premise in ICT applications: the more information, the better. This could also be seen to relate to the contact hypothesis (Allport 1954) when considering how social encounters can be facilitated by increasing mutual awareness.

As for more specific gaps identified in the corpus, one is that there are few solutions that aim to sustain interaction that has already been initiated. While much of the research focuses on initiating encounters between strangers, future work could focus more on, e.g., maintaining family relationships or long-term friendships that can deteriorate over time. The system for facilitating parent-child communication by Chan et al. (2017) is an interesting recent example of this type of work. Sustaining interaction could also mean alleviating social fears in encountering strangers or providing cues about the discussion dynamics and dominance during a dyadic conversation, as in a recent paper by Muralidhar et al. (2016). Another observation is the lack of research and design solutions addressing the previously introduced issue of phubbing and, in general, technology disrupting ongoing interaction. While we have seen various campaigns that discourage or even prevent the use of personal devices in social situations, surprisingly little focus has been put on user interface designs that avoid disrupting interpersonal interaction and social gatherings, or that at least minimize the social effects of the inevitable disruptions.

We also see untapped potential in designing technologies for crowds of people, a design space outlined by, e.g., Reeves et al. (2010) and Roughton et al. (2011). Different kinds of digital hosts or mascots, for example, could strengthen the sense of community in large events and encourage social encounters within temporary groups of collocated people. Recent interesting avenues that can broaden the design space of social enhancement include talking agents as social facilitators (Porcheron et al. 2017) or digital ‘confederates’ (Krafft et al. 2017). Overall, considering the extent of theories on social interaction, there seems to be an underexplored design space for other types of enhancement beyond what we summarize in Tables 2 and 3.

While the questions of how and why information technology could take a meaningful position in social interaction are extensively reviewed in this paper, the question of which of the approaches best enhance interpersonal interaction remains unanswered. From the perspective of evaluation, a key finding is that few papers have assessed the proposed solutions’ impact on interpersonal interaction, relationships, or other aspects pertaining to social interaction. The lack of depth in evaluation (e.g., in teasing out the behavioral impact of technology) is a common phenomenon in engineering-driven HCI research (Väänänen-Vainio-Mattila et al. 2015). We speculate that it partly results from the need to develop scalable measures, as well as the challenges in organizing realistic study settings. Understanding which design approaches actually work, and in which kinds of social settings, requires research focus in the future. This calls for methodological contributions specific to this topic. Just as NASA-TLX (Hart and Staveland 1988) and SUS (Brooke 1996) serve as well-established instruments for assessing workload and usability, respectively, research on enhancing social interaction would benefit from similarly well-established evaluation instruments. The understanding of enhancement and collocated social interaction should be operationalized into guidelines that help consider its various aspects as well as measures that help assess the quality of the developed solutions. Moreover, considering that the various contextual factors in the evaluation are critical for the success of any social catalyst design (Heinemann and Mitchell 2014), we need a deeper qualitative understanding of different social settings and their unique characteristics.

All in all, the reviewed papers indicate that there is room for new technology that supports, rather than disrupts, people in collocated situations. A techno-critical viewpoint could argue that preventing the use of technology in, for example, public spaces and social events would solve some issues related to ignorance and isolation due to technology use. However, the morality of regulating the use of personal devices is questionable, and the issues possibly solved with regulation are not the only problems pertaining to collocated social interaction. Just as CMC has fundamentally augmented social experiences between remote users, we believe there can be socially acceptable technological solutions that truly enhance collocated interactions. This creates an interesting application area for technologies related to autonomy and proactivity (Tennenhouse 2000), persuasiveness (Fogg 2002), and socially aware computing (Lukowicz et al. 2012). There is space for the development of technical enablers and novel services for proactively inviting and encouraging new encounters, as well as for the redesign of existing systems and interfaces to better cater for the dynamics of collocated social interaction.

6.2 The many roles of enhancement technology

The following further conceptualizes ‘enhancement’ in this research context. By revisiting the identified categories of social design objectives and design approaches, we abstracted them into different forms of enhancement, i.e., roles or positions for technology. The abstraction aims to provide a bigger picture of how enhancement can be embodied in the context of collocated social interaction. It provides a hierarchy and a vocabulary to help consider the roles of technology on different levels of abstraction. We hope this representation serves as a step towards a more fine-grained vocabulary about the enhancement of collocated social interaction and helps researchers and designers to determine specific design objectives and approaches for the different roles.

The resulting conceptualization stems from a meta-analysis of the categories in Sections 5.1 and 5.2, and the identification of common themes across them. We started by unpacking the approximate continuum identified across the design objectives (summarized in Table 2). Next, we analyzed how the categories about design approaches relate to the different design objectives and the three categories identified thus far. After collaboratively iterating the categories, we concluded that technology can take four main roles in collocated social interaction: enabling, facilitating, inviting and encouraging (Table 4). Of these, we consider the three latter ones to manifest the concept of enhancement and to differentiate the work outlined in this review from more conventional CSCW research. As the framework is based on an extensive literature review, it refines and extends the previously mentioned framework by Benford et al. (2000) and concretizes the topic with various prototypes presented in past work.

Table 4 Mapping the social design objectives and design approaches interpreted from the papers to abstract enhancement categories (Roles of Technology).

Enabling interaction refers to the role of a technological artifact in making it possible for social interaction to take place, which represents an extensive body of prior CSCW and HCI systems. The design solutions provide platforms and opportunities for social interaction (either as the primary or a secondary activity). The users involved have the power to decide whether to use the opportunities or not; the designs do not particularly invite or encourage the users to behave in a more social way, nor do they actively facilitate interaction. Consequently, the resulting social interactions would largely depend on the current social setting; for example, the social norms in place, the actors’ relations to each other, and whether or not there is interaction already. Although this review has intentionally disregarded much of such work, this category is nevertheless highly important to consider when designing for collocated social interaction. In some situations, active forms of enhancement might be neither possible nor desirable. For example, the social setting or characteristics of involved participants might be hard to foresee, and thus a solution that takes a stronger role in interpersonal interaction might be unacceptable.

Facilitating interaction refers to making it easier to converse, collaborate or otherwise socially interact, or to supporting desirable feelings, equality or suitable interaction dynamics while doing so. This role aims to relieve tension and minimize other negative experiences, maximize interaction-related aspects and feelings that are considered desirable, and generally help make the best out of a social situation. In this corpus, facilitation primarily refers to supporting ongoing interaction, but some papers also aim to ease the initiation of a new encounter (or icebreaking therein) in situations where people are expected to interact. However, as with enabling, the intention or need to interact has been defined by the involved users or, e.g., a community manager or a teacher of a class, rather than by technology. Considering the latter, Kreitmayer et al. (2013) discuss this role in terms of ‘orchestrating collaborative activities’. Here, the design approaches of providing information about others or topic suggestions could help nurture or enrich an ongoing encounter. Interestingly, none of the identified design approaches address only this category. For example, open space for shared activity can be useful in this role but also in all the other roles; it seems to be a generic approach to employ in any role.

Inviting interaction is about the role of informing people of the available proximal social possibilities, which can motivate them to spontaneously engage in new encounters. It is about signaling one’s interests and availability with the help of a digital medium in situations where the social opportunities (or social affordances) can seem non-existent, vague, or excessive, or when people are not intentionally searching for company. Given that the current use of devices may lead to isolating oneself, this is also about avoiding the risk of isolation in social situations. Technologies playing this role provide external motivators, i.e., reasons, for initiating new interactions and being socially open-minded. However, the users may freely decide whether or not to act based on the provided information and social signals. From a systemic perspective, inviting interaction can be about situations where collocated or nearby people have no particular intention to interact but there is an external interest to foster this (e.g., in educational settings, public spaces, or when used by an organizer of an event). Several design approaches (disclosing information about others, matchmaking, self-expression) particularly address this form of enhancement, which means that the largest portion of prototypes in our corpus manifest this role. Having said that, while primarily providing additional information to a user, the designs should carefully consider in which situations to do that and how much information to provide. An excessive amount of information can not only lead to information overload, but also strengthen social isolation due to the user having to interact with a device.

Encouraging interaction is about incentivizing or persuading people to start interacting or to maintain ongoing interaction. This means not only providing opportunities, but also utilizing computational features that nudge and stimulate people to take action (for example, to grab the given social affordances or to get involved in collaborative activity partly mediated by technology). For example, technology could make a subtle intervention when one does not dare to say something to someone else, or it could encourage two strangers to collaborate on something they seem to have a common interest in. Here, the approach of introducing constraints (e.g., with asymmetry) provides an interesting and contradictory design space to explore new forms and paradigms of computational solutions. Of all the categories of design approaches, this is perhaps the only one that provides a means for advancing beyond the conventional information-centered approaches. Looking at the numbers, this form of enhancement seems to be the least common in our corpus.

In contrast to the framework by Benford et al. (2000), our conceptualization introduces the new roles of facilitating and inviting. In particular, inviting interaction is prevalent in the corpus, considering the breadth of both the design objectives and approaches. At the same time, the corpus includes very few examples of enforcing interaction, the third of Benford’s categories (for example, creating a situation that coerces people into interaction). Therefore, enforcing was not included as one of our enhancement categories. The scarcity of such designs suggests a concern that enforcement may produce negative consequences for social interaction. This gap between what has been envisioned and what has actually been designed calls for new design endeavors as well as the development of more relevant frameworks. Furthermore, as our roles emerged bottom-up from the corpus, we acknowledge that there might also be other relevant categories of enhancement that future technology development could consider. Other possible roles beyond encouragement, as well as other manifestations of each role, remain as future research questions and avenues for design exploration.

Overall, the goal of social enhancement is challenging to conceptualize because it expects technology to take stronger agency in social situations. Although technology is already taking increasingly strong positions in dictating what people do and, for example, what digital content they consume, influencing collocated social interaction is still uncharted territory. Social interaction between humans is generally considered a spectrum of delicate activities and behavior that is defined by the situation, the involved individuals and their interests, and numerous other factors. In this context, giving agency to blatantly non-intelligent and insensitive technology can, very understandably, seem unacceptable or undesirable to many. While this review has charted recent ventures towards more socially active technologies, the optimal ways for technology to participate or intervene in interpersonal interaction remain open questions.

6.3 Methodological reflections

Regarding the validity of this work, literature reviews face the inherent challenge of coverage and sampling. An example of possible bias in this work is that, due to the challenges in identifying relevant publications from the plethora of HCI and CSCW outlets, the corpus includes more papers from the publication outlets that the authors know the best. One notable limitation is that we did not systematically review the proceedings of the ACM CHI conference. This intentional decision is explained by the sheer number of papers over several decades and the consequent challenge to identify truly relevant papers from the vast spectrum of research topics at CHI. However, looking back at Table 1 and the importance of CHI in the corpus, we admit that the review would have benefitted from a more systematic analysis of CHI papers.

As for conceptual limitations, especially in the early phases of the review we found it challenging to define the boundaries with respect to which solutions merely enable and which, for example, facilitate interaction. The diversity of terms used for ‘enhancement’ and ‘collocated social interaction’ made it challenging to identify publications related to the more active and daring forms of enhancement. This is natural due to the lack of established vocabulary. Hence, our review methodology has most likely failed to identify all relevant publications, and, in contrast, has included some others that do not perfectly represent the intended focus. Nevertheless, the review covers a significant body of interesting work, most of it very recent, that clearly shares an agenda of positively influencing interpersonal interaction. We argue that our corpus of 92 publications is an extensive sample of the kinds of research we aimed to review for the selected research questions, supporting our review methodology over a keyword-search based approach. The corpus allowed us to create a holistic picture of the research landscape related to this topic and clarify the central concepts. Furthermore, the approximately one hundred other cited publications provide a sufficiently solid theoretical basis for conceptualizing this emergent topic and reflecting on other prior work.

7 Conclusions

This review presents a fresh perspective on collocated interaction, outlining the design landscape of enhancing collocated interaction with information technology. As the first systematic and extensive literature review on this topic, our work has primarily surveyed the prior constructive work and designs, and, secondarily, theories that we consider most relevant to the topic from the perspectives of HCI and CSCW. We identified an upward trend in design research that looks beyond enabling and aims to actively enhance the quality of collocated social interaction. We outline various relevant design objectives and approaches, as well as general focus areas in prior research. However, the review does not allow giving a definite answer to the question of how to design technology for enhancing collocated social interaction, at least not in terms of what approaches and forms of design best fulfill this goal.

This review also helps identify research gaps to address in the future. We particularly call for:

  1. design and research endeavors related to encouragement and other active forms of enhancement,
  2. new methods and metrics that better serve the evaluation of the social effects of the developed solutions, and
  3. stronger utilization of theory from social sciences and communication sciences and focus on the various relevant design objectives, approaches and concepts that already exist and that this work has outlined.

Furthermore, this work contributes a conceptualization of the topic and particularly focuses on the concept of enhancement. Enhancement of collocated social interaction was found to take many forms: for example, providing information that might spark off social interaction (inviting), serving as icebreakers or tickets to talk (facilitating), or encouraging and motivating people to interact or engage in joint activities (encouraging). The categorization can also help construct appropriate evaluation measures to assess the quality and effectiveness of envisioned solutions for enhancing collocated social interaction. We conclude that enhancement technologies aim to purposefully produce behavioral effects that improve the perceived quality, value or extent of collocated social interaction. Enhancement refers to the socially active roles of technology that nevertheless allow the involved people to retain sovereignty over their behavior. To concretize this abstract definition, the paper points to various well-defined design and research endeavors that reflect on the more specific categories of objectives or approaches.

All in all, this work helps with analyzing earlier research, describing new research contributions, and positioning them in the broad research landscape. The presented categorizations help define meaningful goals and approaches for prototype development and translate early design concepts into potential futures. They are meant as sources of inspiration, from which designers and researchers may adapt different aspects according to their specific design case and professional judgment. We hope that this work inspires more contributions that further explore the technological transition from computer-supported towards computer-enhanced collocated interaction.