1 Introduction

On-line social and digital media have blurred the distinction between on-line and off-line relationships between people. Connections that initially occur face-to-face may be managed on-line (e.g. via Facebook), whilst connections initially made on-line may transition to face-to-face (e.g. connecting with someone on Tinder (www.gettinder.com)). It is also common for relationships to be managed dynamically between both on-line social media services and face-to-face. Within CSCW and wider HCI research, this interwoven nature of on-line and face-to-face interaction has led researchers to consider how both can be blended together, in particular by provoking and supporting face-to-face interactions between strangers through increasing the visibility of an individual’s on-line social and digital media in the physical environment. Whilst historical work, such as that of McCarthy et al. (2004), presented information about a nearby conference attendee on a large screen display as a way to provide ‘tickets’ for interaction (Sacks, 1992, p. 265), more recent work has focused on wearable devices, such as head-mounted displays (HMDs) (Nguyen et al., 2015), digital badges (Jarusriboonchai et al., 2015), or bracelets (Chen and Abouzied, 2016).

There is significant potential in this work and implications for CSCW. Individuals have been found to be open to interactions with others in a wide range of contexts and physical places (Mayer et al., 2015, 2016). Everyday social encounters, where individuals meet without an a-priori purpose, are among the most common situations in which we encounter people, and have high potential for face-to-face digital augmentation to be useful (Svensson and Sokoler, 2008). These can range from interaction at a party or networking event, to short conversations on the street or in shops. Although only a minority of these are likely to lead to a deeper relationship (Rubin, 1974), even short social interactions can improve health and wellbeing (Holt-Lunstad et al., 2010). More widely, CSCW has identified a range of situations where it is important that strangers form a relationship with each other quickly, irrespective of their similarity, such as in ad-hoc teams. Team performance has been shown to improve if members are more familiar with each other before a team is formed (Littlepage et al., 1997). For example, Wong and Neustaedter (2017) have identified the importance of such familiarity amongst flight attendants, a situation where teams must form quickly, with limited time to ‘get to know’ each other. Digital augmentation may be an effective way to support, or speed up, this process.

However, existing face-to-face augmentation work has significant limitations that restrict its potential. Firstly, such work treats users’ social and digital media as a raw resource that can be algorithmically mined to identify and present shared interests between individuals (Nguyen et al., 2015; Abouzied and Chen, 2014; Chen and Abouzied, 2016). Whilst the results of this can be presented in low fidelity (such as a flashing bracelet (Chen and Abouzied, 2016)) to limit disclosure, individuals have no control, beyond providing access to a social or digital media account, over what media is selected and presented to others about them. This increases concerns over unwanted disclosure of information, leading to loss of face (Goffman, 1969; Farnham and Churchill, 2011). Algorithmic matching also assumes the purpose of the face-to-face interaction is known a-priori, so the algorithm can be tailored to match on the most relevant, similar, information between individuals. Yet, particularly in the everyday social interactions in which strangers most commonly interact (Svensson and Sokoler, 2008), the purpose of interaction may not be known a-priori. Moreover, by presenting only similarity between two individuals the diversity of face-to-face encounters may be limited, as individuals see only a ‘filter bubble’ of those they are similar to (Pariser, 2011). Finally, existing work (Terveen and McDonald, 2005; Abouzied and Chen, 2014; Chen and Abouzied, 2016) has studied only one-on-one face-to-face interaction. Not all face-to-face interactions involve only two people; rather, groups form, separate and re-form dynamically (Bussmann and Schweighofer, 2014). Matching two individuals simply provides a ‘ticket’ to interaction (Chen and Abouzied, 2016), but ignores the prior step of how an individual might browse a room and identify the most interesting people to interact with face-to-face, such as when a person starts in a new workplace and must ‘get to know’ his or her colleagues. Such social interaction and establishment of common ground has been shown to be important for effective teams within an organisation (Lykourentzou et al., 2017).

In this paper we focus on expanding existing work beyond one-to-one interaction by considering how face-to-face augmentation can support multi-party groups. We do this by firstly considering how individuals would wish to present themselves to others through curated digital facets that can be used to augment face-to-face interaction (which we term Digital Selfs) (see Figure 1). By providing full control to individuals over what media is used to present them to others, we can better understand how individuals would wish to be presented, and overcome issues of algorithmic matching outlined above.

Figure 1. An illustration of using Digital Selfs in a conversation on the left, and a photograph taken through a head-mounted display (HMD) visualising how the Digital Self appeared to users.

2 Related research

In elaborating on our argument we consider two perspectives. Firstly, we consider existing work to provoke or support real world interaction through the presentation of social and digital media. We then further elaborate on the three stages of face-to-face interaction (browsing individuals, ice-breaking and supporting conversation).

2.1 Digital media in face-to-face interactions

There has been diverse work considering how to incorporate digital and social media into face-to-face interaction. Existing work is often piecemeal, focuses on only one stage of face-to-face interaction (browsing individuals, ice-breaking or supporting conversation), and its impact is often not evaluated. In discussing this work we consider two key axes: automatic vs. manual selection of media, and visualising nearby individuals vs. face-to-face augmentation.

2.1.1 Automatic vs. manual media selection

Existing work has largely focused on automatic selection of media from users’ social and digital media accounts. Users provide access to their accounts and media is automatically selected by the system. Whilst this can be directly presented (McCarthy et al., 2004), it is more common that media from two individuals is processed to identify and present their similarity. Jarusriboonchai et al. (2015) did this in a very simple way by presenting interests that two participants had ‘liked’ in their public Facebook account profiles. More sophisticated matching algorithms have been used by Nguyen et al. (2015), who presented potential conversational discussion topics between two individuals wearing HMDs. Suggestions were the output of a matching algorithm run largely against the two participants’ LinkedIn social media accounts. In all such work, individuals have no input, other than providing initial access to their account (or not), to control what is used or revealed about them. However, as shown by Goffman (1969), individuals are multi-faceted, performing a facet of self during face-to-face interaction. The props (which include digital face-to-face augmentation) individuals incorporate into their performance can either support or undermine this performance. Facets can be incompatible with each other, causing a detrimental impression if they ‘leak’. Individuals can go to great lengths to keep their facets separated (Goffman, 1969). Facet management has also been identified in how individuals present and manage their on-line identities in social media services. Farnham and Churchill (2011) identified that individuals use multiple social media services to present different facets of self to different groups.

Whilst automatic selection has potential advantages, it does not take these issues into account, and has not considered how individuals would wish to use digital and social media to augment themselves in face-to-face interaction. Existing studies make these decisions before participants are involved, giving them no input into the process. This increases the risk that unwanted information is disclosed, which may damage interaction. Just because two individuals match on a particular interest or topic does not mean they would want it disclosed.

Existing matching work has, implicitly at least, considered privacy. The information on which users are matched can be obfuscated. Chen and Abouzied (2016) developed a matching system where individuals wore a bracelet. If the matching score between two individuals crossed a threshold, their bracelets flashed the same pattern and colour. No information on what participants were matched on was provided. More often, however, privacy is considered through either presenting only very basic information (Jarusriboonchai et al., 2015), or focusing on specific, largely professional networking, scenarios (Nguyen et al., 2015), where individuals would be expected to present a more general, professional facet of self.

Additionally, algorithmic matching focuses on finding the similarity between two individuals, ‘weighting’ data to prioritise and match on the most relevant information, given the current situation. This assumes there is a clearly defined purpose or goal to the interaction that can be predetermined. However, as Mayer et al. (2015) have identified, the situation can significantly alter who individuals would wish to connect with. Individuals may wish to connect with those they are similar to, or with those that they are otherwise dissimilar to.

An alternative approach is for users to curate media and decide what should be presented. In this way individuals choose what media represents them, retain control over how their identity is disclosed (in line with existing understanding of face-to-face interaction (Goffman, 1969) and identity management in social media (Farnham and Churchill, 2011)), and can decide what information is most important to connect with others on. Persson et al. (2005), for example, developed a system allowing users to curate ‘profiles’ on feature phones that could be broadcast over Bluetooth to others in the nearby geographical area. Whilst users were keen to browse others in the area, the low number of users meant others were often not found.

Whilst this approach removes many of the issues with automatic selection and matching outlined above, there is significantly less study of how such manual curation and presentation of media would impact single or multi-party face-to-face interaction, or how individuals would choose to represent themselves. Recent work tends to focus on novel technical prototypes with only informal evaluation (Kan et al., 2015; Kao and Schmandt, 2015; Devendorf et al., 2016). Only our prior work (McGookin and Kytö, 2016; Kytö and McGookin, 2017), which motivated this work, has investigated user views on how they would wish to use existing social and digital media in augmented face-to-face interaction. It revealed that the same issues of identity, faceting and boundary regulation (Lampinen, 2014) found in face-to-face interaction and existing social media use, and unconsidered in automatic matching work, also arise when such media is used to augment face-to-face interaction, and it examined how individuals use such representations in strictly one-to-one interaction. However, this work has not yet been extended to investigate the impact on multi-party interaction. We do so in this paper.

2.1.2 Nearby vs. face-to-face

Existing work focuses on either increasing awareness of suitable others in the nearby environment or strictly on conversation in face-to-face interaction. Commercial work, for example the Tinder (www.gettinder.com) and Grindr (www.grindr.com) dating services, considers ‘nearby’ to be at the city or municipality scale, presenting relevant individuals who may be several kilometres away. Chen and Abouzied (2016) developed an LED bracelet to support networking amongst strangers at a conference. The bracelets of two users, who were algorithmically matched, would flash the same pattern and colour when the users came within 20 m of each other. Chen and Abouzied (2016) noted that whilst the face-to-face interaction the bracelets provoked was often valuable, it could be difficult for individuals to find their matched partner, with only 15% of identified matches resulting in face-to-face interaction. Work such as Nguyen et al. (2015), on the other hand, focuses specifically on how face-to-face interaction itself may be augmented. They presented algorithmically determined conversational topics between two strangers via a pair of Google Glass HMDs. Topics were found to be useful in starting and sustaining conversation.

However, there is not a strict delineation between what is and is not face-to-face. For example, Jarusriboonchai et al. (2015) presented matched interests between two strangers’ Facebook profiles on badges (mobile phones worn around the neck). Whilst these functioned as ‘tickets’ in face-to-face interaction, they could also potentially act to support awareness of others. However, Jarusriboonchai et al. (2015) studied only situations with two individuals, so did not investigate this. The distinction is therefore how nearby individuals are, rather than whether they are strictly face-to-face when media is presented. In this work we consider face-to-face to be within sight of an individual: that is, within the same room or immediate physical proximity. As such, our definition of face-to-face interaction is close to Goffman’s definition of a ‘social situation’: two or more people being in one another’s immediate presence and sharing the same spatial environment, so that they have the possibility to mutually monitor each other (Goffman, 1963, p.18). By this definition both Jarusriboonchai et al. (2015) and Chen and Abouzied (2016) (and of course Nguyen et al. (2015) as well) would be considered face-to-face interaction, whereas Tinder would not, as scanning nearby people can be done without sharing the same spatial environment.

2.2 Stages of face-to-face interaction

In considering face-to-face interaction in multi-party conversation, there are three key stages: browsing individuals, ice-breaking, and supporting conversation. Existing social psychology research has provided understanding of how each of these phases works without digital augmentation. Existing work within the HCI and CSCW communities, such as that discussed in Section 2.1, has supported some of these stages, with a greater or lesser understanding of the social underpinnings. None supports, or has been evaluated in, multi-party interaction and conversation; only Chen and Abouzied (2016) and McCarthy et al. (2004) were investigated in multi-party settings (though they still focus on one-on-one interaction). We outline these three stages in more detail and highlight existing work that supports them.

2.2.1 Browsing individuals

When considering our context of multi-party interaction with strangers, individuals first need to identify others who they would like to interact with and whether those people are open to interaction. Relevance can be deduced from physical ‘props’ (such as what individuals wear or carry), or from how people interact (e.g., language), but the possibilities to spot relevance based only on these properties are limited. In certain situations the relevance of a person is obvious (e.g. a waiter in a restaurant), but in more social situations relevance is much more difficult to detect. To have a ‘safe’ choice, people tend to choose others that belong to the same group (e.g., same religion or country), especially if belonging to that group is rare among all other people nearby (Goffman, 1963).

Whilst there is a large body of work on using a computational approach (i.e., automatic social matching, see Section 2.1.1) for finding a relevant individual in face-to-face interaction, little consideration has been given to user-curated approaches. Systems using user-generated content to allow individuals to express some element of Self have been largely limited to technical prototypes or have been only informally evaluated (see Section 2.1.1).

2.2.2 Ice-breaking

Finding a relevant individual and desiring to talk with her or him does not guarantee that interaction occurs. Starting interaction between strangers is an example of a situation where persons have restricted rights, and mutual willingness is required. As Goffman (1963, p.124) argues: ‘...acquainted persons in a social situation require a reason not to enter into a face engagement with each other, while unacquainted persons require a reason to do so.’. Thus, a ‘ticket’ (an accepted reason known to the parties) is needed to ‘break the ice’ (Sacks, 1992, p.265). A ticket provides common ground between individuals and allows an interaction to occur. For example, a ticket might be something an individual carries (such as a book) or something in the environment (such as being on a broken down bus) (Sacks, 1992, p.265). Location and context have significant influence over obtaining a licence to start interaction. For example, being present at a party with friends is an example of an ‘open-region’ environment (Goffman, 1963). In such environments all individuals are allowed to interact with others.

However, in public situations, such as at bus stops or when queueing, individuals have much more limited rights to start interaction and require a ticket (Sacks, 1992, p.265). If individuals wish to interact with others, and an obvious ticket is unavailable, tickets can be synthesised using subtle cues to express willingness to interact, for example by asking for the time or for directions. Non-verbal cues can also be used, the classic example being dropping something on purpose for others to pick up (Goffman, 1963). As such, the availability of tickets and ways to express willingness to start interactions with strangers are generally limited in public settings (Goffman, 1963), and thus different approaches have been taken within HCI to support ice-breaking.

Within HCI, work on supporting ice-breaking has been studied mainly in professional contexts (e.g., such as at conferences (McCarthy et al., 2004)). Borovoy et al. (1998) used an LCD badge displaying simple ‘Memes’ that could be exchanged to incorporate digital information into ice-breaking, presenting overall trends of interaction on large-screen displays. Most other early work also used large screen displays, such as McCarthy et al. (2004), who showed information about an individual as he or she stood in line at a coffee station at a conference. The public display of such information was found to be an issue, as some users would have liked to regulate who could see information about them (McCarthy et al., 2004).

Jarusriboonchai et al. (2014) studied how proactive interaction (i.e., devices which identify and interact with each other before the user does) could trigger conversation between strangers. They carried out a wizard-of-oz style study where two strangers were brought into a room in which two mobile phones started to make synchronous sounds or ask audible questions about the strangers. They found that the mobile phones triggered conversations, but interaction was addressed towards the mobile phones rather than the other person. Beyond standard mobile and wearable devices, there has been significant work on novel devices, for example augmenting everyday objects (such as mugs (Kao and Schmandt, 2015) or handbags (Pakanen et al., 2016)), as well as e-textiles (Devendorf et al., 2016; Kan et al., 2015). Such work, however, is often conceptual or focused on technical prototypes, and does not investigate use in face-to-face interactions.

2.2.3 Supporting conversation

After ice-breaking, conversations often start with ‘setting talk’, consisting of safe, neutral topics (e.g. discussing the weather) (Maynard and Zimmerman, 1984). However, these are quickly exhausted (Svennevig, 2000, p. 222), limiting conversation unless participants can move to deeper topics (Svennevig, 2000, p.91) which are of interest to both parties.

To allow a deeper relationship to form, participants must disclose information to reduce uncertainty (Clatterbuck, 1979). Disclosure is a gradual process (Altman and Taylor, 1973). What participants are willing to disclose to others is not fixed, but is a dynamic process of boundary regulation that takes place during conversation (Lampinen, 2014). This is influenced by a number of factors, including prior relationship and context, but also more transient factors such as personal mood at the time. Successful ‘navigation’ through this process helps to deepen the interpersonal relationship, increasing social attractiveness (Douglas, 1990), and allowing more meaningful interpersonal relationships to form (Altman and Taylor, 1973).

Structurally, conversation is a turn-taking activity, where the next turn is allocated by the current speaker, or by self-selection (Sacks et al., 1974). Whilst in one-on-one conversation there is only a speaker and addressee (who will take turns at these roles), in multi-party situations there may be one or more unaddressed recipients (Gibson, 2003; Traum, 2004). This role arises when the speaker addresses the speech to a particular person directly (e.g., through saying the addressee’s name (Sacks et al., 1974), or directing gaze towards an addressee (Jovanovic et al., 2006)). The speaker and addressee are aware of the presence of unaddressed recipients, but unaddressed recipients are not recognised as part of the ongoing dialogue. Hence, in contrast to eavesdroppers, unaddressed recipients are involved in conversation, but they usually need to wait their turn until the speaker and addressee are satisfied with the current dialogue (Branigan, 2006). Thus, in multi-party situations, participants need to pay more attention to their current role in conversation, which typically shifts many times as the dialogue proceeds (Gibson, 2003).

A final important difference between one-on-one and multi-party interaction occurs when progressing conversation to new topics (topic progression) and maintaining and developing the current topic (topic maintenance). Multi-party interaction provides more opportunities to do this, and as such there is often less coherence and predictability (Korolija and Linell, 1996). In large part this is due to participants being able to join and leave existing conversations dynamically (Traum, 2004). Joining a conversation creates a situation where communicators are participating with different conversational backgrounds (i.e., newcomers are not aware of what has been said before), leading to a breakdown of common ground. Newcomers must then either explicitly initiate a repair (to help establish common ground) (Clark and Brennan, 1991), or wait until their understanding of the conversation develops to become active participants (Branigan, 2006).

Whilst there is good understanding of how individuals engage in conversation, there is little understanding of how digital face-to-face augmentation can be used to support it. Nguyen et al. (2015) considered this topic by studying how strangers can be supported in one-on-one conversations using algorithmically matched topics from individuals’ LinkedIn social media accounts. Three topic suggestions were delivered through an HMD and updated every 2.5 min in a 15 min session. They found that topic suggestions were perceived as somewhat useful in general and that they were more useful for introverts than extroverts. They also argued that the timing of topic suggestions should have been matched with points where the current conversational topic was exhausted.

Whilst the topic suggestions were somewhat useful, Nguyen et al. (2015) did not study beyond one-on-one interaction. Therefore, the issues around the use of augmentation in multi-party interaction discussed above were not considered. Any algorithmically matched system would need to be re-run as individuals joined or left a conversation, by definition changing the displayed information. Having a user-curated representation of self that persists across these events may help individuals join conversations that are on-going. Moreover, as topic progression and topic maintenance are more difficult to predict in multi-party situations (Korolija and Linell, 1996), detecting the moments when the current topic is exhausted and changing it (as suggested by Nguyen et al. (2015)) is much more challenging.

2.3 Research questions

Whilst existing work shows value in using digital content to augment face-to-face interaction, it also has significant limitations. The primary focus on algorithmic or automatic selection of media removes the ability of individuals to control how they are presented to others, and assumes an a-priori purpose that an algorithm can select data to support. This limits the use of such approaches to clearly definable purposes of face-to-face interaction, and makes them unsuited to the more common interactions that occur without an a-priori purpose (such as talking to strangers at a party). Allowing individuals to select their own media, and thus present what they want to connect on in multi-party situations, has been discussed within the CSCW community but has not yet been evaluated. We do not know how users would choose to digitally present themselves to others nearby, or the impact of that on face-to-face interaction in multi-party settings. Although such systems are argued to support connections in groups of individuals (e.g. at a networking event), none has been studied in multi-party settings beyond two individuals, or in how all three stages of conversation in such situations (browsing for individuals, ice-breaking and supporting conversation) are supported. To address these issues we carried out a two-part study to answer the following research questions:

  • RQ1: How do individuals choose to represent themselves to strangers with Digital Selfs?

  • RQ2: How are Digital Selfs used at each stage of interpersonal interaction between strangers in multi-party settings?

3 Study outline

Twenty-three participants (nine female, aged 18–42 years, M = 27.7 years, SD = 5.6) took part. Participants were recruited from both existing mailing lists and flyers placed on campus. Twenty were students, and the majority were Finnish (17/23). In part 1, participants created a Digital Self to represent a facet of themselves they would want to present to a stranger (a person they had not previously met) during face-to-face interaction. In part 2, participants took part in an augmented multi-party face-to-face event. Participants were assigned to one of six separate events; allocation was based on common availability of all participants in the event. In each, participants were equipped with a head-mounted display (HMD), and could access the Digital Selfs of all other people in the event as and when they wished. Participants were compensated with two movie tickets (approx. value 20 euros) on completion of both parts of the study.

4 Part 1: creating digital selfs

4.1 Experimental setup

In the first part of the study we asked participants to create a Digital Self that they would be happy to use to augment their visual appearance in face-to-face interaction with strangers. Participants were told that the Digital Self would be used in Part 2 during face-to-face conversations in a small group of people, and it would be up to participants to “get to know each other”. We left this task deliberately vague, as meeting a stranger without a clear a-priori purpose is the most common, and potentially most useful, scenario where a Digital Self might be beneficial (Mayer et al., 2015; Svensson and Sokoler, 2008).

The Digital Self comprised a single Microsoft PowerPoint slide (that would later be converted into an image). Although this limits the Digital Self to a static image, PowerPoint does provide an easy and flexible tool to support free-form creation of the Digital Self in content, layout and form (text, images etc.). We asked only that participants use English for any text they included, and that they keep the black background to better support display on the HMD. Otherwise participants were free to design the Digital Self in any way they chose, using as much or as little media as they wished. We supplied a template PowerPoint presentation with a blank black background slide.

To help illustrate how the Digital Self would be seen by others, we reused a concept video that illustrated a basic face-to-face interaction (see Figure 2) from our earlier conceptual study (McGookin and Kytö, 2016). This illustrated a user, ‘Joe’, putting on a set of HMD ‘glasses’, walking down a hall and meeting ‘Mary’. When the HMD recognised Mary, it presented her Digital Self (that she had created) to Joe as both had a face-to-face interaction. The Digital Self was represented as a red box at the edge of the display. This was chosen as it avoids obscuring eye contact, and is similar to prior work in face-to-face augmented reality (Nguyen et al., 2015). Participants were also aware we would use HMDs to present the Digital Self. Existing work has already shown these to be useful in such augmentation (Nguyen et al., 2015). Unlike ‘heads-down’ presentation, where information can be missed during face-to-face interaction (e.g. Chen and Abouzied (2016) found that participants failed to notice half of the matches their flashing bracelets notified them of), an HMD can present information in line of sight without interfering with face-to-face interaction (Ofek et al., 2013).

Figure 2. Key frames of the video used to illustrate a Digital Self to participants. The video illustrates a scenario where a person meets another face-to-face, and can view her Digital Self via an HMD. Participants were asked to consider what would be displayed about them in the red box when creating their Digital Selfs. The video was taken from (McGookin and Kytö, 2016). Note, Frames 1 and 3 have been slightly cropped to aid readability in the paper.

Part 1 was conducted remotely. This was done so that participants had time to consider what they wanted to include, and could access digital and social media accounts on their own devices, without needing to remember passwords and log in on a device supplied by the experimenters. Participants were given several days to complete their Digital Self. Once participants had completed their Digital Self, they e-mailed the PowerPoint slide to us, completed on-line questionnaires about the created Digital Self and demographics, and were interviewed over Skype about the Digital Self they created.

Interviews were transcribed and thematically grouped using a framework approach (Ritchie and Spencer, 1994), with the choice of content, visual form and sources of content used as initial codes. The visualisations were also analysed according to the type and amount of content, and the visual representation used (images or text).

4.2 Results

4.2.1 Images vs. text

Participants used diverse content in their Digital Selfs, and were more likely to include images than text. Nine Digital Selfs included both images and text, eight consisted only of images and six included only text. On average, Digital Selfs contained 2.4 images and 3.2 instances of text, with participants favouring fewer but clearer items of content, rather than trying to include a lot of media in the small display space available. Most commonly, participants presented more general interests about themselves through objects (for example P13’s image of a football and P22’s image of pieces of chocolate in Figure 3), or facts about them (such as name, occupation or education).

Figure 3. Representative examples of Digital Selfs, illustrating the types of media and composition used in the Digital Selfs. Note that personally identifiable information has been blurred in the figure, but was not blurred during the study.

4.2.2 Image source

Unlike the automatic selection from existing media accounts used by algorithmic approaches (Chen and Abouzied, 2016; Nguyen et al., 2015), the majority of images used by participants came from outside existing social and digital media accounts, being sourced from the Internet (e.g. through Google image search, see Figure 4). Participants discussed how their desire to present themselves through images meant that suitable content could not be sourced from their existing social media services (P13: ‘I didn’t really have pictures that represented what I wanted to show. I wanted to show more of things and places and for example the logos that I have – I don’t really have them anywhere.’, and P22: ‘I wanted to get some images which clearly conveyed the thing I need to convey. And if I use picture from Facebook...it might create a confusion in the mind of other...picture should highlight the thing that I want to communicate.’). Five participants reported that they did not have many images on social networking sites, and three participants did not consider social media when creating the Digital Self. For one participant the size of the Digital Self affected the decision not to use social media (P15: ‘I was first thinking Facebook cover photo, but then I was trying to put it in the small square and I thought that the picture wouldn’t look good in such a small scale so that’s why I didn’t choose that.’).

Figure 4. A chart of the sources from which the images used in Digital Selfs were derived.

4.2.3 Incorporating ambiguity

Participants were explicit in desiring to incorporate ambiguity into the media they chose and its representation. For example, providing facts in text (such as name, age, and interests - see P5 in Figure 3) makes it obvious how a Digital Self should be interpreted. Showing images, particularly more artistic images, requires interpretation to understand, and dialogue to uncover what they mean or say about individuals (e.g. P9’s Digital Self in Figure 3). A number of factors drove how much ambiguity individuals chose to incorporate into the media they chose. For example, participants did not wish to be misinterpreted, or for others to interpret their Digital Self poorly (P21: ‘I guess people first look at the photo and then get impression immediately after looking at that photo, and I thought adding photo would create bias. So, they were better off just what I wrote.’). More common, however, was the view that because the Digital Self would be used in conversation, interpretation of the media could be supported through that conversation. This led to the use of images (P9: ‘It’s easier to give a not so simple impression of yourself in such an abstract picture. So ‘cause I know I can rely to the conversation so I don‘t have to have all the basic information about myself in the picture.’). In this way participants could provide a more ambiguous representation that may not make sense when viewed in isolation, but in conjunction with conversation could provide rich discussion. Such images could also support boundary regulation (Lampinen, 2014). Participants may be open to discussing a topic, but the ambiguity of the representation in the Digital Self allowed the depth to which an individual would wish to talk about it to be dynamically determined through face-to-face interaction. This ambiguity was also a reason why media was largely sourced from outside existing social and digital media accounts (P1: ‘And it‘s kind of like with the social media, when you meet someone if you’ve already seen their Facebook profile, there’s really not that much small talk questions that you can ask about them because you know the basic stuff... rather it might be more interesting if there were some visual cues or some elements that I might be interested in asking about those people.’).

4.3 Part 1 conclusion

In creating a Digital Self, participants tended to avoid concrete textual facts, such as might be derived from a social media account, favouring images that represent a more ambiguous interpretation of themselves. Such representations were seen both as a way to support dynamic disclosure of information (by disclosing more or less about the media through conversation), and as a way of opening up rich topics that support conversation, rather than presenting concrete textual facts about someone. This is in contrast to algorithmic matching work, which presented detailed and concrete textual topics to discuss (Nguyen et al., 2015).

5 Part 2: events with strangers

5.1 Experimental procedure

Within two weeks of creating their Digital Self (see Section 4), participants were invited to one of six events held on the university campus. As we wanted to ensure participants were strangers to each other, and we were capturing how Digital Selfs might be used in initial, individual interactions, we ran the study in a controlled manner. Such an approach is a standard and valid technique that has been used in prior work on initial interactions (e.g. (Maynard and Zimmerman, 1984; Douglas, 1990; Tidwell and Walther, 2002; Nguyen et al., 2015)), and we based our procedure on this. In our study strangers were able to mingle freely with others and choose whose digital content they wanted to view (or not).

To ensure participants did not meet beforehand, they were directed to different entrances of the building and met by an experimenter, before being taken to individual rooms. Participants were first briefed on the purpose of the study, and were asked to complete a consent form. Participants were then shown a sheet containing a picture of each of the other people taking part in the same event, and were asked if they had previously had a conversation with any of them. If they had, that participant would have been excluded from the study; however, no participants reported that they had. This ensured that participants were strangers to each other.

Participants were then provided with an HMD headset (an EPSON BT-200, shown in Figure 5 (left)). This presented a set of facial images (see Figure 6 (A)). During the event these were the faces of the other participants, but a ‘dummy’ set of faces was used to familiarise participants with the HMD. By using the handheld touchpad of the HMD, participants could click on a face and view that person’s Digital Self (as created in Part 1) (see Figure 6 (B)). Again, ‘dummy’ Digital Selfs were used to familiarise individuals with the device. Users could view only one Digital Self at a time, to avoid cluttering their visual view.

Figure 5. Left: The EPSON BT-200 HMD shown with its touchpad controller that participants wore to access the Digital Selfs. Right: The MeCam Classic wearable camera that participants wore to record interactions.

Figure 6. Screenshots of the Digital Self HMD application. a: Participants selected a Digital Self by first selecting the facial image of that person. b: The Digital Self of that person was then shown in the upper right corner of the display to avoid obscuring the person’s face. A ‘select’ button was used to return to the facial images and select another Digital Self. Note: the majority of the screenshots are black as this represents transparency in the HMD. Black areas can be ‘seen through’ by participants when viewed via the HMD.

A button was provided to return to the facial images and select a different Digital Self. Whilst we piloted a number of different approaches to automatically select a Digital Self (e.g. through facial recognition or markers) and switch between recent ones, these would potentially constrain how the Digital Selfs could be employed in multi-party interaction, something we wanted to investigate as part of this study. Whilst manual selection will not scale to large groups, it was important to first understand how Digital Selfs are used in multi-party interaction before applying automatic systems that would constrain their use.
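
To make the manual selection model concrete, the sketch below captures the interaction logic described above: a grid of facial images, one Digital Self visible at a time, and a ‘select’ action that returns to the grid. It is an illustrative Python sketch under our own assumptions, not the software used in the study; the class and method names are hypothetical.

```python
# Illustrative sketch (not the study software) of the manual Digital Self
# selection logic: one Digital Self visible at a time, and a 'select' action
# that returns to the grid of facial images.

class DigitalSelfViewer:
    def __init__(self, digital_selfs):
        # digital_selfs: dict mapping a participant's name to their Digital Self image
        self.digital_selfs = digital_selfs
        self.current = None  # None means the grid of facial images is shown

    def select_face(self, name):
        # Open one Digital Self; only one can be shown at a time.
        if name in self.digital_selfs:
            self.current = name
        return self.digital_selfs.get(name)

    def back_to_faces(self):
        # The 'select' button: close the current Digital Self and show the face grid again.
        self.current = None


viewer = DigitalSelfViewer({"P13": "p13.png", "P14": "p14.png"})
viewer.select_face("P13")   # P13's Digital Self is now shown in the corner of the view
viewer.back_to_faces()      # return to the facial images to pick someone else
```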

When participants were comfortable using the HMD, they were told they would go to another room to interact with the other participants. In line with previous work on initial interactions amongst strangers, we avoided providing a specific task to participants. Participants were told that they should ‘get to know each other’. This is a typical task used in initial interaction studies of face-to-face interaction (Tidwell and Walther, 2002). Participants were instructed that they could interact as much or as little as they wanted, and could use the Digital Selfs as much or as little as they wished (including not at all).

When all participants were ready, each was taken to the same room. This was a seminar room, with approximately 7 m × 4 m of open space in the middle where the participants were able to move without restriction. Each participant was directed to a location around the open space, so that all participants were equidistant and stood at least 3 m from each other. This placed participants in the far social distance of each other (Hall, 1966, p.123). If participants started closer to each other than this, it would have appeared rude if they did not interact. In far social distance individuals can choose whether or not to interact with others (Hall, 1966, p.123). This ensured participants were all at a “browsing” stage (as discussed in Section 5.2.2).

During the event each HMD logged interaction with the Digital Self application, such as opening and closing Digital Selfs. We also recorded a wide-angle video of the room (showing how participants moved in the space). To record interpersonal interactions, each participant wore a MeCam Classic wearable video camera around his or her neck (see Figure 5 (right)). Whilst participants could interact for as long as they wanted, we ended each session after 45 mins. Previous studies of initial one-to-one interactions have been as short as a couple of minutes (Douglas, 1990), but we increased the length as there are more interactions to study in multi-party settings, and we were interested in how the interaction developed after this initial phase. No participant stopped before the 45 mins were up.
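
As an illustration of how such logs can be analysed, the sketch below pairs open and close events to give, for each viewer, how often and for how long each Digital Self was viewed. The log format and values are assumptions made for illustration only; the actual on-device log format is not described here.

```python
# Illustrative sketch, with an assumed log format, of pairing open/close events
# from the HMD logs into viewing counts and durations per viewer-target pair.
from collections import defaultdict

# (timestamp_seconds, viewer, action, target) -- hypothetical records
log = [
    (12.0, "P3", "open", "P5"),
    (31.5, "P3", "close", "P5"),
    (40.2, "P3", "open", "P4"),
    (55.0, "P3", "close", "P4"),
]

open_at = {}                   # (viewer, target) -> time the Digital Self was opened
durations = defaultdict(list)  # (viewer, target) -> list of viewing durations

for t, viewer, action, target in log:
    key = (viewer, target)
    if action == "open":
        open_at[key] = t
    elif action == "close" and key in open_at:
        durations[key].append(t - open_at.pop(key))

for (viewer, target), spans in durations.items():
    print(f"{viewer} viewed {target}'s Digital Self {len(spans)} time(s), "
          f"total {sum(spans):.1f} s")
```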

After the 45 min session, participants were asked to complete a Likert-based questionnaire (1–7 scale, 1 = strongly disagree, 7 = strongly agree). This covered use of the Digital Self, interaction with other participants, and the interaction situation itself. As Nguyen et al. (2015) have suggested that the usefulness of topic suggestions is influenced by how introverted/extroverted a person is, we also administered a ‘Big Five’ personality trait questionnaire (John and Srivastava, 1999) to measure personality type. Participants then took part in an audio-recorded group interview. This covered their overall experience, and how the Digital Selfs were employed. Overall each session took 1.5 h.

5.1.1 Analysis

Interviews were transcribed and coded using a framework approach (Ritchie and Spencer, 1994), using stages of face-to-face interaction (browsing individuals, ice-breaking and supporting conversation) as initial codes. Room videos were analysed to identify how and when participants formed, joined and left groups. From the worn video cameras we carried out a first pass to identify where participants started or joined conversations, and where they incorporated Digital Selfs into those conversations. We then transcribed and analysed these sections in more detail. On-device log files were used to determine when Digital Selfs had been opened and closed. We then triangulated between these data sources to understand how Digital Selfs were used in conversation. Questionnaire responses were graphed and one-sample t-tests were used to statistically compare responses to the neutral Likert score.
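
As an illustration of the statistical comparison described above, the sketch below runs a one-sample t-test of Likert responses against the neutral score of 4 using SciPy. The response values are invented example data, not the study's data.

```python
# Illustrative sketch of the one-sample t-test against the neutral Likert score (4).
from scipy import stats

responses = [6, 7, 5, 6, 4, 7, 6, 5, 7, 6]   # hypothetical 1-7 Likert ratings

t_stat, p_value = stats.ttest_1samp(responses, popmean=4)
mean = sum(responses) / len(responses)
print(f"mean = {mean:.1f}, t = {t_stat:.2f}, p = {p_value:.4f}")
```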

5.2 Results

5.2.1 Overview of events

Figure 7 provides an overview of how Digital Selfs were employed within the three stages of conversation across all events. Participants employed Digital Selfs at all stages. However, use of Digital Selfs did not dominate interaction. 11 out of 23 participants used Digital Selfs to browse individuals before approaching them, forming initial subgroups with them, and starting face-to-face interaction. The subgroups can be seen in Table 1. Twelve participants moved directly to interaction without accessing Digital Selfs first. We discuss this more in Section 5.3. Digital Selfs were mostly accessed and used at the beginning of the events (see Figure 8), and in 5 out of the 9 initial subgroups content was employed as a ‘ticket’ to support ice-breaking. Overall, Digital Selfs were useful in this role, helping to start conversations. Participants agreed with the statement ‘The other person’s Digital Selfs helped to initiate conversations.’ with a mean score of 6.0 (SD = 1.4), which differs significantly (one-sample t-test, p < .01) from the neutral score (4).

Figure 7. An illustration of possible transitions through the three stages of conversation, and the proportion of participants (for browsing and supporting conversation) out of 23, and initial conversational groups (ice-breaking) out of 9, that took them. The solid arrows represent the possible transitions between phases when the Digital Self was not used, and the dashed lines represent possible transitions when the Digital Self was used. The two participants who ‘skipped’ ice-breaking took longer to browse individuals and thus joined groups that were already formed and engaged in conversation.

Table 1. An overview of how participants in each event formed initial subgroups. Apart from Event 3, all subgroups eventually combined into one discussion group. No subgroups split and reformed.
Figure 8. Switching between Digital Selfs was fastest at the beginning of events, when each participant switched Digital Selfs more than once a minute.

Use of Digital Selfs was not confined to ice-breaking. In addition to providing ice-breakers (see Section 5.3), the topics from Digital Selfs were rich and supported strangers in getting to know each other. Participants were positive towards the statement ‘I found the other persons’ Digital Self useful in getting to know him/her’, with a mean value of 5.1 (SD = 1.3), which differs significantly (one-sample t-test, p < 0.05) from the neutral score (4). Participants were comfortable both incorporating topics from the Digital Self into the conversation and choosing topics from outside it. 19 participants incorporated at least one topic from a Digital Self into conversation, whilst 21 incorporated at least one topic from outside a Digital Self. As such, Digital Selfs supported conversation but their use did not dominate it.

Although we used a more controlled study design, participants did not feel under pressure to use the Digital Selfs. Having a more controlled study allowed us to study in more detail how Digital Selfs were employed. Our method was based on prior studies of face-to-face interaction (Tidwell and Walther, 2002; Nguyen et al., 2015), but raises the question of whether the events were too artificial.

To the statement ‘I felt that the situation for conversations was natural.’ participants responded with a mean score of 5.0 (SD = 1.8), which differs significantly (one-sample t-test, p = 0.032) from the neutral score (4). To the statement ‘I enjoyed the conversations with the other persons.’ participants responded with a mean score of 6.1 (SD = 1.1), which differs significantly (one-sample t-test, p < 0.0001) from the neutral score (4). We can conclude that participants did not find the situation unnatural, as they were able to mingle freely and interact (or not) as they wished. Similar to a real event, participants were also able to remove the HMDs (two participants did this) and leave the event at any time (none did).

5.2.2 Browsing for individuals

Initial interactions

Eleven participants viewed Digital Selfs before initially approaching another participant. Figure 9 illustrates how many times and how often each participant accessed others’ Digital Selfs before their first interaction, as well as how these initial interactions formed the first subgroups in each event. Participants accessed Digital Selfs largely to gain some idea of who the other people were before initiating interaction (P18: ‘It’s something I like to have, information about the other person before starting talking, that’s why I find concept itself fascinating, because it’s public information in a way’).

Figure 9. Openings of Digital Selfs before the participants interacted. The boxes represent participants and their physical locations in the room at the beginning, and the arrows represent how many times participants opened Digital Selfs. For example, in the first event P3 opened P5’s Digital Self once. The red areas represent the first subgroups that formed.

Whilst half of the participants viewed Digital Selfs before face-to-face interaction, this was often not to identify who to talk to. Only two participants, P3 and P17, took time to browse the Digital Selfs of all other participants before deciding on which group to join (P17: ‘I did look one-by-one at their profiles, digital profiles and based on those I decided which group, which pair to join.’). P17’s browsing can be seen in Figure 9 and also as a function of time in Figure 10. At time point A, P17 joins the pair P13-P14 after viewing everyone’s Digital Selfs. In other cases participants moved towards the nearest person, and it was this individual’s Digital Self that was accessed. It is likely that the ‘cost’ of joining the nearest group was lower (P16: ‘Yeah cause I was like this is a shorter way than going over there. So then I just took up the Digital Self and walked over there.’). This may be in part due to some events having only three participants, as most of the lack of pre-browsing was in three-person events (see Figure 9). In smaller groups there is less reason to browse before interaction, as a conversation requires at least two participants anyway. However, there were other reasons: 12 out of 23 participants chose not to access the Digital Self before interacting face-to-face. Participants found having to access the Digital Self too demanding at the start, and approaching a random person was easier for initial interactions (P15: ‘I first talked to a person because there was too much things going on that I could concentrate on. The pictures, the person, and I was like I don’t know what to do with all these things.’).

Figure 10. Spatial positions of participants (above) and timeline of openings of Digital Selfs (below) in Event 4. Physical locations of participants are mapped onto a timeline. In the timeline, one row represents one participant and shows when she/he opened other participants’ Digital Selfs (represented by colours). White areas in the timeline represent periods where a participant did not have any Digital Self open. Time point A: P17 has browsed for individuals. Time point B: subgroup P15–P16 joins subgroup P13–P14–P17. Time point C: P16 incorporates a new topic from a Digital Self into conversation. Time point D: All participants (except P13) have the same (P13’s) Digital Self open.

The small group sizes clearly impacted the amount of pre-browsing of Digital Selfs before deciding which group to join. Small groups were in part necessary due to the number and availability of the HMDs, as well as the previously discussed decision to have Digital Selfs manually, rather than automatically, selected. We discuss this more under future work. However, for small groups, individuals accessed Digital Selfs more to support initial interactions than to make a choice over who to interact with.

Post initial interaction browsing

After the initial subgroups had formed, it was common for participants to browse the Digital Selfs of participants in other subgroups. Participants regularly accessed the Digital Selfs, contributing to the more frequent switching observed during the early stages of the events (see Figure 8 for an average of switching instances over all events). This was used as a way to determine, both at the start and during the event, if it was potentially beneficial to join a different subgroup. In Events 1 and 4, where both subgroups merged during the event, every participant checked the Digital Self of at least one person from the other subgroup before the merging occurred. Digital Selfs were used as a means of evaluating the potential of interaction with others, compared to continued interaction with the person an individual was already talking to. This “sneak viewing” was possible as the Digital Selfs were delivered through private displays, so could be accessed without the awareness of other members of the subgroup, helping to avoid any obvious sign that a participant wanted, or was considering, ending the current conversation (P5: ‘I had difficulties to see exactly where everybody was looking. So that helps also on the point that the other one can go through the pictures and somehow it’s not that rude, because you can’t see it.’).

Browsing of Digital Selfs can help individuals identify relevant others, and participants were positive about how self-curated representations could help them express themselves (P20: ‘...it [Digital Self] gives additional layer of way you can express yourself.’). However, participants also highlighted a potential danger: if a Digital Self appeared uninteresting to them, it would discourage, rather than encourage, interaction (P9: ‘It’s a good way to start a conversation, but in a way it could affect people that they see each other’s Digital Selfs before starting the conversation. So they might think that OK, that person isn’t interesting so I won’t talk to him or her.’).

5.3 Ice-breaking

Participants usually moved towards the nearest person, forming the first conversational groups (see Figure 9 and Table 1). Nine initial groups formed across all events. In 5 of these, Digital Selfs were referred to within the first 10 s of the start of the conversation. In this way the Digital Self was considered as publicly available common ground, acting as a ‘ticket’ (P2: ‘I usually don’t start any conversation with anyone in the social events, so I think it would help me to start if I know something’, and P5: ‘We have the same background somehow, so it was much easier to ... make up some stuff, because you had the pictures and you had the texts.’). Digital Selfs in general were perceived as useful for obtaining tickets (P22: ‘You have certain amount of topics which you know that the other people are interested, so when you start a conversation on a certain topic, which someone is interested, definitely that person will respond positively because you know that he or she’s interested in the topic, so it will be a very good start-up for the conversation. So that was really helpful.’). Digital Selfs were also found to be useful in helping those who would normally not interact with strangers to do so (P7: ‘I don’t usually talk that much with strangers, or any people. So it was a nice experience.’). There is evidence that topic suggestions would be more useful for introverted individuals (Nguyen et al., 2015). However, in a standard personality test (John and Srivastava, 1999) our participants had a mean extraversion score of 3.0 (SD = 0.9) on a scale of 1 to 5. We did not find a statistically significant correlation (r = −0.17, p = 0.44) between the usefulness of the Digital Self and an individual’s extraversion score.
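
As an illustration of the correlation reported above, the sketch below computes Pearson's r between extraversion scores and Digital Self usefulness ratings with SciPy. The values are invented for illustration and are not the study data.

```python
# Illustrative sketch: Pearson correlation between extraversion (Big Five, 1-5)
# and rated usefulness of the Digital Self (1-7 Likert). Invented example data.
from scipy import stats

extraversion = [2.1, 3.4, 3.0, 4.2, 2.8, 3.6]
usefulness = [6, 5, 6, 4, 7, 5]

r, p = stats.pearsonr(extraversion, usefulness)
print(f"r = {r:.2f}, p = {p:.2f}")
```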

Whilst the Digital Self was generally perceived as useful for starting conversation and familiarising strangers, five participants did not feel that they really needed it (P9: ‘I thought the digital self was quite unnecessary. It felt like it was only in the way. And that I really wanted to get the conversation going just by myself and that it was only sort of like disturbance on the side...But I don’t think the Digital Self changed the conversation that much...it felt like it would have gone the same way even with or without the Digital Self.’). It was largely the content of the Digital Self that affected its perceived usefulness. Content that presented very clear information, or which did not lend itself to discussion, tended not to be used. For example, P9 interacted with P12, who had only one image in his Digital Self (P9: ‘I opened it in the beginning, but then I think we already had started talking about things that were there [in the Digital Self] and it didn’t really give anything new to the conversation.’). Therefore we believe that how an individual constructs his or her Digital Self has a greater impact on its use as an ice-breaker than an individual’s level of extraversion.

5.3.1 Digital self use supporting conversation

Accessing during conversation

After initial ice-breaking, Digital Selfs continued to be accessed and referred to during conversation. Digital Selfs were used to provide new topics to continue the conversation, for example when silent moments occurred or discussion of a topic naturally came to an end (P6: ‘When the conversation is really going good, and you are finding the topics of mutual interest, then you don’t look at the digital self. But if you think you are running out of the topic, then you might go to the digital self of the other person so that you are going to find new things. But it only helps when the conversation is detracting and you are not finding anything new to talk. But if the conversation is really good, you don’t care about the digital self at that point’, P9: ‘And at one point or another I think when there was a silence I checked them again.’). However, this also extended to ongoing conversation, where participants would access the Digital Selfs of other members of their subgroup to identify new topics to ‘pivot’ the conversation towards, away from a current topic that interested them less (P12: ‘After the conversation started, after having ice breakers and when the conversation was going, then I went to see the Digital Self, and I guess it, during the conversation, I guess for two times, I saw it so that if I can dig for a new topic of conversation.’). Time point C in Figure 10 illustrates this behaviour, when a topic from a Digital Self was incorporated into conversation: P16 cues content from a Digital Self into conversation after considering it for one minute. This causes the other participants in the same subgroup (including the owner of the Digital Self) to access the referred-to content and continue the conversation from it. In this way all parties switch to the same Digital Self, using it as common ground to pivot the conversation topic. Another practical reason for returning to Digital Selfs was obtaining the names of others, when available. It is common that people do not remember the names of other people they meet, especially where there is more than one name to remember (McWeeny et al., 1987). The Digital Selfs that contained names helped with this (P1: ‘I think it was useful to remember the name, if I could see it, for example your name I can remember, but I already forget your name.’).

The points at which participants chose to access and browse the Digital Selfs of others were largely driven by their current role in the conversation. Participants who were active (either speaking or being directly addressed) focused on the conversation and did not change from the Digital Self they currently had open. Only when participants were unaddressed recipients did they access and switch between Digital Selfs (P5: ‘I found myself looking at them when I wasn’t part of the conversation, when other people were talking then I took some time to browse through them and read the text.’). Being active in the conversation required participants’ full attention, and accessing the Digital Self at the same time was considered too demanding (P10: ‘I wanted to listen to the person at the time, but then look at the pictures, I’m a simple human being so it’s very hard to concentrate on just one [picture in Digital Self]’). In group conversations, participants had more time to view Digital Selfs (P14: ‘Especially if you want to watch the other person or listen to what they’re saying and look at the Digital Self, then it was too confusing. But in a group you get more time to look at the Digital Selfs. But even then you cannot talk and look at it at the same time.’). Whilst manual control over which Digital Self to view is important (supporting the ‘sneak viewing’ previously discussed), when participants are engaged in conversation Digital Selfs require a more automatic approach, sensitive to the current conversational role of the participant.
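To make this concrete, the sketch below encodes the observed behaviour as a simple viewing policy. It is purely illustrative: the system we studied used manual selection only, and the sensed inputs (current speaker and addressee) and all names in the code are our own assumptions.

```python
# Hypothetical sketch of a role-sensitive viewing policy for Digital Selfs.
# Assumes some external sensing provides the current speaker and addressee;
# this was not part of the system studied, which used manual selection only.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Participant:
    pid: str
    viewed_self: Optional[str] = None  # pid of the Digital Self currently open

def conversational_role(pid: str, speaker: str, addressee: Optional[str]) -> str:
    """Classify a participant's current role in the conversation."""
    if pid == speaker:
        return "speaker"
    if pid == addressee:
        return "addressed"
    return "unaddressed"

def request_switch(p: Participant, target_pid: str,
                   speaker: str, addressee: Optional[str]) -> bool:
    """Allow a switch only for unaddressed recipients; active participants
    keep whatever Digital Self they already have open."""
    if conversational_role(p.pid, speaker, addressee) == "unaddressed":
        p.viewed_self = target_pid
        return True
    return False

# Example: while P2 addresses P3, P5 is an unaddressed recipient and may browse.
p5 = Participant("P5")
print(request_switch(p5, target_pid="P7", speaker="P2", addressee="P3"))  # True
print(request_switch(p5, target_pid="P2", speaker="P5", addressee="P3"))  # False, P5 is speaking
```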

Topic selection

The topics that participants discussed were similar to those identified in non-augmented conversation between strangers (e.g., interests, where the person lives, what they do, education, occupation, social relations, places they have visited and travelling) (Svennevig, 2000). The Digital Selfs did not substantially change the ‘topic space’, but did widen it, making more topics visible and accessible and allowing involving (deep) topics (Svennevig, 2000, p.91) to emerge. Thus, the Digital Selfs provided possibilities for participants to move out of conventional interview-style discussion (e.g. ‘What do you do?’ and ‘Where do you come from?’ (Svennevig, 2000, p.91)) towards deeper, richer topics that supported more meaningful disclosure and conversation on areas participants were already interested in. For example, the transcripts in Figures 13 and 14 illustrate how the conventional interviewing questions were skipped by incorporating questions about the images in a Digital Self.

Whilst participants incorporated topics from the Digital Self, they did not feel under pressure to do so. Often, they felt that coherence in the conversation, and supporting its natural evolution, was more important than dynamically changing the topic (P13: ‘I thought in the beginning, that 45 minutes is a really long time to just come up with conversation topics, but in the end it wasn’t. Cause once we got the conversation going and we joined as a big group, then it just went on.’). This revealed a tension, where participants would have liked to talk about a highly relevant shared interest, but did not want to disrupt the flow (P16: ‘Conversation was ongoing somewhere else. You had swimming and I love swimming, I mean if we were just the two of us in a room, that would have been the first thing to pick.’). In addition, some Digital Selfs that contained only basic information became quickly exhausted; participants found there was nothing new to incorporate from them, so there was no need to try to pivot to something new (P9: ‘I used it in the beginning, but then I think we already had started talking about things that it was already going and it didn’t really give anything new to the conversation.’). It may also be the case that, once the Digital Self has supported conversation through setting talk and onto rich topics, the conversation becomes self-sustaining, and digital augmentation (at least in the static representation we used here) becomes less useful.

Joining conversations

Digital Selfs were also used when an individual joined a pre-existing group (in addition to when participants ‘sneak viewed’ Digital Selfs outside their current group). Whilst only a limited number of participants joined an ongoing conversation (six over all events: three in Event 1 and three in Event 4), Digital Selfs were accessed in five of these instances. This was done either by the person who joined the group (allowing him or her to integrate with the group) or by other members of the group (to find out more about the new attendee).

Figure 11 illustrates the former, with P3 joining the group P5 is a member of. P3 had looked at P5’s Digital Self, in which P5 had written his name, and used it to integrate into the group. Other members of the group then accessed P3’s Digital Self to find out more about him. Thus we argue that Digital Selfs facilitated the establishment of common ground between newcomers and existing members of a conversation, and enabled the integration of newcomers into that conversation.

Figure 11. An example from Event 1 on how P3 joined a group after browsing Digital Selfs for the most appropriate group to join

Cueing digital self information

When Digital Selfs were used to open conversation, they were most often explicitly referred to, with participants ‘cueing’ that they were referring to them (P11: ‘So I see from your profile that you like to avoid political matters. Why is that?’). As the HMDs prevented participants from directly showing what they were referring to, they either cued the use of the Digital Self verbally, or pointed towards the HMD (indicative gestures (Clark and Brennan, 1991)). When participants referred to particular content within a Digital Self, this was done verbally by describing its position in the layout, as illustrated in Figure 12. Participants had to engage in management work to ensure everybody understood which Digital Self, and which content within it, was being referred to. The importance of this was highlighted in cases where participants did not explicitly cue information: this caused a breakdown in the conversation with the other parties, necessitating work to repair the conversation by re-establishing common ground (Clark and Brennan, 1991). Repair was often accomplished by providing the missing cue of where the information came from, as illustrated in Figure 13.

Figure 12. Example of using layout in communication

Figure 13. An example from Event 6 where common ground broke down due to a lack of explicit cueing of the Digital Self, and the consequent repair

Such issues were largely caused by asynchronous opening of the Digital Selfs, and can be seen in Figure 10, where the same Digital Self (P13’s) is open for every participant, except P13 herself, only once (at point D). Prior work, focusing on one-to-one conversation, has not identified these issues, since there are no ‘alternative’ visualisations to view. Whilst sensing technology and control mechanisms could be employed to ensure all participants in a group are viewing the same Digital Self, as discussed in Section 5.2.2, this may also have negative impacts. However, an ability to quickly synchronise the Digital Self that all participants see would remove much of the management work of cueing content.
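As one illustration of such a mechanism, the sketch below broadcasts the identifier of the cued Digital Self to every HMD in the group, so all members immediately view the same one. It is a design sketch only: the group membership, networking and HMD display calls are stand-ins we have assumed, not part of the studied system.

```python
# Hypothetical sketch: syncing the cued Digital Self across a conversational group.
# Networking and the HMD display API are stand-ins; only the coordination logic is shown.
from typing import Callable, Dict, List

class GroupSync:
    def __init__(self, members: List[str]):
        self.members = members
        # Maps each member to a callback that opens a Digital Self on their HMD.
        self.displays: Dict[str, Callable[[str], None]] = {}

    def register_display(self, member: str, open_digital_self: Callable[[str], None]) -> None:
        self.displays[member] = open_digital_self

    def cue(self, sender: str, digital_self_id: str) -> None:
        """When `sender` cues content, push the same Digital Self to every other
        group member, removing the need to verbally establish which one is meant."""
        for member in self.members:
            if member != sender and member in self.displays:
                self.displays[member](digital_self_id)

# Example usage with stub display callbacks:
sync = GroupSync(["P11", "P13", "P16"])
for m in sync.members:
    sync.register_display(m, lambda ds, m=m: print(f"{m}'s HMD opens {ds}'s Digital Self"))
sync.cue(sender="P16", digital_self_id="P13")  # P11 and P13 now see P13's Digital Self
```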

5.3.2 Media use from digital selfs

When Digital Selfs were incorporated into conversation, this was usually done through images. Participants did not just ask about the contents of the images in Digital Selfs, but also how the images were taken or created. 49% of all images in the Digital Selfs were selected as topics in the conversation, whereas only 18% of text instances were referred to.

Text data was often found to be basic and self-explanatory, and was not felt by participants to provide a rich topic of conversation (P13: ‘His was self-explanatory, so I didn’t feel the need to talk about them anymore.’). This also extended to more general adjectives describing a person (e.g., ‘easy-going’, ‘energetic’ and ‘optimistic’), which were also left out of conversation. Participants wanted to make their own interpretations of other people based on interaction, rather than be told what a person was like (P11: ‘It’s not so interesting to get to know new people if you already know something about him or her, so you don’t have to dig all the specialties about the person if you already see the things he likes or doesn’t like.’).

Images, on the other hand, provided much richer, more ambiguous conversational possibilities. Asking about images in another person’s Digital Self (see Figure 14) was the most common source of ‘tickets’. Images were both concrete enough to formulate a reasonable question about (P23: ‘I have used like the picture of the pyramids. The other people will be interested like, oh, what is this pyramid. So it tends to initiate more conversations.’), yet ambiguous enough to stimulate a rich conversation (P10: ‘Picture [in Digital Self] was puzzling at first, but it was something to start the conversation with.’ and P13: ‘The information alone was not enough, but when paired with the individual during the conversations it helped a lot.’). One participant who used images described this balance (P14: ‘My content [in Digital Self] was quite simple, but there’s a long story behind it.’).

Figure 14. An example from Event 4, where content from the Digital Self of P15 was used as an ice-breaker

As discussed, where individuals did not use Digital Selfs as ice-breakers, this was in part due to the use of basic or textual content in the Digital Self (see Section 5.3). This extended to instances where the Digital Self was not used in conversation because it was too basic. Whilst we saw substantial usage of Digital Selfs, we would expect greater usage if the Digital Selfs were richer, incorporated more ambiguous content and were more image-based. As discussed in Part 1, such rich and ambiguous Digital Selfs were more valuable in supporting conversation.

5.3.3 HMD issues

We chose to present Digital Selfs on head-mounted displays, and whilst these are not the only possibility (see Section 2.2.2), they provide good co-location between the face and the Digital Self. In our study, their private nature also supported ‘sneak viewing’ of others’ Digital Selfs.

However, participants did raise issues with the current generation of HMDs that we used, at times finding the relatively thick glass covering a wearer’s eyes distracting (P9: ‘I wanted to see the person’s eyes when I’m talking to them, then the glasses are, you know, the screens are kind of a little bit covering up the eyes.’). This also meant that participants could appear distant (P15: ‘Yeah it felt like, like talk to a person then actually looking like there and like yeah. Like because when you have conversation you have to have the eye contact and be like present, so it felt weird that the other person is somewhere in this weird different world.’). We did identify a few instances where this lack of eye contact created confusion between participants about who was being addressed; Figure 15 shows an example. Whilst this is an issue of current HMD technology, eye contact is likely to become clearer as the technology evolves. Overall, these issues did not significantly disrupt face-to-face interactions, with participants neutral (mean score 4.4, SD = 2.0) towards the statement ‘The smartglasses did not distract me from the conversations’.

Figure 15. An example of confusion over who is being addressed

6 Discussion

Our study has generated significant new insight into how individuals choose to augment themselves with digital media in multi-party interaction. By having participants curate representations themselves, and by studying their use in multi-party situations, we have gone significantly beyond the current state of the art, which has so far largely focused on automatic selection of media in strictly one-to-one interactions. We frame our discussion around our two research questions.

6.1 RQ1: how do individuals choose to represent themselves to strangers with digital selfs?

Participants strongly favoured images over text as content to incorporate in their Digital Self. Only a minority of participants chose text-only Digital Selfs. Additionally, the majority of images participants chose to include came from outwith existing social and digital media accounts, with over 70% being sourced from a Google image search. This is surprising given the emphasis of existing work (Chen and Abouzied, 2016; Jarusriboonchai et al., 2015; Nguyen et al., 2015), which focuses both on matching or using social media from existing accounts, and on presenting those matches to users as text. Our results indicate that whilst these techniques identify common interests, they are unlikely to represent those aspects of self that individuals would wish to present publicly to strangers.

The use of such images was mostly due to participants wanting to express ambiguity in their Digital Self, and images were a good way to do this. Participants considered that interpretation of an image provided only some insight into its meaning, and into how it represented them. This meaning could be further disclosed through conversation, facilitating the use of the Digital Self in boundary regulation (Lampinen, 2014) and allowing an individual to dynamically manage disclosure through conversation. From Part 2 of the study there is also evidence that such ambiguity in presentation through images stimulates and supports conversation more than providing simple textual information. Such Digital Selfs were found to be useful throughout all stages of conversation. The lack of use of Digital Selfs by some participants was at least in part due to the information in those Digital Selfs being textual and unambiguous (name, likes, personality, occupation, etc.). Participants found these unstimulating, and often did not use them or incorporate them into conversation. Whilst this is not the only reason participants chose not to use Digital Selfs, those that provide only basic facts or interests are unlikely to enhance interaction. Unlike prior work, such as Nguyen et al. (2015), because we had multiple parties and covered all stages of conversation there was less explicit focus on use of the Digital Self as the ‘task’, and participants felt able to not use it when they felt it would not help them. Further study is required to confirm this, but participants should be encouraged to create richer and potentially more ambiguous Digital Selfs if these are to have greatest benefit. By avoiding automatic selection, which identifies only those who are similar, self-curation avoids isolating individuals from dissimilar others and potentially contributing to a ‘filter bubble’ (Resnick et al., 2013). As with online media, digital augmentations using the same automatic recommendation algorithms may simply keep like-minded people together, whilst, as noted by Mayer et al. (2015), individuals may often be open to meeting others who are dissimilar. In many work situations it is often necessary to work with others who are dissimilar, and understanding those differences may be as important as identifying similarities.

Using these richer and more ambiguous representations may also reduce the barrier to interaction with others, providing not only a ticket to commence an initial interaction, but also opportunities for setting talk that have greater potential to lead to richer interactions. Such media may provide accelerated opportunities to establish better awareness of others and common ground with them. Within CSCW, this may be a practical approach where ad-hoc teams must form quickly and effectively in a short time period. For example, Wong and Neustaedter (2017) have identified how cabin crew must quickly learn about and trust one another in a safety-critical environment. Similarly, Lykourentzou et al. (2017) note how having a deeper relationship with team members before those teams are formed enhances task performance. Although we have focused on more social scenarios, and the media chosen may well be different, the use of ambiguous media may provide an effective way to help teammates form such relationships faster.

6.2 RQ2: how are digital selfs used at each stage of interpersonal interaction between strangers in multi-party settings?

Existing work has largely focused on supporting individual stages of interaction, and evaluated work has addressed only some of these (e.g. (Chen and Abouzied, 2016; Jarusriboonchai et al., 2015; Nguyen et al., 2015)). Existing solutions are unlikely to support all three stages. For example, whilst the badges of Jarusriboonchai et al. (2015) present simple information to support ice-breaking, they are unlikely to be useful during conversation. Nguyen et al. (2015) focused only on supporting conversation, and although their system provided new topics to discuss, it did not support evolving conversation beyond basic interaction. Our work found that Digital Selfs were employed at, and were useful at, all stages of conversation. Most importantly, and unlike prior work, they were also useful after the initial ice-breaking phase and may support moving onto rich topics. Whilst Digital Selfs were still accessed once participants had established conversation, the frequency of accessing and switching reduced during the course of the study. This indicates that conversation had reached richer topics about which there was more to talk. This is unlike Nguyen et al. (2015), who report more referral to, and use of, topic suggestions as their study progressed. We argue that Digital Selfs supported richer conversation, which did not need augmentation to sustain it, whilst towards the end of interaction the automatic topic selections of Nguyen et al. (2015) were used simply to keep talking, rather than making those conversations more insightful. Again, this highlights how Digital Selfs might support individuals who must work together (e.g. (Wong and Neustaedter, 2017)) to form better relationships based on aspects of their interests, increasing common ground between them and, if not forming closer personal relationships, at least gaining a better understanding of others.

Whilst participants used Digital Selfs at all stages of interaction, not all participants used them at every stage; equally, it was not the case that some participants did not use Digital Selfs at all. Participants used Digital Selfs when they felt doing so would be beneficial, and felt able to ignore them when they did not. There were multiple reasons why this was or was not done, but the Digital Self did not dominate the interaction, nor replace existing conversational practices. If participants were engaged in an interesting conversation, the Digital Self would sit ‘on the side’, with topics participants found and wished to introduce left until a suitable point emerged. Participants were comfortable using content as an ‘ice-breaker’, either starting conversation directly with it, or introducing themselves first. Our event is an example of an ‘open-region’ (Goffman, 1963), where there is an assumption, due to the context, that interaction is permissible. This is also true of all other formal studies of face-to-face augmentation (Nguyen et al., 2015; Chen and Abouzied, 2016; Douglas, 1990; Maynard and Zimmerman, 1984). It is challenging to study such systems outside of an ‘open-region’ environment, where such assumptions cannot be made (e.g. at a coffee shop), but there would be significant value in doing so and in understanding how this might impact their use in ice-breaking. Digital Selfs also worked as ‘advertisements’ for participants, as they guided which conversation to join. Digital Selfs played a key role in establishing common ground when participants (newcomers) joined conversations. Newcomers accessed the Digital Selfs of the members of their ‘new’ conversation group before joining, and newcomers were integrated into conversations by incorporating a conversation topic from the newcomer’s Digital Self. Again, in relation to more general CSCW, there is potential to support faster on-boarding of new team members or colleagues into existing groups. A potential future avenue of investigation is to consider how Digital Selfs perform in creative environments, such as academic departments or maker spaces, where individuals are open to new collaborations with others. Existing work has investigated public displays (Bilandzic et al., 2013) for this, but there is value in considering how co-located representations of self (such as current ideas or work) might support this, and leverage the ‘advertisement’ role identified by participants.

Whilst participants did browse others before starting interaction, and again before merging groups, we observed much less of this behaviour than we expected. We argue that in small multi-party gatherings there is less need to be selective about who to talk to. In the three-person groups we observed fewer instances of browsing Digital Selfs before moving to ice-breaking, although Digital Selfs were used during ice-breaking in these groups. In the 4–5 person groups we saw more evidence of this ‘pre-browsing’, and we would expect more in even larger groups. All participants started interaction at the same time at the beginning of the study. This follows the existing study methods (Nguyen et al., 2015; Douglas, 1990) that we adopted, and a desire to ensure everyone started from the same position. However, it is somewhat unlike how individuals would enter such gatherings (such as parties), where attendees typically arrive in a staggered order rather than all at once (although several may still arrive together). From participant comments, dealing with the Digital Self at the start of the event could also be demanding: participants found it too much to both browse the Digital Selfs and decide who to talk to. We carried out our study in a controlled way to better consider how it related to existing one-to-one studies (such as (Nguyen et al., 2015)), and to ensure we focused on interaction with strangers. However, there is value in studying the application of Digital Selfs to pre-existing events (which may have a mix of friends and strangers) to better understand how they are used for initial browsing of others and choosing who to interact with.

To do this, it is necessary to consider automation support to access and move between Digital Selfs. Such issues have not been uncovered in prior work, due to its focus on one-to-one interaction where there are no alternative augmentations to view. We allowed only manual selection of the Digital Self, as we did not want to constrain how Digital Selfs were used by enforcing an automatic system to switch between them. In many cases manual selection was important. For example, it allowed participants to ‘sneak view’ the Digital Selfs of individuals outside their current group, or, where an individual was an ‘unaddressed recipient’ (Gibson, 2003; Traum, 2004), to look for interesting topics that they might wish to pivot to when a natural break in the conversation occurred. However, it was also clear that fully manual selection will not work in all situations. In addition to the outlined issues with browsing at the start of interaction, manual selection often led to breakdowns during conversation. Digital Selfs acted as an invisible ‘layer’, providing resources to the conversation. Although participants were explicit when cueing information that came from the Digital Self (through explicit utterances or gesturing towards the HMD), participants were often not all viewing the same Digital Self. This led to a collapse of common ground, requiring management work to access the correct Digital Self to ‘repair’ (Clark and Brennan, 1991) and re-establish common ground. In such cases an automatic approach, or the ability to quickly ‘sync’ a common Digital Self amongst group members, would be beneficial. Further work on mechanisms to do this is required; incorporating sensors to determine automatically who is in the participant’s group, and what the user’s current role is (e.g. whether he or she is currently talking), is one approach. Existing CSCW work (such as (McCarthy et al., 2004; Chen and Abouzied, 2016)) has considered only one-on-one interaction, and has not yet considered how groups form, re-form and interact. Our work expands on this to identify how digital augmentation is used in these phases, and will help in developing mechanisms that support automatic switching between augmentations. Such mechanisms are not trivial, but are essential to further study of Digital Selfs in the larger group scenarios we previously discussed.

6.3 Future work

Our future work is largely focused on addressing the issues raised in the discussion. By developing better support to synchronise and move between Digital Selfs, we will be able to study their use in larger groups of individuals and to augment existing events. By encouraging participants to generate richer and more ambiguous Digital Selfs, we will be able to improve our understanding of how they can be used in browsing for individuals at the start of interaction. However, such events are largely a mixture of strangers and individuals with a prior relationship (e.g. friends). Prior work, including our own, has largely focused on strangers, yet an interview study that formed our initial ideas and motivated this work identified that there is a need to tailor content for different audiences (McGookin and Kytö, 2016), as does existing work on social media management (Farnham and Churchill, 2011). Individuals may wish to create different Digital Selfs for different relationships, yet interaction may occur in a mixed group. Further study across different situations and prior relationships between individuals is required to strengthen and deepen our understanding of the digital augmentation of interaction. Whilst we argue there is value in general interaction, it is clear that Digital Selfs may be valuable in a number of more focused, work-related environments. Although the content individuals choose would obviously differ from the more general social scenarios we used here, there is value in applying our approach to situations where personal relationships between team members are important, but where the work situation constrains the time to develop them (e.g. (Wong and Neustaedter, 2017)).

7 Conclusion

Our work has been the first to consider how digital, user-curated representations of self can be incorporated into multi-party face-to-face interaction. Unlike prior work in both HCI and CSCW, which focuses on algorithmic matching, potentially causing ‘filter bubbles’, and does not consider how users would wish to be represented, we identified that the media users choose is often ambiguous and comes from outside the existing digital and social media services that algorithmic matching employs. Our study also identified that user-curated representations were effective at all three stages of face-to-face interaction, but did not dominate conversation. Prior work has investigated only a subset of these stages and has not considered the importance of their ‘non-use’. Our findings significantly advance knowledge in the emerging field of how face-to-face interaction can be augmented with digital media.