
1 Introduction

The adoption of smartwatches is increasing quickly, with an estimated 18 million units in 2015 [48], up from 6.8 million in 2014 [49], and moving towards Gartner’s estimate of 50 million in 2016 [47]. This rapid adoption presents some interesting user research questions. For example, what does a wrist-worn technology add to current practices of technology use? How does a smartwatch change the social interaction we have learned to build around other form factors, such as the smartphone, tablet, personal computer, or connected television?

We provided 12 participants with a smartwatch for one month, with the final three days of use recorded with pairs of wearable cameras. These recordings were combined with interviews with each participant. This data gives us a uniquely detailed view of how smartwatches are used, what for, and in what context. From 34 days of recording we have over 168 h of video, including 1009 incidents of watch use – around 6 uses per hour, with each use lasting on average 6.7 s.

In previous work we have used this corpus to examine how the smartwatch fitted into, and changed, our participants’ daily routines [38], and how the social context and wearer activity influence how the watch is used [30]. Here we take a closer look at how the watch is used in conjunction with, and alongside, other technology.

2 Background

The current generation of smartwatches from Pebble, Apple, and the variety of manufacturers building on Google’s Android Wear platform has been increasing in popularity [39]. In parallel, there has been an increase in the number of wrist-worn non-watch wearables – primarily aimed at health, fitness, and the quantification of personal action [43]. This form factor has seen a lot of research in the HCI community. The mechanics of touch on small devices have been examined in detail [5, 19, 20], as has text entry for small devices [9, 12, 21, 25, 35], as well as other input modalities such as tilting and twisting the screen [46], tracing letters on other surfaces with a finger [34], interacting around the device [26, 36], interacting with just gaze and attention [1], and even blowing on the watch [10]. Despite the amount of work done on input and output for smartwatches, there has not been a great deal on the use of the smartwatch. Lyons [27] looked at traditional watch wearers’ practices in order to learn lessons that could be applied to the smartwatch. Giang et al. compared the amount of distraction caused by notifications from smartwatches and smartphones [13], while Cecchinato et al. [8] and Schirra and Bentley [42] interviewed smartwatch wearers to better understand how and why they used the device. Schirra and Bentley emphasised the importance of notifications as a watch function, as well as the importance of appearance in choosing a watch device. The smartwatch finds itself in an already busy technological space, where users necessarily have a smartphone to provide connectivity to the watch. Different constellations of devices in concurrent use have been explored [22], as well as the reasons and motivations for such use [11, 37], such as the suitability of the device to the task or environment. Tasks are divided between devices, with each one having its own role in the pattern of use [11, 14, 40].
Such patterns can be sequential [24]; however, parallel patterns have been shown to be more common with modern devices [32, 40], and these practices differ between individuals and professional groups [11, 24, 40]. A common task, and difficulty, has been managing content across devices [11, 32, 37, 40], including the synchronization of meta-data such as interaction history [11, 23, 37].

The configuration of the set of devices changes with person and location. On the go, this set of devices has been referred to as a Mobile Kit [28]; however, managing all the different device configurations can require significant effort and planning [37].

In HCI a variety of systems, interfaces, and interaction methods have been proposed to support computing with multiple devices, and while these are generally not currently supported directly on the watch platform, they provide an interesting lens through which to see multi-device use. One avenue explored was that of binding devices together, ranging from scanning for available proximate devices to connect to wirelessly, to physical methods such as synchronously pressing a button or aligning the devices [45]. Transferring content between screens as a result of direct user action [33] and migratory interfaces [2] exemplify techniques for moving between devices. Managing and switching tasks in multi-device computing environments has also been explored [3], as have collaborative systems with multiple devices and multiple users [44].

3 Methods

To understand smartwatches in vivo, we wanted to record how the watches were used, what they were used for, and the wider context in which they were used. Tracking of mobile devices, using logs and other means, has grown in popularity as devices have become important parts of almost all aspects of life [6, 7, 8, 18, 29]. Our use of naturalistic video recording of mobile device use has the advantage of affording the opportunity to capture and analyse the moment-by-moment details of how the environment, the people, and the device are connected to the use [4]. For this study, we had participants wear multiple portable wearable cameras that recorded their actions relatively unobtrusively, while allowing us to see and understand smartwatch use “in vivo” [29]. We made a small ‘sensor bag’ containing two police-issue cameras with long-life batteries that allowed them to record for eight hours each. One of the cameras was directed to record the scene around the participant (pointing forward). The second camera was connected to a small ‘stalk’ camera mounted at the shoulder of the participant (looking downwards), so as to capture the participant’s body and the wrist on which the watch was worn.

Full details of the data collection and analysis can be found in [30, 38]; in summary, we recruited twelve participants with a median age of 30, five of whom were students. Seven of the twelve participants were female. All participants regularly used an iPhone, and none had previously owned a smartwatch. Three of them regularly wore wristwatches. Participants were given an Apple Watch, with a choice of small (38 mm) or large (42 mm), and were asked to use it for a month. The last three days of use were recorded using our camera setup. On the third day, the cameras and the smartwatch were collected by a researcher and, for most participants, an interview was carried out there and then.

Our analysis started by watching the video and extracting clips for each interaction with the watch. Each clip was logged with details including who was present, the type of watch interaction, other devices being used, and the length of interaction. For our analysis sessions, all authors collectively watched all ‘watch use event’ clips, around 8 h of video in total.
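The per-clip logging described above can be sketched as a simple record structure. This is an illustrative assumption about the coding scheme, not the actual instrument used in the study; the class and field names (`WatchUseClip` and so on) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class WatchUseClip:
    """One logged incident of watch use (illustrative fields only)."""
    participant_id: str
    start_s: float                 # offset into the day's recording, in seconds
    duration_s: float              # length of the watch use
    interaction_type: str          # e.g. "glance", "maintenance", "conversation"
    other_devices: List[str] = field(default_factory=list)
    others_present: bool = False

# Two hypothetical logged clips, echoing the kinds of use analysed below
clips = [
    WatchUseClip("P01", 102.4, 3.3, "glance"),
    WatchUseClip("P01", 940.0, 16.6, "maintenance", ["laptop"]),
]

# Corpus-level statistics such as median or mean use length fall out directly
mean_duration = sum(c.duration_s for c in clips) / len(clips)
```

A structure like this makes it straightforward to compute the per-category counts and median lengths reported in the results section.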

For this paper three clips have been selected for closer analysis; these were deemed to be particularly insightful about the nature of watch use alongside the participants’ use of other devices. We drew on interactional analysis and the broader body of work in HCI that looks closely at moment-by-moment interaction with technology [7, 13, 15, 16, 17]. Each extract is thus looked at as an individual, unique incident of use – but also inspected for exemplifying patterns that we can extrapolate to be present in other situations.

4 Results

We focus on detailed analyses of three examples of use. The first is the glancing interaction, where the watch is looked at as a timepiece or to quickly sate the user’s curiosity over the topic or origin of an incoming notification. These interactions are generally short (under 3 s) and do not involve touch interaction with the watch. The second is a watch-initiated short interaction, mostly resulting from incoming notifications, that involves interaction with that message or clearing the message drawer. The third type of interaction we look at here is where the watch is an instigator of further social interaction.

4.1 Glancing

Figure 1 shows an example of the shortest of the watch-initiated types of use, the notification glance. By looking at the detailed timings we can start to unpack this interaction and how it fits into the on-going task of working on the laptop. For the sake of space and readability, in this paper we have removed the second camera view, which in this case showed part of the laptop screen and a train carriage. One point to notice is that this interaction, from the moment the wrist starts to move until the participant has resumed working on the laptop, takes 3.34 s. The time the user takes to decide that the message is of lower priority than the task currently at hand is somewhat less than 1.45 s – the time between the start of the animation displaying the message and the participant beginning to dismiss the notification, and the watch, by moving their hand back to the keyboard. A notification on the Apple Watch comes in three stages: the first stage is the indication that a notification is incoming (given as haptic feedback on the wrist, an audio alert, or both). The second stage is the first part of the animation, in which the message source is displayed with the icon of the application triggering the notification. In the third stage the notification digest scrolls onto the watchface, showing the sender of the message (if applicable) and a digest consisting of the first 100 characters of text, or a thumbnail for image data. In our corpus 19% of incidents of use (178 clips) are glances at an incoming notification, with a median length of 6.69 s.

Fig. 1.

7.29 s into the clip, the wrist starts to move. Gesture complete at 7.86 s. Message animation complete at 9.26 s; the gesture to move back begins at 9.31 s. Typing resumes at 10.63 s.

In the example shown in Fig. 1 the participant responds to the first stage by bringing their watch up, waits through the second stage (not dismissing the notification based solely on its source), and only after the message digest is displayed dismisses the message. Given the timing of the user action, and an estimated reading speed of around 3 words per second for small text on mobile devices [41], we can be reasonably sure that this message was dismissed on the recognition of fewer than 3 words, suggesting that the title or the sender was enough for the participant to decide, in this case, that the message didn’t need immediate action on her part.
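The timing argument can be made concrete with simple arithmetic over the timestamps reported for Fig. 1. The reading-speed figure comes from the cited estimate; treating the full decision window as available reading time is a deliberately generous assumption.

```python
# Timestamps from Fig. 1, in seconds into the clip
raise_complete = 7.86      # watch raised; message animation starts
dismiss_begins = 9.31      # hand starts moving back to the keyboard

decision_window = dismiss_begins - raise_complete   # ~1.45 s

# At roughly 3 words per second for small text on mobile devices [41],
# this window bounds how much could have been read before dismissal.
reading_speed_wps = 3.0
upper_bound_words = decision_window * reading_speed_wps   # ~4.35 words at most

# Part of the window is spent perceiving the digest and initiating the
# return gesture, so the number of words actually read is lower still,
# consistent with the "fewer than 3 words" estimate in the text.
print(round(decision_window, 2), round(upper_bound_words, 2))
```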

4.2 Maintenance

For our second example, shown in Fig. 2, we are again examining the use of the smartwatch alongside the user’s laptop computer. However, in this case the use is on the couch at home, and the participant is simultaneously watching television. This is a much longer interaction, taking 26.24 s in total.

Fig. 2.

The watch lights up unexpectedly due to user movement at 1.53 s. The user tilts their hand at 3.56 s and, noticing the ‘red dot’, raises their wrist to interact, swiping to reveal the notification at 4.9 s. The message digest is tapped to show the full message at 9.6 s, with the crown used from 10.8 s to 22.8 s to scroll and read the message. At 23.59 s the user taps to dismiss, then at 24.9 s uses the physical buttons to (inadvertently) take a screenshot in an attempt to return the watch to sleep. The previous activity is resumed at 29.8 s.

Here the user is engaged in maintenance work with the watch, specifically with the notifications tray, which holds notifications (read and unread) until they are dismissed. As the user types, the motion of their hand triggers the watch, possibly tricking the accelerometer-based algorithm that is tuned to detect the action of raising and twisting the wrist to see the time, as one would with a traditional watch whose face was out of the line of sight. Part of the watch interface is the ‘red dot’ which appears top and centre of the watchface to indicate that the notification tray holds an unread notification – which happens when a notification is pushed to the watch and the user does not activate the watch screen within 20 s to trigger the notification animation shown in the previous example.
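The kind of false trigger described above can be illustrated with a toy raise-to-wake heuristic. The actual Apple Watch algorithm is not public, so the features, thresholds, and function name below are purely hypothetical assumptions.

```python
def looks_like_wrist_raise(samples, lift_g=0.5, tilt_g=0.6):
    """Toy raise-to-wake detector over (x, y, z) accelerometer samples in g.

    Hypothetical heuristic: the wrist was lifted (net change along the lift
    axis) and ended tilted so the face points at the wearer. Real detectors
    are far more elaborate, but any such heuristic can be fooled by an
    incidental movement that happens to match the target motion.
    """
    if len(samples) < 2:
        return False
    (_, _, z0) = samples[0]
    (_, y1, z1) = samples[-1]
    lifted = (z1 - z0) > lift_g    # enough upward change during the window
    tilted = y1 > tilt_g           # face ends rotated towards the wearer
    return lifted and tilted

# A deliberate raise-and-twist trips the detector...
assert looks_like_wrist_raise([(0.0, 0.1, 0.0), (0.0, 0.8, 0.7)])
# ...while a small typing movement does not; a vigorous one that happens
# to match would be a false positive of the kind seen in Fig. 2.
assert not looks_like_wrist_raise([(0.0, 0.0, 0.0), (0.0, 0.0, 0.1)])
```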

The activation of the screen caught the attention of the user, and as they glance at the screen the red dot is visible. This triggers the reorientation to the watch and preparation to interact – positioning both hands in such a way that the watch screen is available for touch interaction.

Swiping down on the watchface brings up the notification drawer, and further swiping allows the participant to scroll through the notifications stored there. This 4.4 s activity is aided by the interface: in this screen, read notifications are shown in slightly muted colours to distinguish them from unread messages. Upon finding the brighter unread message the user taps it and is then able to scroll through the text of the message using the rotating crown on the side of the watch, an activity which takes a full 12 s. This is a short message, yet despite this the participant scrolls up and down twice, as if she has lost her place while reading. From this we can infer that the television in the background is competing for her attention. Since this whole sequence of maintenance was not triggered by an incoming message, this lack of urgency gives it a lower priority than the similar interaction we will see in the next example.

This is an example of a common maintenance task that we see with the Apple Watch – clearing the red dot. Given that one of the most compelling reasons for owning a smartwatch is the notification system allowing for ease of triage, the managing of missed notifications was a surprisingly common activity, comprising 9% (88 instances) of smartwatch use recorded in our corpus, with a median length of 16.57 s.

The end of this sequence highlights a common problem that almost all of our participants encountered – confusion around turning the watch ‘off.’ Because the default state of the watch is with the screen off, this is the state our participants wanted the watch to be in when they ended their interaction with it. In much the same way as they would press the lock button on their mobile phone to turn off the screen, a number of attempts to do the same with the smartwatch were observed. The methods to turn off the screen of the watch are to turn the wrist away from the body (so the screen does not turn off while still visible to the user) or to cover the watchface with the palm of the opposing hand. The watch, however, has two seldom-used physical buttons. In the operating system version current at the time of data collection, the rotating crown could be pressed to bring up the application launcher screen, and the button on the opposing side of the watchface could be pressed to bring up a quick-menu for contacting friends (much like a speed-dial). Pressing both buttons simultaneously takes a screenshot of the watchface and transfers it to the paired smartphone’s photo application. In our data all three of the actions bound to the physical buttons were activated more often in attempts to turn off the screen of the watch than on purpose. This shows that, as well as the physical buttons not being bound to the most useful actions, patterns of use and interactional expectations are transferred from current touch-screen (specifically smartphone) practice to this new device. The envisaged use of the watch does not include a ‘turn off’ action at the end of an interactional episode; instead this is inferred from the position of the hand or the device being covered by clothing.
This mismatch of designer expectations and user training might be expected to be ‘solved’ by retraining of the user of the new device; however, this was not seen even after a month of constant use of the smartwatch. One reason could be that the ‘wrong’ pattern of use was continuously reinforced by the participants’ daily use of their other devices – mainly smartphones and tablets – where it was, in fact, the ‘correct’ pattern of use.

4.3 In Conversation

In our third example, shown in Fig. 3, we see one of our participants on the couch with her partner with a blanket over them watching television in the evening. The blanket, as it happens, covered the second camera leaving us with one angle of this interaction.

Fig. 3.

The watch is raised at 1.6 s and the notification tray swiped down at 3.2 s. The notification is tapped at 4.0 s and scrolled down until 8.3 s. The user then scrolls back to the top of the notification (9.53 s) and proceeds to read the message to her partner until 18.2 s, gesturing to the message with a hovering finger. At 20.2 s she taps dismiss, and at 20.9 s presses the buttons to return the watch to its rest state.

While the television program is playing the wearer receives a notification on the watch of an incoming text message. This is ignored until the currently absorbing scene on the television is over and attention can be turned to the watch. By then the period in which the watch would automatically open the incoming message on activation had passed, so our participant opened the notification drawer and immediately tapped the new message to open it. After silently reading through the message once, which takes just over 4 s, she scrolls back to the top of the message and reads it out to her partner. The message, from her mother, talks of another live television program on at that time showing scenes of her hometown. This sparks a short conversation with her partner on the topic of that program, during which she taps the message to dismiss it and (as in the previous example) attempts to turn off the watch screen. The short conversation concludes when the sound level of the television program they are watching rises in the background, causing both to return their attention to the program. 28% (272 clips) of watch use in our corpus happened in the presence of others, with 15% happening while the wearer was in conversation with a co-present other.

This example shows two interesting things about the smartwatch’s place in the constellation of devices used during second screening. The first is that incoming notifications are not always immediately attended to, even though they are more personal (being on the body) than those arriving on a mobile phone. The participant is able to fit the reading and managing of incoming notifications around the rhythm and flow of the program. The second is that, as has been noted previously [7], mobile devices have a privileged place in conversation, providing topics as well as tickets to talk about those topics. The couple here are able to manage their conversation and use of technology around what is happening on screen, as opposed to simply being distracted from the on-going show.

5 Discussion

Short glances, of which clip one above is an example, make up 72% of all instances of watch use we recorded when the 50% of user-initiated glances to check the time or to check for notifications are included. The very short interactional glances in response to incoming notifications are one of the most interesting uses of the smartwatch, even though these uses do not involve any touch interaction with the watch or the incoming notification. They give users the ability to quickly sate the curiosity caused by a notification without turning to the mobile phone, potentially reducing the overall distraction from the incoming message, as this can be done in around 3 s.

This also gives the user the opportunity to effectively triage incoming messages: even when waiting for an important message, which might otherwise have caused them to turn to their mobile phone at every audio or vibration alert, they can continue the task at hand with very minimal interruption until the channel or sender matches what they are waiting for. This could help with the kinds of on-going task where the distraction of mobile phone notifications was longer and more disruptive [31] when the user had to disengage from the primary task for a system-generated or impersonal message, while still allowing them to attend to the messages from family and friends that were not felt to be as disruptive during the task.

While there is some differentiation between the haptic alerts from, for example, a news application and an incoming text message, this could be taken further to allow users to effectively pre-triage notifications on the wrist without interrupting the task at all. At the moment any notification that is ignored or postponed in this manner is put, unread, into the notifications drawer – necessitating the maintenance work described above. By allowing users to dismiss notifications unseen, based only on the haptic alert, through a gesture or a touch pattern, the number of glances to be attended to and the number of maintenance tasks the user must perform could be reduced significantly.
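One way such pre-triage could work is a per-source policy that maps notification origins to distinct haptic patterns and allows marked sources to be cleared unseen with a gesture. The policy table, gesture name, and function below are hypothetical sketches, not an existing watch API.

```python
# Hypothetical pre-triage policy: each notification source gets a distinct
# haptic pattern, and sources marked 'dismissible' can be cleared unseen,
# never landing in the notification drawer (and so never causing a red dot).
POLICY = {
    "news":     {"haptic": "single-tap", "dismissible": True},
    "messages": {"haptic": "double-tap", "dismissible": False},
}

DEFAULT_RULE = {"haptic": "single-tap", "dismissible": False}

def triage(source, gesture=None):
    """Return the action for an incoming notification.

    'deliver' -> show the glance animation as today;
    'discard' -> dismissed unseen on the haptic alert alone.
    """
    rule = POLICY.get(source, DEFAULT_RULE)
    if gesture == "flick" and rule["dismissible"]:
        return "discard"
    return "deliver"

# Impersonal sources can be flicked away unseen; personal ones still appear.
assert triage("news", gesture="flick") == "discard"
assert triage("messages", gesture="flick") == "deliver"
```

The design choice here is that a discarded notification leaves no unread residue, directly reducing the maintenance burden discussed in the next section.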

5.1 Maintenance

However, in light of Mehrotra et al.’s finding that distraction was felt to be greater depending on the timing within the on-going task [31], the inbuilt insistence of the notification method on the Apple Watch may make it more disruptive than it at first appears. When a notification is new, glancing at it triggers the animation and gives direct access to the notification without the need for direct manipulation of the watchface; this also results in the watch marking that notification as read. If the glance is put off in order to focus on the task at hand, the notification is placed in the notification drawer unread – causing the ‘red dot’ to appear front and centre on the watchface.

This red dot is the trigger for many of the maintenance interactions, and from the interviews it was seen to be conflated with negative ideas of having ‘missed something’ or ‘being behind.’ Therefore, the question arises whether users’ actions to avoid the red dot include performing glances at times when, had they been responding to notifications on the mobile phone, they would not.

The reasons for starting a maintenance task vary. One reported motivation was simply to ‘get rid of’ the red dot: given that its appearance means that one or more unread messages exist, it must be cleared to serve its purpose as an indicator of unread messages. This means it can happen, as in clip two, when the user notices that notifications are waiting, or it can be initiated to pass the time when the user is not otherwise engaged. We have seen this sort of down-time interaction with the notification drawer in queues in shops, while waiting for public transport, and between tasks in the kitchen.

Another motivation for this maintenance, beyond being able to use the red dot as an indicator of a newly missed notification, was to make the use of the notification drawer itself easier. Given the small size of the screen on the Apple Watch, only one notification is fully visible at a time in this view. This means that if the notifications in the drawer are allowed to build up, the user may be forced to scroll past many messages to find the unread one. While it is possible to ‘force touch’ – press hard on the screen – to dismiss all the messages in the notification tray, a user who wants to find the unread notification must manually locate it first (notifications seen with a glance still make their way to the drawer; they are simply marked as read and do not cause the red dot to be displayed).

At the moment there is a lack of third-party apps running on the watch, but interaction in this space is one where the user has time and attention to spare. This would make it a good candidate for interaction with an experience sampling application to elicit further information from participants.

5.2 The Watch in Conversation

The watch has a different place in conversation and co-located interaction than the smartphone. This is in part due to the physicality of the watch. The device is designed to be orientated to a single user wearing it on their wrist, so turning one’s wrist to allow another party to see the screen causes the watch to detect that it is no longer the focus of its wearer’s attention – and this causes the operating system to turn off the screen. As we have seen [7], the smartphone, and the information on its screen, is often orientated around in conversation; however, this is not readily possible with the smartwatch. As seen in the clip in Fig. 3, this does not have to hinder conversational gambits which use the incoming messages, by transforming them into reported speech or using them as topical resources. However, improving the gesture recognizer on the smartwatch to detect such an action, or providing a more direct method (such as holding a physical button, or temporarily locking the screen on in much the same way as one would lock the rotation on a smartphone), would make the watch fit better with current practice.

Conversely, the location of the smartwatch on the wrist, in the same place and mimicking the style and function of a traditional wristwatch, means that it comes with the baggage of long-standing social conventions in speech and interaction, which can be disrupted by the additional functions that technology has added. Where the mobile phone opened a new space in bodily interaction in social situations, the smartwatch must contend with the meanings people would read into the same bodily interactions conducted with a traditional wristwatch.

When the smartwatch is set to provide only haptic feedback, inaudible to others, the action of glancing to check an incoming notification can be read as the action of glancing to check the time – indicating impatience, boredom, or the intention to begin a closing sequence in the conversation. All but one of our participants reported turning off the audio alerts so that incoming notifications were signalled in this way.

Connecting the new affordances of the smartwatch with the practices of the wristwatch is somewhat more difficult. The social signal of performing a time-check is still available; however, most other interactions with the watch overlap with this signal. One possibility, suggested by a participant during their post-trial interview, would be to add a second screen on the inside of the wrist, making interactions with the smart functions of the watch obviously different.

6 Conclusion

In this paper we have gained a greater understanding of where the smartwatch sits in the technological landscape in which users interact with it. By focusing on glancing and maintenance behaviours, which account for the majority of use, we are able to compare the use of the smartwatch to that of other readily available devices in the home and on the body.

By looking at the physicality of the smartwatch, and its place not only on the body but in social expectations, norms, and constraints, we can see that when designing an augmented, or smart, version of something already woven into our lives, current practices must not only be understood but taken into account. Simply labelling change as ‘disruption’ is not enough. In the case of the smartwatch, users and those they interact with must learn and adapt to its new affordances and build social practice and norms around them. One barrier to adoption often overlooked is the cost early adopters pay in continually breaking social practice until such norms come into being. By carefully designing for this transition process, taking into account what is being disrupted, as well as where and for whom, a major barrier to the adoption of new technology could be significantly lowered.