1 Introduction

Digital tabletops have been shown to be highly suitable tools for collaborative environments [4, 6]. Their form factor improves workspace awareness [6], and their multi-touch capabilities allow simultaneous interaction. Together, these properties increase parallelism and democratize access, leading to better collaborative performance and results [6, 11]. Nevertheless, tabletops are rarely embedded in real settings, due to a number of disadvantages: their high cost; their limited workspace dimensions, which can only accommodate a certain number of participants; and a form factor that complicates mobility, so that, when a digital tabletop is available at all, it is fixed to a single location and users must move to that specific place to engage in a collaborative activity around it. Ideally, users would be able to form groups of any size, in an improvised way and in virtually any place, using the devices they carry with them to dynamically create a tabletop-like collaborative space. This would preserve the benefits of tabletops in terms of awareness and parallel interaction while supporting tabletop-style work through handheld devices. Devices such as smartphones and tablets are increasingly popular and will be in common use in the near future. Their portability makes it possible to build multi-display environments (MDEs) that support co-located collaborative activities in a table-like setting by coordinating interaction across several devices placed together on the same table. However, this scenario raises a new challenge: as Gutwin et al. [4] found, in highly integrated collaborative scenarios users often need to acquire elements that are out of reach.
Since interaction with handhelds is usually carried out by touch, when several users are gathered around the same table, interference problems may arise from simultaneous touch actions by different users on the same surface. The immediate solution is to ask others to hand over the elements, but this could interfere with the task another user is currently carrying out. An alternative is Around-Device Interaction (ADI) [8], in which a user could interact with an out-of-reach tablet that is simultaneously being manipulated via touch by another user, without interfering with the latter's actions. We explore the potential of fiducial frame markers as an enabler of ADI.

Our long-term goal is to design MDE collaborative environments around a table like the one shown in Fig. 1, where users bring their own devices and may use cards or tangible elements with attached fiducial markers to share objects (e.g., documents, game elements) or trigger reactive behaviors on a target surface on the table. The built-in camera of each device is used for marker detection, so no additional hardware infrastructure is required. In this kind of setting, several interactive surfaces can be placed on a table at different distances, rotations, etc. Thus, the entry point at which users approach each surface can change dramatically, and so can the conditions surrounding the interaction itself. Consequently, before designing any complex ADI for these settings, the fundamental issue to address is the acquisition of the marker [13] (i.e., the initial step in which the fiducial marker is placed in the camera's field of view so that it can be detected), and an evaluation of how usable this step is under different conditions. The main aim of this paper is therefore to present an ADI technique based on fiducial markers and to obtain the ground knowledge that will enable us to design a table-based MDE that uses it effectively. Specifically, we evaluate the usability of the acquisition phase of the proposed ADI through an in-lab study, focusing on the ergonomic conditions that are perceived as facilitators of this interaction.

Fig. 1. Hypothetical collaborative scenario with ADI using fiducial markers

2 Related Work

There are a number of ADI techniques with varying degrees of flexibility and hardware complexity. For example, Hasan et al. [5] present AD-Binning, an ADI technique that extends the small workspace of a smartphone screen to the space around the device. Users interact with the smartphone using a ring-shaped band on a finger, storing items in and retrieving them from the extended space. Both the device and the ring are tracked by a complex hardware setup of eight external cameras; while this may be precise, it is not available to the common user and requires prior assembly and calibration. Avrahami et al. [1] also explore ADI, using cameras mounted on both sides of a tablet to track distinct objects and capture interactions in the space near the device. While this approach allows mobility and the possibility of forming groups of users virtually anywhere in an improvised way, it still requires careful installation of external hardware. A simpler approach is that of Kratz et al. [8], who attach external IR sensors to a smartphone screen and enable hand-based ADI; their main purpose is to reduce the occlusion produced by touch contacts, which is a form of interference. Other works reduce hardware complexity even further by making use of the sensors integrated in the tablets. Ketabdar et al. [7], for instance, exploit the device's magnetic (compass) sensor to support interaction around it using magnets. Unlike optical approaches, this solution is more robust to occlusion, and the magnets can be shaped in different forms; however, the system cannot differentiate between magnets, since they carry no encoded ID, which prevents its use in applications with multiple co-located users.

Although the previous examples enable ADI by augmenting tablets with external sensors, they are designed for a single user and a single device. Probably for this reason, most of them [1, 7, 8] require the interactions to take place close to the device. In a table-based MDE, this restriction could force users to lean over the table or even move towards the target device, which could be cumbersome and cause interference with interaction on the other devices.

Finally, Nacenta et al. [10] compare several interaction techniques, including ADI, in tabletop groupware in terms of how they enhance coordination, identifying lack of interference, ease of transfer of elements, and easy access to out-of-reach zones as important factors. Our work complements theirs by considering another form of interaction, based on fiducial markers, and by studying ergonomic, visual feedback, and marker size factors.

3 User Evaluation

In its simplest form, the proposed interaction consists of taking a card with the marker and bringing it close to a selected tablet for recognition. To study its usability, we identified seven ergonomic factors that may have an effect on this initial interaction phase: the user posture (sitting vs. standing), the active hand (non-dominant vs. dominant), the marker size (small vs. big, as shown in Fig. 2), the tablet position with respect to the user (at their non-dominant vs. dominant side), the tablet distance to the user (near vs. far, depending on whether it is within arm's reach or not), the presence of visual feedback on the tablet that lets users see what the camera sees (absent vs. present), and the user's gender (male vs. female). In presenting the results (see Fig. 3), the first condition listed is represented as level "-" and the second as level "+"; e.g., sitting is the "-" level and standing the "+" level for the user posture factor.

Fig. 2. Cards with the fiducial frame markers used in the experiment

3.1 Apparatus

The experiment was conducted on a 120 × 80 cm table, 75 cm high, in an environment with an ambient light intensity of ~150-200 lx and no direct light source above the table. The table's surface was visually divided along both the horizontal and vertical axes, resulting in four identical rectangular sectors to account for the near/far tablet distances and non-dominant/dominant tablet positions. When required, the users sat on a chair 47 cm high. The marker detection application ran on a Samsung Galaxy Note 10.1 tablet using the computer vision algorithms provided by Vuforia™ for Android devices. Visual feedback about the position of the marker with respect to the camera was given by video on the display; this video region was 1/9 of the screen size and was located in the lower-left corner. The big and small markers measured 50 × 50 mm and 17 × 17 mm respectively, and both were printed near the top of a 63 × 91 mm cardboard card (see Fig. 2).

3.2 Participants

Thirty-two volunteers, 16 male and 16 female, participated in this study. Ages ranged from 24 to 48 (M = 32.28, SD = 6.3). The average height of males was 180.88 cm (SD = 7.27), and females’ was 167.38 cm (SD = 7.14). Four of them were left-handed and the rest, right-handed.

3.3 Design

Since the number of factors considered is high and it would be impractical to have each user perform every combination of their levels, the experiment follows a mixed fractional factorial design \( 2_{IV}^{7 - 3} \), which requires only 16 different treatments; and, since gender is a between-subjects factor, each user only had to perform 8 treatments. A total of 1536 interactions were performed across factors and treatments. To avoid order and carryover effects across the 8 treatments performed by a given subject, the presentation order of the treatments followed an 8 × 8 balanced Latin square design for each gender.
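The counts behind this design can be verified with a short sketch. The 16 runs of a \( 2^{7-3} \) fraction come from a full factorial in 4 base factors, with the remaining 3 factors aliased via design generators; the generators are not specified here, so the code below only checks the run and interaction counts, not the aliasing structure.

```python
from itertools import product

# 16 runs of the 2^(7-3) fraction = full factorial in 4 base factors.
# The other 3 factors would be set by design generators (omitted here).
base_runs = list(product([-1, +1], repeat=4))

subjects = 32                                  # 16 male + 16 female
treatments_per_subject = len(base_runs) // 2   # gender is between-subjects
repetitions = 6                                # each treatment repeated six times
total_interactions = subjects * treatments_per_subject * repetitions
print(len(base_runs), treatments_per_subject, total_interactions)  # 16 8 1536
```

The printed total matches the 1536 interactions reported above.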

3.4 Procedure

Firstly, the users were given some time to train with the system in order to minimize subsequent learning effects. During this training they practiced the experimental task, which consisted of taking the card with the marker and bringing it close to the tablet's camera for it to be recognized. Once they felt familiar with the interaction, the experiment proper began, in which one user at a time had to repeat the previous interaction following the instructions given by the different factor treatments (i.e., being seated or standing, holding the marker with their dominant or non-dominant hand, having the tablet near or far, etc.). Each treatment was repeated six times, and subjects were encouraged to perform the interaction as quickly as possible. For each treatment, the average elapsed time until marker detection was measured, and the users filled in a NASA-RTLX questionnaire [3] to assess subjective task load. Once all treatments were complete, a System Usability Scale (SUS) questionnaire [2] was administered to evaluate the usability of the technique. Users were also asked about their experiences in a short post-task interview.
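For reference, the NASA-RTLX (Raw TLX) score [3] is the unweighted mean of the six TLX subscale ratings, dropping the pairwise-comparison weighting of the full NASA-TLX. A minimal sketch:

```python
def rtlx(mental, physical, temporal, performance, effort, frustration):
    """Raw TLX: unweighted mean of the six subscale ratings (0-100 each)."""
    return (mental + physical + temporal + performance
            + effort + frustration) / 6.0

# Example: subscale ratings in the 20-30 band, as in Sect. 3.5
print(rtlx(20, 25, 30, 20, 25, 30))  # 25.0
```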

3.5 Results

Regarding quantitative differences, Fig. 3 depicts the mean completion times for each factor. A repeated measures ANOVA (α = 0.05) revealed that only two factors had a significant effect on the response variable: the marker size (F(1, 243) = 19.514, p < 0.001) and the visual feedback (F(1, 243) = 11.555, p = 0.001). No two-way or three-way interactions were significant. The shorter times correspond to performing the interaction with the big marker and with visual feedback present, respectively.
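For illustration, the F statistic underlying such a comparison is the ratio of between-level to within-level mean squares. The sketch below computes a plain one-way F on hypothetical completion-time data; note this is a simplification, since the study's actual analysis was a repeated measures ANOVA, which additionally partitions out per-subject variance.

```python
def one_way_f(groups):
    """F statistic of a one-way ANOVA over a list of sample groups."""
    all_vals = [v for g in groups for v in g]
    n, k = len(all_vals), len(groups)
    grand = sum(all_vals) / n
    means = [sum(g) / len(g) for g in groups]
    # between-groups and within-groups sums of squares
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical detection times (s): big-marker vs. small-marker trials
f = one_way_f([[1.0, 1.1, 0.9], [2.0, 2.1, 1.9]])  # clearly separated -> large F
```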

NASA-RTLX workload scores were analyzed using Friedman's χ² tests for ranks (df = 1). Subjects reported relatively low levels of mental, physical, and temporal demand for all conditions, as well as low levels of effort and frustration; all of these were rated between approximately 20 and 30 on a scale where 0 means "very low" and 100 "very high". However, some aspects showed significant differences between factor levels. Participants reported a significantly lower degree of workload in general when using the big marker and when the tablet was near: these two conditions received significantly (p < 0.05) lower scores for mental/temporal demand, effort, and frustration. Subjects also reported less effort (p = 0.034) when the video feedback was shown. No significant differences were found between postures, but subjects perceived using their dominant hand as less physically (p = 0.009) and temporally (p = 0.009) demanding. As for differences between genders, women reported significantly (p < 0.05) lower levels of workload in general than men; specifically, they reported lower levels of mental demand, effort, and frustration, and showed more confidence regarding performance than men, whose scores were more neutral.

The perceived general usability of the ADI technique was studied via a SUS questionnaire. Mann-Whitney U tests revealed no significant differences between genders (U = 90.5, p = 0.156). The average SUS total score (calculated as explained in [2]) is 73.13 (SD = 11.46) for men and 75.63 (SD = 20.18) for women, which, according to Sauro's guidelines [12], is above average (68). In Sauro's letter-grade system (from A+ to F), the overall usability of our technique would receive a B.
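As a reminder of the scoring procedure in [2]: each of the ten SUS items is rated from 1 to 5; odd (positively worded) items contribute their rating minus 1, even (negatively worded) items contribute 5 minus their rating, and the sum of contributions is multiplied by 2.5 to yield a 0-100 score. A minimal sketch:

```python
def sus_score(responses):
    """SUS score from ten item ratings (1-5), items given in order 1..10."""
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    total = sum((r - 1) if i % 2 == 1 else (5 - r)
                for i, r in enumerate(responses, start=1))
    return total * 2.5

print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # 75.0
```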

Fig. 3. Mean completion times (in milliseconds) for each considered factor

3.6 Discussion

The results show that the proposed interaction technique is usable in general (obtaining a B grade from the SUS questionnaires). This was also reflected in some subjects' comments on the good performance of the system, with some even wondering whether it was a real working prototype.

Furthermore, the quantitative analysis found no statistically significant effects of user posture (sitting vs. standing), the side on which the device is located (dominant vs. non-dominant), or its distance from the user (near vs. far). This suggests that the technique is usable across a wide spectrum of ergonomic situations. Moreover, the low levels of mental, physical, and temporal demand reported for all conditions indicate that the analyzed interaction is intuitive and potentially applicable to more demanding scenarios involving subjects with cognitive or motor disabilities. However, the qualitative results have some implications for future applications of the interaction technique considered in this work. Several participants reported difficulties seeing the video feedback when they were seated and the tablet was not within arm's reach; this issue disappeared when they stood up. This suggests that activities designed for seated users should either keep the devices within arm's reach of all participants or avoid video feedback.

Subjects also reported that using their dominant hand was less physically and temporally demanding. However, hand choice had no significant impact on the time needed to perform the gesture, and some participants even commented that they preferred one hand or the other depending on which side the tablet was on. This observation provides initial clues about the feasibility of using both hands for bi-manual aerial interactions in future ADI systems.

Visual feedback and marker size had a greater impact on the interaction. The small marker was harder for the application to recognize, and users needed to bring it very close to the camera, which made them "lean too much" over the table when the tablet was distant. In a real application this could cause interference with others' actions and lead to considerably more frustration. The results therefore suggest that big markers are better suited to this sort of interaction. They can be attached to cards and, because the marker proper is a frame, they can be filled with application-specific content. Taking advantage of the markers' IDs, such cards could serve as information containers and as a means of transferring elements/documents within a co-located group, a possibility about which many subjects in our experiment showed great excitement. According to Nacenta et al. [10], an interaction like this one, using the card as an information container, has intrinsic benefits in terms of awareness: the system does not need to present any action points to the user (e.g., via a cursor), and the visibility of the card gestures lets any user easily identify which colleague performed a given action. On the other hand, having visual feedback in a small region of the screen was perceived as very useful for allowing users to adjust and correct their actions; similar benefits from visual feedback are reported by Nacenta et al. [10]. Nevertheless, some subjects pointed out that integrating this feature might not always be convenient, since it reduces the display's working area.

4 Limitations and Future Work

It is important to note that, as Nacenta et al. [10] remark, the choice of a given interaction technique affects coordination, preference, and performance. Hence, the results reported in this paper are only conclusive for this particular interaction. Our work also has several limitations. Firstly, the experiments were performed with a single tablet brand and model, and camera resolution could have an effect on recognition. Secondly, the study only covers the initial acquisition phase of the interaction. Analyzing the feasibility of this initial phase and obtaining first user impressions is, however, a necessary step before designing and considering more complex ADI-based scenarios, and the results obtained provide useful information for future designers of ADI-based environments using frame markers. As a next step, we plan to perform a full study of different types of ADI marker-based manipulations to obtain a complete set of feasible interactions with this technique. Another limitation is that the controlled interaction was performed in isolation by a single user; therefore, interference issues that could arise in a collaborative scenario were not evaluated. This is certainly an interesting area of future research to evaluate the full potential of the proposed technology to support MDE collaboration.

Despite these limitations, the relatively good usability results encourage us to delve further into the ADI proposed in this paper and to conduct further experiments testing its suitability in actual collaborative environments. Regarding future uses, we consider this interaction promising in meeting environments, where users could easily exchange documents attached to marker cards. Another potential application domain is gaming, where, for instance, the tablets could be used in conjunction with a physical game board offering digital augmentation, while the cards could encode abilities or objects to be transferred to other participants. Physical games could also be implemented in which participants exercise by moving around a large table with their cards, obtaining items from and depositing items on the tablets scattered across it. This ADI could also be interesting for rehabilitation tasks for people with acquired brain injuries; in this case, markers could be attached to tangible elements that represent real-life objects, and patients would have to reach the tablet (among several others) that shows digital content related to the element they are holding.