1 Introduction

In the digital age, people are continually exposed to visual information across many demanding tasks. Because human mental resources and attention spans are limited and need to be marshaled efficiently [1, 2], an input interface should help users reduce the visual attention it demands. However, a flat and featureless touchscreen offers few cues for efficient and safe interaction and requires users’ visual attention to locate and interact with the correct spatial position. Even for short operations, touch input still requires users to look at the screen and diverts attention from any parallel or primary task. The nature of touchscreens is at odds with the way in which humans intuitively navigate the physical world. By simplifying and optimizing menu layout patterns and understanding how users locate and memorize active touchpoints [3], there is an opportunity to create touchscreen interfaces with viable eyes-free interaction. Spatial memory and proprioception are innate abilities: the former recalls a spatial layout, and the latter provides the sense of limb position needed to guide the hand and fingers to a target intuitively in the absence of vision. This study investigates the body-spatial proprioceptive awareness involved in touchscreen activities. In addition to the spatial information about interface layouts summarized in a cognitive map (memory), spatial feedback from the internal senses (skin and subcutaneous tissues) and the external environment (frame of reference) is combined to take precise action on a touchscreen [4,5,6,7,8].

The absence of physical buttons (tactile cues) on a touchscreen means that eyes-free interaction is not easily achieved. Some researchers have applied reactive audio feedback to support touch input for eyes-free menu selection [9, 10]. Nevertheless, this approach can fail when there is background noise. Interaction with a non-spatial interface, such as pre-defined shortcut gestures, might lessen the need for visual attention [11]. However, these abstract gestures are hidden controls that require significant effort to learn and remember. Imprecise recognition of abstract shapes drawn on a touchscreen results in no response and frustrates users [12]. Recent studies are insufficient to establish mobile interface design guidelines for eyes-free interaction [13]. Many researchers and developers devote their attention to designing effective mobile interfaces [14,15,16]. Key size and location have been considered of major interest in ergonomic design.

However, there are relatively few studies examining the design of a touchscreen configuration by leveraging spatial memory and proprioception for one-handed thumb input. Therefore, we investigate human cognition and task performance under the development of touchscreen interface configurations to increase the accessibility of mobile interfaces for eyes-free interaction.

In this paper, we explore interface configurations to give answers to the following questions: (1) How do they affect performance accuracy in eyes-free mobile thumb interaction? (2) What characteristics of the interface layout can facilitate eyes-free use? Overall, the study investigated interface design for dexterous eyes-free interaction on a mobile device without augmented feedback.

2 Related works

2.1 Human hand and control

This section provides an overview of human hand function and behavior as it relates to interaction on a mobile device. Hand manipulation and control can take place with or without visual feedback. The latter case exploits spatial memory and proprioception, described in the next section. The thumb, especially in one-handed use, has a limited operational range because the same hand is holding the device. Bergstrom-Lehtovirta and Oulasvirta [17] proposed that the functional area of the thumb on a touchscreen in a one-handed posture can be predicted from parameters that describe the device dimensions, grip, and hand size. This area is bounded by a parabolic curve whose range depends on the distance of the index fingertip from the device edge. Users need to orient and adapt their finger gestures to the touchscreen interface while gripping the device. Choi et al. [18] reported that the user’s grip position and thumb knuckle are usually at the lower part of the device. Various gestures arise from anatomical motions produced at the joints [19], for example, flexion-adduction or folding (Fig. 1, top right) and extension-adduction or swiping radially (Fig. 1, bottom right). The degrees of freedom (DoF) in the hand joints influence the dexterity of finger movement and control. The thumb has 5 DoFs [20]: 1 DoF (flexion-extension) at the distal interphalangeal joint (DIP), 2 DoFs at the metacarpophalangeal joint (MCP), and 2 DoFs at the radiocarpal joint (RC). Regarding finger posture and movement, the flexor muscles of the hand and fingers are stronger than the extensor muscles [21]. Li and Goitz [22] showed that the maximum force generated in thumb movement is greatest in flexion, followed by abduction, adduction, and extension.

Flat pitch angles in the comfort zone and steep (higher) pitch angles in the non-comfort zones are the common finger postures on the touchscreen area [23]. In addition, interaction approaches differ across the population depending on hand and device size as well as the kind of task. Lee et al. [24] claimed that the diagonal direction between the upper right and the lower left side of the screen provides comfortable gestures for a right-handed person. As the MCP joint is anchored at the lower right corner, the thumb often points toward the upper left corner. Huang et al. [25] reported that tapping in eyes-free interaction is more comfortable than drawing strokes and that touch accuracy decreases as the number of buttons on the layout increases. Moreover, they claimed that inward thumb movement requires more physical effort than outward movement. In this paper, we explore interface configurations that make it easy to perform an action with one’s (right) thumb and minimize changes to a stable hand grip under eyes-free interaction.

Fig. 1 Hand anatomy and finger postures under single-handed thumb interaction

2.2 Human senses for eyes-free interaction on target selection tasks

Humans perceive information from internal and external stimuli through their senses. The information is then processed, coded, and possibly stored in individual memory [26]. Memory ability is intertwined with attention. Directing visual attention to a spatial stimulus can lead to more rapid and accurate discrimination of the information contained in that stimulus. Interactions with a physical device through cues such as edges and corners provide additional useful information. In terms of feedback timing, many interactions occur under an open-loop feedback system without visual perception [8]. Because the movement control center in the central nervous system and the effectors (fingers) communicate effectively, humans can trust their actions through sensory feedback and can therefore control their limbs very skillfully [5]. Overall, manipulation in eyes-free mode results from the coordination of spatial memory and proprioceptive sense under feedforward control [8]. Spatial memory and muscle memory developed through practice and experience can promote eyes-free interaction [27].

2.2.1 Spatial memory

Spatial memory is a human cognitive ability associated with structuring and remembering the configuration or geometric properties of objects such as size, shape, distance, and coordinate location in a cognitive map [28]. People memorize the representation of object positions that relates to a landmark. To specify the position of a target, a person needs an anchor point or frame of reference. There are two types of reference frames: egocentric and allocentric. Egocentric frames of reference specify the location and orientation of a target near a person or in the peripersonal space. Allocentric frames of reference use environment elements and features to specify the location and orientation (inter-object relation) [29].

Structured patterns provide a visual cue that facilitates recognition because a group created from nearby objects tends to be processed together. Visuospatial working memory seems to be recalled better for the representation of four objects [30]. The detection of symmetry is one human instinct. The stimuli that are symmetrical along the vertical axis provide a better recall than those in horizontal and diagonal symmetry [31]. Moreover, the mode of presentation is important to spatial working memory. Simultaneous presentation is more advantageous to human recall and recognition than serial presentation [30].

Gustafson et al. [32] found that spatial knowledge could be gained during regular use of an interface layout and transferred to an imaginary interface. In other words, frequent use leads to spatial learning, which develops into spatial memory. Jetter et al. [33] found that spatial memory performance is better with touch input than with mouse input. They suggested that direct touch input facilitates the encoding of object locations in the users’ mental representation. This relates to proprioceptive cues and muscle feedback.

2.2.2 Proprioception

Proprioception is information of the body position sense and degree of muscle stretch from static and dynamic limbs, signaling internally from the muscles, tendons, and joints [34]. A person’s ability to know the location and orientation of the body parts is due to interactions of sensations in the body under the vestibular and kinesthetic systems [4]. To exemplify, the neural process integrating the sensory signal in the joint and muscle of the limb as well as the cutaneous/subcutaneous system sends information to the central nervous system; the motor neurons send neural impulses from the central nervous system to skeletal muscle fibers to control the movement. Thus, the central nervous system (CNS) acts as the “command center” of human behavior [5]. The coordination of these systems ascending to and descending from the brain provides basic movements and postural control. The sensory system offers the body’s spatial position information consistently to guide and control motor actions effectively.

The thumb and fingers have a threshold for two-point discrimination of about 5 mm [35]. This is better than other parts of the body. Acuity to discriminate spatial detail through the small receptor field starts at 0.5 mm and at 7 mm in the large receptor field [35]. Spatial acuity has been defined as the ability to judge a target’s position at its relative distance. Lin et al. [36] studied proprioception in point division and tapping tasks on the forearm. They revealed that the anchor points on the elbow and wrist offer good proprioceptive accuracy and the accuracy deteriorates with distance from the referred anchor point. Moreover, the accuracy rates differ among interaction techniques. The sliding-through method provides better accuracy rates compared with direct tapping.

The sense of space also depends on remembered information [37]. van Beers et al. [38] proposed that precision depends on the direction and varies among proprioception and vision. In brief, proprioception precision increases in the depth direction while the spatial memory ability increases in the horizontal direction related to an eye angle.

2.3 Touchscreen interaction and user interface design

Well-designed interfaces can effectively support human perception and interaction [39]. The visual and physical features of an interface involve its form, number, and spatial configuration [40]. Few studies have been devoted to configuration design for eyes-free interaction [41]. The interface configuration or layout involves the targets’ sizes, positions, and their relations. Fitts’s law relates the distance and size of a target to the movement time [42]: small targets and long distances increase the difficulty of a task and the movement time. Human perception of the interface configuration drives the underlying motor processes. Similarly, interface configuration contributes to performance accuracy. In addition to target position and size, the input condition, for instance, one hand or two hands and with or without vision, influences task performance. Perry and Hourcade [43] suggested that using the preferred hand provides better response time and accuracy than the non-preferred hand. Gilliot et al. [44] studied an indirect pointing task using the index finger on a touchpad and found that task performance in the absence of vision deteriorates up to 20% with at least a 3-mm targeting error. They also suggested that the aspect ratio is another important factor in task performance; the visual display and input surface should consequently have a similar aspect ratio.
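To make the relation between target size, distance, and difficulty concrete, the following minimal sketch computes the index of difficulty in Fitts's original formulation, ID = log2(2D/W), and a predicted movement time MT = a + b·ID. The function names and the coefficients `a` and `b` are placeholders chosen for illustration only; they are not taken from this study and would normally be fitted to measured data.

```python
import math


def index_of_difficulty(distance: float, width: float) -> float:
    """Fitts's original index of difficulty in bits: log2(2D / W)."""
    if distance <= 0 or width <= 0:
        raise ValueError("distance and width must be positive")
    return math.log2(2.0 * distance / width)


def movement_time(distance: float, width: float,
                  a: float = 0.1, b: float = 0.15) -> float:
    """Predicted movement time MT = a + b * ID.

    a and b are illustrative placeholder coefficients (assumption),
    not values reported in the paper.
    """
    return a + b * index_of_difficulty(distance, width)


# A near, large target versus a far, small target (same length unit for D and W).
easy_id = index_of_difficulty(distance=20.0, width=10.0)  # log2(4)
hard_id = index_of_difficulty(distance=80.0, width=5.0)   # log2(32)
print(easy_id, hard_id)
print(movement_time(20.0, 10.0), movement_time(80.0, 5.0))
```

With any positive slope `b`, the far, small target yields the larger predicted movement time, which is the qualitative effect described above.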

Wang and Ren [45] explained that an orientation vector consists of a direction and an angle from a point of reference. Finger orientation is a cue to further enrich the interaction on touch surfaces. They, therefore, proposed the exploitation of finger orientation (yaw angles) on a tabletop interface and presented a sector menu. In addition to the pointing task, eyes-free interaction involves gestures, including drawing marks. Rouduat et al. [41] claimed that gestures provide more accurate and quicker responses. A stroke gesture to the target provides better performance than a discrete touch. Moreover, they found that symmetry of the menu supports interface learning and effective finger interaction. Indeed, touchscreen interaction could be a spatial tapping or a gesture input. Interface configuration should facilitate user perception, cognition, and responding. For an imaginary eyes-free interface, users develop a mental representation from the interface which has been memorized and interact on a touchscreen without visual attention. A good interface layout design could support effective finger interaction.

3 Materials and methods

3.1 Interface prototypes

The relevant concepts from the related works, drawn from a range of contexts, were reframed to examine the role of interface configurations under eyes-free thumb interaction. Eyes-free interaction in this study relies solely on spatial memory and proprioception. Therefore, the interface layouts had to be memorized before testing on a touchscreen. We developed experimental prototypes with four alignments of the interface pattern and two levels of button proximity, shown indirectly to participants on a remote display. These interfaces consist of horizontal (H), vertical right (VR), diagonal (D), and curved (C) layouts, each in an equally distributed pattern and a divided pattern (Div). The divided layout consists of a middle reference point and a pair of buttons on each side of this point. We also added 2 unstructured layouts (Un-1 and Un-2), in which the buttons are arranged randomly, for comparison with the structured patterns. Finally, we designed line drawing tests consisting of the V-Line, H-Line, D-Line, and C-Line layouts. There are 4 buttons in a touch layout and 3 or 4 lines in a drawing layout. The buttons were the same size, but their spacing differed according to the layout pattern. The button and line layouts, with a 16:9 aspect ratio, the most common one on the smartphone market [44], are shown in Figs. 2 and 3, respectively. The dimensions of all interface prototypes are available at https://github.com/munyaporn/Online-Resource.git.

Fig. 2 Interface prototypes for tapping tasks

Fig. 3 Interface prototypes for drawing tasks

In the absence of vision, touch positions are cued by the thumb and tactile perception of physical objects. During the test, participants cannot monitor their actions visually. Performance accuracy of interface prototypes caused by spatial memory and proprioception has been measured under the control condition. We analyze outcomes with the mean distance error. The higher the mean distance error, the poorer the task performance.

3.2 Design of experiment

The exploratory study consists of two experiments that examine the role of interface configuration in spatial memory and proprioception. Experiment 1 adopted a serial presentation of four-framed layouts to test which layout or target position allowed spatial positions to be stored and retrieved better, whereas Experiment 2 applied a single layout presentation to test the performance accuracy of each layout directly. In addition, the unstructured layouts were compared between the two experiments to test which presentation mode allowed spatial positions to be stored and retrieved better. Each experiment had 15 trials and took around an hour from start to finish, including short feedback questions about interface preference. To let muscle memory fade and relieve mental fatigue, Experiment 2 was conducted separately, usually around a day later, depending on the availability of the participants.

A repeated-measures within-subjects design was used. The independent variables were interface configurations and modes of presentation. The measurements were made on a single identifiable population (dependent samples). Therefore, all the participants took part in both of the experiments. Those who were touchscreen mobile phone enthusiasts were invited via university channels and social media.

3.2.1 Apparatus and protocol

The experiments were conducted online via the participants’ mobile phones together with a Zoom meeting on a desktop display. Participants were asked to complete the tapping and drawing tasks in portrait mode, using the thumb of their dominant hand to interact with a bespoke screen-recording app on their mobile screen (see Fig. 4). The main page on the mobile showed the list of steps in the experiments. After participants entered a step, the screen changed to a canvas view (Fig. 4b) with a sound stimulus calling the target names every 5 s in a random sequence. Each number in a sequence was repeated 3 times, each time preceded by a different number. The experimenter asked participants to listen to the audio stimulus and interact with the corresponding position without looking at the touchscreen. They could look anywhere except at their mobile. The screen-recording app was a responsive web application with an HTML canvas front end that detected touch and drawing actions and JavaScript that sent the action records to the backend. The recorded data were stored in a database for further analysis. Moreover, another web page was designed for the experimenter to control the tests (Fig. 4c). This page could load and visualize the recorded data in real time. In the screen-recording app, the timestamp and the position where the participant touched the screen were logged as a time series of (x, y) coordinates. The width and height of the participant’s mobile screen were also recorded.
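The constraint on the audio stimulus (each target called 3 times, each time after a different preceding target) can be sketched as a simple rejection-sampling procedure. The paper does not describe how the sequences were actually generated, so the function names, the seed, and the shuffle-and-check strategy below are assumptions made for illustration; the sketch also assumes that the first call, which has no predecessor, counts as distinct.

```python
import random


def has_distinct_predecessors(sequence):
    """True if no (previous target, target) pair occurs twice in the sequence."""
    seen = set()
    previous = None  # the first call has no predecessor
    for target in sequence:
        pair = (previous, target)
        if pair in seen:
            return False
        seen.add(pair)
        previous = target
    return True


def make_stimulus_sequence(targets, repetitions=3, seed=None, max_attempts=100_000):
    """Shuffle repeatedly until every repeat of a target follows a different target."""
    rng = random.Random(seed)
    pool = [t for t in targets for _ in range(repetitions)]
    for _ in range(max_attempts):
        rng.shuffle(pool)
        if has_distinct_predecessors(pool):
            return list(pool)
    raise RuntimeError("no valid sequence found within max_attempts")


# Four targets, three repetitions each: 12 audio calls per test.
sequence = make_stimulus_sequence([1, 2, 3, 4], repetitions=3, seed=7)
print(sequence)
```

Rejection sampling is adequate here because the sequences are short; a constructive method would only be needed for much longer sequences.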

Fig. 4 Experimental apparatus

The protocol started with the experimenter introducing the study details to the participants via presentation slides in the Zoom meeting, followed by a short interview about their handedness and experience with mobile phones. After that, the audio stimulus on the mobile and the synchronization of the data on the controller web page were checked. The participants were required to perform a short practice before entering the actual test session, and the graphic interface was controlled via a remote slide presentation. During the learning and practice phase, a picture of the layouts was presented (Fig. 4d) and remained visible to the participants for about 50 s before disappearing from the screen. After that, the experimenter switched the desktop screen to the controller web page while the participants tapped the menu of the screen-recording app to start the audio commands for the test. During each test, participants were required to interact with the touchscreen in an eyes-free manner. The experimenter observed the participant’s behavior via the video camera in the Zoom meeting to enforce eyes-free interaction and simultaneously monitored the recorded data through the controller web page. If participants did not follow the instructions or the recorded data contained errors, the trial was restarted.

3.2.2 Data processing and analysis

The displacement error is the difference between the touch position \((x, y)\) and the target center position \((x_t, y_t)\). It is calculated as the Euclidean distance (the shortest distance to the target position) using the Pythagorean theorem, \(d = \sqrt{(x - x_t)^2 + (y - y_t)^2}\). This unsigned value is used to compare the level of distance error (d). Outliers, defined as trial positions more than three standard deviations away from the centroid, were removed from the data analysis.
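The distance error and the outlier rule can be sketched as follows. The text does not say exactly which standard deviation is used for the three-SD rule, so this sketch assumes the root-mean-square distance of the touches from their centroid; the function names and the sample touches are hypothetical.

```python
import math


def distance_error(touch, target):
    """Euclidean distance between a touch (x, y) and the target center."""
    return math.hypot(touch[0] - target[0], touch[1] - target[1])


def remove_outliers(points, k=3.0):
    """Drop points farther than k spreads from the centroid.

    Assumption: the 'spread' is the root-mean-square distance of all
    points from their centroid; the paper does not specify this detail.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    dists = [math.hypot(x - cx, y - cy) for x, y in points]
    spread = math.sqrt(sum(d * d for d in dists) / n)
    return [p for p, d in zip(points, dists) if d <= k * spread]


def mean_distance_error(points, target):
    """Mean of the distance errors of all retained touches for one target."""
    return sum(distance_error(p, target) for p in points) / len(points)


# Hypothetical touches around a target at (50, 50), plus one stray touch.
touches = [(49.0, 50.0)] * 10 + [(51.0, 50.0)] * 10 + [(50.0, 90.0)]
kept = remove_outliers(touches)
print(len(touches), len(kept), mean_distance_error(kept, (50.0, 50.0)))
```

In this made-up sample the stray touch at (50, 90) is discarded and the 20 clustered touches are retained, each with a distance error of 1 unit.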

Because the responsive web design uses a proportion-based grid, the same button lies at a different distance (mm) from the screen edge on mobiles of different sizes. Thus, all the touch coordinates recorded on each participant’s mobile screen were transformed into a common coordinate system to make the data comparable among participants. The mean distance error is expressed in relative units with respect to a square interface area of 90 units in length (the interface occupies the lower screen area only). Because the interface or interaction area does not occupy the whole screen height, the aspect ratio of the screen does not affect the outcome data.
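One way such a transformation could look is sketched below. The exact mapping is not given in the text, so the sketch assumes a top-left pixel origin (the usual HTML canvas convention) and a square interface area as wide as the screen and anchored at its bottom edge; `to_relative_units` and the sample screen sizes are hypothetical.

```python
SIDE_UNITS = 90.0  # side length of the square interface area, from the text


def to_relative_units(x_px, y_px, screen_w_px, screen_h_px):
    """Map a touch in screen pixels to the shared 90-unit square.

    Assumptions (not stated in the paper): pixel origin at the top-left
    corner, and a square interface as wide as the screen that sits at
    the bottom of the screen.
    """
    scale = SIDE_UNITS / screen_w_px
    top_of_square_px = screen_h_px - screen_w_px
    return (x_px * scale, (y_px - top_of_square_px) * scale)


# The center of the interface square on two hypothetical screens of
# different size and aspect ratio maps to the same relative position.
small = to_relative_units(180, 460, screen_w_px=360, screen_h_px=640)
large = to_relative_units(207, 689, screen_w_px=414, screen_h_px=896)
print(small, large)
```

Because only the screen width sets the scale and the square is measured from the bottom edge, the screen height (and hence the aspect ratio) drops out of the result, which matches the argument above.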

As a line is recorded as a series of points, a single summary value of each line was used to evaluate the response outcome of the line drawing layouts. For the H-Line layout, the outcome is the average of the y-coordinates of a horizontal line. For the V-Line layout, it is the average of the x-coordinates of a vertical line. For the D-Line layout, the angle of the line was measured. The angle has a range of \(90^{\circ}\) around the bottom right corner, so the angular error can be compared with the mean distance error, which is measured over an interface size of around 90 units. Lastly, for the C-Line layout, the outcome is the y-coordinate at which a curved line intersects the vertical line x = 70. From these values, the mean error of the line drawing can be calculated.
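The four line summaries can be sketched as below. Several details are assumptions made for illustration and are not specified in the text: coordinates are taken in the 90-unit square with the origin at the top left and y increasing downward, the D-Line angle is measured at the bottom right corner toward the mean point of the stroke, and the C-Line crossing at x = 70 is found by linear interpolation between consecutive samples. All function names are hypothetical.

```python
import math

SIDE = 90.0  # interface side length in relative units


def h_line_outcome(points):
    """Mean y-coordinate of a drawn horizontal line."""
    return sum(y for _, y in points) / len(points)


def v_line_outcome(points):
    """Mean x-coordinate of a drawn vertical line."""
    return sum(x for x, _ in points) / len(points)


def d_line_outcome(points, corner=(SIDE, SIDE)):
    """Angle in degrees (0 to 90) at the bottom right corner.

    Assumption: the angle is taken from the corner to the mean point of
    the stroke, with 0 degrees along the bottom edge.
    """
    mx = sum(x for x, _ in points) / len(points)
    my = sum(y for _, y in points) / len(points)
    return math.degrees(math.atan2(corner[1] - my, corner[0] - mx))


def c_line_outcome(points, x_cross=70.0):
    """y-coordinate where the stroke first crosses x = x_cross, else None."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if min(x0, x1) <= x_cross <= max(x0, x1):
            if x1 == x0:
                return y0
            t = (x_cross - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    return None


print(h_line_outcome([(0.0, 10.0), (90.0, 20.0)]))
print(v_line_outcome([(30.0, 0.0), (34.0, 90.0)]))
print(d_line_outcome([(30.0, 30.0), (60.0, 60.0)]))
print(c_line_outcome([(60.0, 30.0), (80.0, 50.0)]))
```

The error of a drawn line is then the absolute difference between its outcome value and the corresponding value of the target line.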

Descriptive statistics such as the mean and standard deviation were used to examine the central tendency and variability of the measured variables. For normally distributed data, a paired t-test and a one-factor repeated-measures ANOVA in Minitab 19.0 were used for hypothesis testing. The inferential statistics were computed at the 95% confidence level.
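For readers who want to see what the paired comparison computes, the sketch below derives the paired t statistic and its degrees of freedom from per-participant differences. It is not the Minitab procedure used in the study, it stops short of a p-value, and the function name and sample values are invented for illustration.

```python
import math
import statistics


def paired_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for paired samples."""
    if len(sample_a) != len(sample_b) or len(sample_a) < 2:
        raise ValueError("need two samples of equal length, at least 2 pairs")
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd_diff / math.sqrt(n))
    return t, n - 1


# Made-up mean distance errors (units) of five participants on two layouts.
errors_layout_a = [18.0, 21.0, 16.5, 20.0, 19.5]
errors_layout_b = [12.0, 14.5, 13.0, 15.0, 12.5]
t_stat, dof = paired_t(errors_layout_a, errors_layout_b)
print(t_stat, dof)
```

The statistic is compared with the critical value of the t distribution for the given degrees of freedom at the chosen significance level (0.05 for a 95% confidence level).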

3.3 Participants

There were 22 right-handed participants (12 female, 10 male) who volunteered for the study. They were between the ages of 26 and 42 (mean = 34, SD = 5.2) and spoke a language whose written form runs from left to right. The participants used different models of mobile phones with 10 different screen sizes from 4.7 to 6.7 inches. The screen width ranged from 67.0 to 78.1 mm, and the screen height from 138.1 to 165.4 mm. The average aspect ratio of the participants’ mobile screens was around 1:1.69, ranging from 1:1.43 to 1:1.90, which was comparable to the aspect ratio of the imaginary layout interface (1:1.78) in portrait mode. The average thumb length measured from the thumb tip to the knuckle (the MCP joint) was about 64.6 mm, ranging from 55 to 72.4 mm. The average ratio of thumb length to mobile width was around 0.88. All the participants were familiar with their touchscreen mobile and used to single-handed interaction; their experience with their current mobile ranged from 1 month to 5 years, around 1.5 years on average.

4 Experiment 1 (Serial presentation mode)

This experiment was designed to examine the effect of interface configurations in serial presentation mode on the performance accuracy in touchscreen eyes-free interaction. The independent variables for interface configurations were the pattern (normal and divided patterns), the structure (unstructured and structured layouts), the alignment (horizontal, vertical, diagonal, and curved layouts), and the button positions. The hypotheses were formulated and examined in a controlled experiment. The mean distance error is a dependent variable for hypothesis testing.

4.1 Task and procedure

Participants were presented with four separate frames (layouts) numbered from 1 to 4 (Fig. 5). There were 15 tests in 4 sets. Set 1 tested the unstructured patterns (the Un-1 and Un-2 layouts, as well as their mirrored versions, the Un-1 M and Un-2 M layouts). The button positions in the four frames ran from Button 1 to Button 4, respectively. In each test in Set 2 and Set 4, the four frames of the button layouts were presented sequentially in the horizontal, vertical, diagonal, and curved alignments. In each test, one button, the same in every layout, was marked to test spatial memory retention across the four layouts. Thus, there were four tests to cover all button positions (P1–P4). In these tests, participants’ attention was divided among the layouts, which competed for retention in spatial memory. It was supposed that certain spatial layouts could be a powerful trigger for recalling spatial memory and that the redundancy in structured patterns could help participants with encoding and retrieval. In Set 3, the line drawing layouts were tested. The layouts were presented sequentially in the diagonal, curved, vertical, and horizontal line patterns. This order was counterbalanced against the order of the button layouts to avoid sequence bias. As the diagonal and curved line layouts contain 3 lines, the V-Line and H-Line layouts were reduced to 3 lines for this experiment. Similarly, one line, the same in every layout, was marked in each test to test spatial memory retention across the four layouts. Thus, there were three tests to cover all line levels (L1–L3).

As each test consists of four separate frames with 3 repetitions, participants hear the sound stimulus from their mobile for a total of 12 targets in a random sequence. The total number of responses is 180 (15 \(\times\) 12) per participant.

Fig. 5 Samples of serial presentation mode

Participants must integrate all spatial elements from four layouts within mental imagery and interact with each position accurately on their mobile and must recognize that the audio stimulus in the task is calling the frame name. In drawing a line, participants need to recognize and perform both drawing patterns and spatial location. Thus, it was supposed that the error from spatial memory would emerge clearly. As we can separate mistakes of pattern from drawing action, the percentage of line pattern error was used in data analysis instead of the mean distance error.

4.2 Hypotheses

Four hypotheses were proposed for investigation:

H1

With the difference in spacing pattern and exploitation of the middle reference cue, the button accuracy on divided pattern layouts is expected to have a different task performance from the normal layouts.

H2

The salience and the memory retention of vertically symmetrical layouts should be superior to other layouts. Consequently, the horizontal layouts are expected to provide better task performance.

H3

The structured patterns provide redundancy, facilitating recognition. Thus, the task performance on the structured patterns is expected to be better than the unstructured patterns.

H4

Simultaneous presentation of targets is more advantageous to spatial memory than serial presentation. Therefore, the performances of the unstructured patterns in simultaneous presentation mode or a unified layout are expected to be better than those in sequential presentation mode.

The fourth hypothesis would be examined along with the results in Experiment 2.

4.3 Results and analysis

4.3.1 Outcomes and task performance analysis

A total of 3960 responses (22 participants \(\times\) 15 trials \(\times\) 12 targets) were collected, comprising 3168 tapping data points and 792 drawn lines. After removing 66 outliers whose positions were more than 3 standard deviations away from the centroid, 3102 data points remained for the performance analysis. These data are normally distributed. The experimental outcomes are shown in Fig. 6 for the line drawing task and in Fig. 7 for the tapping task.

Fig. 6 Line drawing outcomes on Set 3 from four layouts

Fig. 7 Outcomes on Set 1, Set 2, and Set 4 under button integration from four frames into a unified layout

Table 1 Pattern drawing error in percentage

The line drawing outcomes show many wrong patterns being drawn, for example, a horizontal line instead of a vertical line, a curved line instead of a diagonal line, and a horizontal line instead of a curved line. Table 1 shows the percentage of drawing errors, calculated as the number of wrong patterns divided by the total number of lines drawn. The C-Line layout had the lowest percentage of pattern errors (6.1%), while the V-Line layout had the highest (9.1%). Although the H-Line layout was the last pattern in the serial presentation, horizontal lines were still drawn more correctly than vertical lines, which were shown immediately before them. This implies that patterns in the horizontal direction, which relates to the eye angle, are good for memory retention.

Task performance of unstructured layouts in Set 1 is presented in Table 2. The descriptive statistics showed the mean distance error for each test. The overall mean distance error of unstructured layouts was 19.18 units. Task performance of the structured layouts in Set 2 (normal layouts) and Set 4 (divided pattern layouts) is presented in Table 3. Looking closer at the button accuracy for the structured layouts, Button 1 for each test is from a horizontal layout, Button 2 is from a vertical layout, Button 3 is from a diagonal layout, and Button 4 is from a curved layout, so we presented the layout name in each row instead of the button name. The overall mean distance error was 16.89 units for the normal layouts and 15.86 units for the divided pattern layouts.

Table 2 Mean distance error (units) of the unstructured layouts under serial presentation mode
Table 3 Mean distance error (units) of the structured layouts in Set 2 and Set 4

The repeated-measures ANOVA showed a significant main effect of test (position) on the mean distance error in both the normal layouts (\({F_{3,21}}\) = 4.17, p = 0.01) and the divided pattern layouts (\({F_{3,21}}\) = 4.79, p = 0.01). In the normal layouts, Position 4 provided the lowest mean distance error (14.41), followed by Positions 3, 2, and 1. In the divided pattern layouts, on the other hand, Position 2 provided the lowest mean distance error (13.70), followed by Positions 4, 3, and 1. Therefore, H1, that button accuracy on the divided pattern layouts differs from that on the normal layouts, is supported. The divided pattern offers better performance accuracy on the button near the middle position. In addition, button accuracy decreased with distance from the anchor point and reference frame.

The repeated-measures ANOVA showed a significant main effect of layout on the mean distance error in Set 2, both for each test (\({F_{3,21}}\) = 8.76, p = 0.00 for Test 1, \({F_{3,21}}\) = 4.04, p = 0.01 for Test 2, \({F_{3,21}}\) = 5.21, p = 0.00 for Test 3, and \({F_{3,21}}\) = 4.83, p = 0.00 for Test 4) and across all tests (\({F_{3,21}}\) = 12.50, p = 0.00). The overall mean distance error across positions for the H layout (11.38) is substantially lower than for the VR layout (16.90), the D layout (18.03), and the C layout (20.85). Similarly, the repeated-measures ANOVA showed a significant main effect of layout on the mean distance error in Set 4 for Test 1 (\({F_{3,21}}\) = 7.25, p = 0.00), Test 2 (\({F_{3,21}}\) = 3.95, p = 0.01), and Test 3 (\({F_{3,21}}\) = 6.00, p = 0.00), and across all tests (\({F_{3,21}}\) = 9.99, p = 0.00). The overall mean distance error across positions for the H-Div layout (11.35) is substantially lower than for the V-Div layout (17.75), the D-Div layout (17.50), and the C-Div layout (16.86). That is, the horizontal layouts provided the lowest mean distance error. Thus, H2, that the horizontal layouts provide better task performance, was confirmed. The outcomes of the line drawing layouts also support good memory retention of horizontal patterns.

The repeated measures ANOVA was then performed on the mean distance error of the unstructured layouts in Set 1 and the structured layouts in Set 2 and Set 4. The result showed a significant main effect on mean distance error (\({F_{2,21}}\) = 3.38, p = 0.04), which confirmed H3 that task performance on the structured layouts was better than on the unstructured layouts. That is, the mean distance error for the unstructured layouts in Set 1 was substantially higher than for the normal layouts and the divided pattern layouts.
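For readers unfamiliar with the test, the following is a minimal sketch of a textbook one-way repeated-measures ANOVA in Python. It is not the analysis software used in the study; the function name `rm_anova` and the demo values are hypothetical. Note that in this textbook formulation the error degrees of freedom are (k − 1)(n − 1), which need not match the subscript convention used in the reported statistics.

```python
def rm_anova(scores):
    """One-way repeated-measures ANOVA.

    scores: list of rows, one per participant, each holding one value per condition.
    Returns (F, df_conditions, df_error).
    """
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete participants x conditions table")

    grand = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    # Residual variation left after removing condition and participant effects.
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error


# Hypothetical distance errors: 3 participants x 3 conditions (not the study's data).
demo = [[1, 2, 3], [2, 4, 3], [3, 3, 6]]
print(rm_anova(demo))  # (3.0, 2, 4)
```

Removing the between-participant sum of squares before forming the error term is what distinguishes this design from an independent-groups ANOVA.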

4.3.2 Feedback about the interface preference

The interview on layout feedback shows that 17 of 22 participants (77.3%) prefer the normal structure layouts to the divided pattern layouts. The layouts were then ranked on the ease of the task from 1 to 4. The participants tended to give the horizontal layouts the highest score (3.41 and 3.00 for the button and line layouts, respectively). The second-ranked preference was the vertical layout (3.23 and 2.77), followed by the curved layout (1.73 and 2.18) and the diagonal layout (1.64 and 2.05).

5 Experiment 2 (Single layout presentation mode)

This study aims to examine the performance accuracy of different interface configurations in single layout presentation mode and to measure the spatial acuity levels between the buttons or lines for each layout (alignment), the interaction techniques (tapping and drawing), and the spacing patterns (normal and divided patterns). In this experiment, the left vertical layout (VL) where buttons are aligned at the same height as the vertical right layout was additionally proposed to compare the effect of the left and right sides. This layout was added to the previous 14 layouts in Figs. 2 and 3, resulting in a total of 15 test layouts.

5.1 Task and procedure

The interface configuration (layout) is one factor of interest. Participants were shown each layout, one at a time, on the desktop (see Fig. 4d). They were required to memorize the layout and proportionally map the button positions of the spatial interface to touch positions on their mobile. During the test, no visual interface was displayed. The app spoke the button/line names in a predefined sequence for each test, and the participants responded to this audio stimulus by tapping or drawing on the mobile screen as accurately as possible. There are 13 layouts consisting of 4 targets and 2 layouts consisting of 3 targets. With 3 repetitions, the total number of responses is 174 outcome targets per participant (13 \(\times\) 4 \(\times\) 3 = 156 plus 2 \(\times\) 3 \(\times\) 3 = 18).
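The per-participant response count follows directly from the layout inventory. The short Python sketch below reproduces that arithmetic; the names `LAYOUT_GROUPS` and `responses_per_participant` are illustrative only.

```python
# Layout inventory from the task description: (number of layouts, targets per layout).
LAYOUT_GROUPS = [(13, 4), (2, 3)]
REPETITIONS = 3


def responses_per_participant(groups, repetitions):
    """Total target responses one participant produces across all layouts."""
    return sum(n_layouts * n_targets * repetitions for n_layouts, n_targets in groups)


total = responses_per_participant(LAYOUT_GROUPS, REPETITIONS)
print(total)  # 156 + 18 = 174
```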

5.2 Hypotheses

The following hypotheses were identified and tested.

H5

Interface configurations impact eyes-free performance, thus it is expected to have a significant difference in performance accuracy among layouts.

H6

As the positions near the anchor point and reference frame offer good proprioceptive accuracy, Button 4 of the structured layouts is expected to provide better task performance than the other buttons, which are farther from the anchor point and reference frame.

H7

As the line drawing layout and the button (tapping) layout require different interaction techniques, those layouts which have matched spatial positions are expected to provide a significant difference in performance accuracy.

H8

The divided pattern provides a middle anchor position for useful clues. Thus, the divided pattern layouts are expected to provide a different performance accuracy from the normal layouts.

5.3 Results and analysis

5.3.1 Outcomes and task performance analysis

A total of 3,828 data sets were collected; 46 outliers and 1 trial error were removed, leaving 3,781 data sets for this analysis. The data are normally distributed. The percentage of outliers in Experiment 2 is lower than in Experiment 1. The experimental outcomes and the task performance for the line drawing task are shown in Fig. 8 and Table 4.
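As a consistency check, the size of the data set can be reconstructed from the design. The sketch below assumes 22 participants, a figure inferred from the preference feedback and from the t-test degrees of freedom rather than restated in this subsection; the variable names are illustrative.

```python
RESPONSES_PER_PARTICIPANT = 174  # from the task description
PARTICIPANTS = 22                # assumption: inferred, not restated in this subsection
OUTLIERS = 46
TRIAL_ERRORS = 1

collected = RESPONSES_PER_PARTICIPANT * PARTICIPANTS
retained = collected - OUTLIERS - TRIAL_ERRORS
outlier_share = 100 * OUTLIERS / collected  # percentage of collected responses

print(collected, retained, round(outlier_share, 1))  # 3828 3781 1.2
```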

Fig. 8 Line drawing outcomes on each layout

Much information was gained from the line drawing tests, such as the direction of movement, consistency, and line length. Most of the participants drew a diagonal line outward from the common point and drew upward for a vertical line layout, while they swiped the horizontal lines and curved lines from left to right. The characteristics of the lines varied noticeably, apparently depending on the drawing method. The outliers on the diagonal layout were found on the lines drawn from outside toward the common point, while the outliers on the vertical layout were found on the lines drawn in a downward direction. The steady outcomes occurred in the lines drawn outward from the base. This suggests that drawing that starts from the common (anchor) point, or the handgrip position, is probably better for proprioception. Moreover, the line length varied among the participants. The horizontal lines were drawn in an orderly way, roughly parallel to the screen frame. The drawing outcomes on the C-Line layout looked like parabolic curves. The curved layouts, which required swiping radially, relied heavily on physical human factors, resulting in a high number of outliers.

Table 4 Mean error (units) in line drawing layouts

The mean error was minimum for the V-Line layout (5.48), followed by the H-Line (8.22), D-Line (9.60), and C-Line (10.10) layouts. Then, the repeated measures ANOVA was performed to investigate the difference in the mean error among lines for each layout. There was a significant effect on the mean error of lines for the V-Line layout (\({F_{3,21}}\) = 3.46, p = 0.02) and the D-Line layout (\({F_{2,21}}\) = 9.36, p = 0.00). Line 4 on the V-Line layout (3.99) and Line 2 on the D-Line layout (7.32) provided the lowest mean error. However, the ANOVA did not show a significant effect on the mean error of lines for the H-Line layout and the C-Line layout.

Fig. 9 Outcomes on button layouts

Figure 9 shows the experimental outcomes for tapping tasks. Touch positions for Button 1–4 are presented in blue, red, grey, and yellow, respectively. It was found that the unstructured layouts had highly dispersed outcomes. The straight alignment layouts provided narrow strip outcomes as opposed to the curved layouts. Task performance of the unstructured layouts is presented in Table 5.

Table 5 Mean distance error (units) of the unstructured layouts under single layout presentation mode
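To make the metric concrete, the sketch below computes a mean distance error under the assumption, made here purely for illustration, that the error of a single touch is the Euclidean distance between the touch point and the target centre. This subsection does not restate the definition, and the function name and sample coordinates are hypothetical.

```python
import math


def mean_distance_error(touches, target):
    """Average Euclidean distance from each touch point to the target centre."""
    if not touches:
        raise ValueError("no touch points")
    return sum(math.dist(p, target) for p in touches) / len(touches)


# Hypothetical touch points (screen units) around a target centred at (0, 0).
touches = [(3, 4), (6, 8), (0, 5)]
print(mean_distance_error(touches, (0, 0)))  # (5 + 10 + 5) / 3
```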

The paired samples t-test was performed on the mean distance error of the Un-1 and Un-2 layouts between the two experiments that had a different mode of presentation (\({t_{21}}\) = 3.05, p = 0.00). The result revealed that the mean distance error for the unstructured layouts in Experiment 2 was substantially lower than that in Experiment 1. In brief, the performance accuracy was improved with the single layout presentation mode. Thus, the result supported H4 that simultaneous presentation or a unified layout was more advantageous to spatial memory than serial presentation.

Task performance of the structured layouts and the button accuracy on the normal layouts and divided pattern layouts are presented in Tables 6 and 7. The repeated measures ANOVA showed a significant main effect of alignment on the mean distance error in the normal layouts (\({F_{4,21}}\) = 5.53, p = 0.001) and in the divided pattern layouts (\({F_{3,21}}\) = 9.95, p = 0.00). The horizontal, vertical, diagonal, and curved layouts thus differ significantly in the mean distance error. For the normal layouts, the mean distance error was minimum on the H layout (7.60), followed by the VR, VL, D, and C layouts. However, performance accuracy on the VR and VL layouts was not significantly different. This might be because both layouts had the same alignment and vertical symmetry. For the divided pattern layouts, the mean distance error was minimum on the H-Div layout (7.56), followed by the D-Div, V-Div, and C-Div layouts. It is interesting to note that the performance accuracy of the D-Div layout was better than that of the V-Div layout. Thus, H5, that button layouts in different alignments differ significantly in performance accuracy, is confirmed.

Table 6 Mean distance error (units) of the normal layouts
Table 7 Mean distance error (units) of the divided pattern layouts

To investigate the effect of button positions, the repeated measures ANOVA was performed on the mean distance error among buttons for each layout. There was a significant main effect on the mean distance error for the H-Div layout (\({F_{3,21}}\) = 5.71, p = 0.00), the C layout (\({F_{3,21}}\) = 4.27, p = 0.00), and the C-Div layout (\({F_{3,21}}\) = 6.56, p = 0.00). Button 4, whose position is near the anchor point and reference frame, provided more accuracy. However, the ANOVA did not show a significant effect of button position on the mean distance error for the H, VL, VR, V-Div, D, and D-Div layouts (p > 0.05). The performance accuracy was not substantially different among the four buttons on these layouts. Therefore, H6 regarding the effect of button positions is only partially confirmed.

To investigate the effect of the interaction technique, the paired samples t-test was performed on the mean distance error between pairs of layouts that have equivalent spatial discrimination, i.e., the H layout versus the V-Line layout and the vertical button layouts (VL and VR) versus the H-Line layout. The mean distance error on the V-Line layout was substantially lower than the mean distance error on the H layout (\({t_{21}}\) = 3.58, p = 0.00). However, no significant difference was found between the VL and H-Line layouts (\({t_{21}}\) = 1.32, p = 0.20), or between the VR and H-Line layouts (\({t_{21}}\) = 0.73, p = 0.47). Thus, H7, that button layouts and line drawing layouts with matched spatial positions differ in performance accuracy, was only partially confirmed.

Finally, we performed a paired t-test to see the effect of spacing patterns on each alignment. No significant difference was found between the H and H-Div layouts (\({t_{21}}\) = 0.09, p = 0.93), between the VR and V-Div layouts (\({t_{21}}\) = 0.71, p = 0.48), between the D and D-Div layouts (\({t_{21}}\) = 1.14, p = 0.27), or between the C and C-Div layouts (\({t_{21}}\) = 1.60, p = 0.12). Thus, H8, that task performance on the divided pattern layouts differs from that on the normal layouts, was not supported.
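The paired comparisons above can be illustrated with a minimal Python sketch. The function `paired_t` is a hypothetical helper, not the software used in the study. The second half only compares the reported t statistics with the standard two-tailed critical value of Student's t for 21 degrees of freedom at the 0.05 level (2.080); it does not recompute anything from raw data.

```python
import math
import statistics


def paired_t(a, b):
    """Return (t, df) for a paired-samples t-test on two equal-length sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Two-tailed critical value of Student's t for df = 21 at alpha = 0.05.
T_CRIT_DF21 = 2.080

# t statistics as reported for the spacing-pattern comparisons.
reported = {
    "H vs H-Div": 0.09,
    "VR vs V-Div": 0.71,
    "D vs D-Div": 1.14,
    "C vs C-Div": 1.60,
}
significant = {pair: abs(t) > T_CRIT_DF21 for pair, t in reported.items()}
print(significant)  # every pair maps to False
```

All four reported statistics fall below the critical value, in line with the non-significant p-values given in the text.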

5.4 Feedback about the interface preference

Participants ranked the preferred layouts without knowledge of the performance accuracy. The popularity of the divided pattern layouts increased slightly, by 4.6 percentage points. Participants who liked the divided pattern explained that this pattern provides additional reference positions for segmentation. However, most participants (72.7%) preferred the layouts with an equal distribution. For the normal layouts, the top score was for the VR layout (3.73), followed by the H layout (3.59), the C layout (3.36), the VL layout (2.18), and the D layout (2.14). The score differs dramatically between the VR and VL layouts. Many participants claimed that buttons on the left side are hard to reach when using large mobiles. For the divided pattern layouts, the V and H layouts had the same preference level (2.82), higher than the C layout (2.73), while the D layout (1.64) had the lowest score. For the line drawing, most participants preferred drawing on the D-Line layout (2.86), followed by the C-Line layout (2.59), the H-Line layout (2.32), and the V-Line layout (2.23).

6 Discussion

The task performance demonstrates the effect of interface configuration on eyes-free touchscreen interaction that relies on spatial memory and proprioception. In this study, the visual interfaces were presented indirectly on the desktop display for participants to memorize before performing eyes-free interactions on the touchscreen mobile. All targets were presented within a frame similar to the touchscreen frame for reference. The results showed that the participants were able to learn positions and spatial relations among buttons within a reference frame and to interact on a touchscreen using their short-term memory. They had limited time to construct mental imagery of the spatial interface but could use this spatial understanding to map the positions accurately. Being able to see the unified frame configuration, the participants performed the tapping task on the unstructured layouts better in Experiment 2. The touch positions of the Un-1 and Un-2 layouts shown in simultaneous presentation mode are much more precise than those in Experiment 1. These results are in line with Pieroni et al. [30]. In other words, interface layouts whose target locations were presented in a single frame led to better recall of spatial position. All targets should be presented within a unified reference frame as opposed to sequential or separate presentations.

As expected, performance accuracy was better for the structured layouts than for the unstructured layouts. The structured pattern provided salient features of organization and distance relation and therefore enhanced the spatial mapping process. Tversky [46] claimed that mental load was decreased with schematization because the relevant information was compressed and captured well. Thus, layouts with nearby and related objects promoted recall performance.

In Experiment 1, the participants were required to memorize both line patterns and spatial positions among four layouts that competed for memory. However, performance on the H-Line layout, presented last, was still satisfactory. The outcomes of the button layouts also showed the superiority of horizontal alignment for spatial memory. The participants interacted with positions on the horizontal layouts effectively. The results are in line with previous findings that visual ability is greater in the horizontal direction and that spatial memory is better for patterns that are symmetrical about the vertical axis [38, 47].

The alignment of the layout impacted eyes-free performance. The layouts in straight alignment provided better task performance than those in curved alignment. The mean distance error was minimum on the horizontal layouts, followed by the VR, VL, and D-Div layouts. The curved layouts gave the poorest touch accuracy. Button accuracy within a layout was similar, except for the H-Div, C, and C-Div layouts. Among the four buttons of the structured layouts, Button 4, which has the shortest distance from the anchor point or is located on the right side close to the palm, seems to provide more accuracy. Moreover, the positions near the anchor point and the reference frame provided tactile cues for effective orientation and offered good proprioceptive accuracy. Overall, the straight layouts with an equal button distribution yielded stable spatial discrimination performance.

Surprisingly, although the spaces between Button 1 and Button 2 and between Button 3 and Button 4 on the divided pattern layouts are smaller because of the middle reference button, performance accuracy on the divided pattern layout, which contains five buttons, is equivalent to performance on the normal four-button layout. The middle position appears to be a useful clue for spatial discrimination. In other words, the divided pattern layouts containing five buttons could benefit from the clues provided by a middle anchor position. Furthermore, the accuracy of buttons near the middle position on the divided pattern layouts clearly improved when each button was tested among four layouts in Experiment 1. Therefore, middle segmentation strengthened spatial recognition and task performance.

Drawing a line seemed to provide better performance accuracy than the button layout as it offered navigation and adjustment during the drawing process. It was found that the V-Line layout with the same spatial discrimination as the H layout offered a lower mean distance error. However, there was a significant difference in performance accuracy between the vertical lines while there was no significant difference in performance accuracy between buttons in the H layout. In addition, drawing a line required a higher physical workload and completion time. For these reasons, adopting a drawing layout was suggested only for extra interaction vocabulary.

Post-test feedback from the participants on the interface preference was useful for interpreting the results. The participants preferred the normal layouts whose buttons are evenly spaced consistently rather than the divided pattern layouts. Most participants agreed that the horizontal button and the horizontal line layouts were the easiest patterns to remember. They also preferred the vertical layout on the right and the D-Line layout because these layouts required the natural thumb posture orientation from the anchor point and close to the palm.

7 Implications for design

The findings of the present study improve our understanding of innate human abilities and offer insight into designing an effective eyes-free interface. Based on the results from the two experiments, we developed a design framework of interface configuration that supports non-visual touchscreen interaction. The design framework consists of seven characteristics of interface configuration (see Fig. 10). The left four pillars involve the interface presentation that would promote spatial recognition and memory, while the right three pillars involve the interaction area that would support proprioception. When developing an eyes-free touchscreen interface, designers should consider the following guidelines for enhancing the performance accuracy that arises from spatial memory and proprioception.

Fig. 10 Design framework of eyes-free interface

  1. Structure with evenly spaced buttons: Design structured patterns that align each button with even spacing. The structured layout provides redundant spatial cues, continuation, and regularity. Putting buttons into a straight line with an equal distribution pattern forms an effective structured layout, facilitating the spatial discrimination process for non-visual touchscreen interaction.

  2. Unified frame: Unite the interface elements in a single frame. Users tend to describe the relation of an object with respect to another object and a reference frame. Presentation of the related objects in a single frame facilitates perception. Users can integrate various spatial objects through the same view and frame of reference under the schema. Therefore, the mental image of an interface can be constructed effectively for eyes-free interaction.

  3. Horizontal alignment: Make horizontal alignment the first priority when designing eyes-free interfaces. This alignment provides better spatial memory because the person’s field of vision scans horizontally. Moreover, the horizontal alignment has vertical symmetry and fully exploits the physical features of the device (the bottom base, the left and right sides of the screen). Thus, spatial discrimination on the horizontal layout provides good performance accuracy.

  4. Middle segmentation: Using the middle button, or the halfway location in equal proportion, as a reference to the other buttons in the layout could facilitate the spatial discrimination process. Although the choice between an odd and an even number of buttons depends on the interaction area and on a compromise between target size and spacing size, an odd number of targets would be beneficial because the middle button of an odd button series could be used as an additional anchor point. Using the middle segmentation as a reference reduces the workload in eyes-free interaction. Thus, if applicable, middle segmentation should be adopted.

  5. Proximity to device frame within comfortable thumb range: Design layouts that fully exploit the physical features of the device (edge, side, corner) because these features are stable, easily distinguished, and universal among users. The frame of reference is the vital cue to spatial memory and proprioception. The relative distances between objects in a fixed frame are logically and proportionally coded in a mental map. The greater the number of reference frames on the layout, the greater the precision. The more closely the button is aligned to the reference frame, the greater the performance accuracy. In addition, it is essential to put the interface in the comfortable thumb area because uncomfortable gestures reduce performance accuracy. Positions that are out of reach or too low require additional supportive micro-movements. The interface area should not exceed the thumb length or the comfortable, natural thumb position.

  6. Symmetry in a square: Adopt a square area to configure the interface, as the square offers the most lines of symmetry. Symmetry facilitates spatial memory in human visual perception. The horizontal, vertical, and diagonal layouts constitute the grid square symmetry. Vertical buttons that are placed with the same spacing as the horizontal alignment strengthen spatial memory and muscle memory. The diagonal alignment from the bottom right corner could exploit the detection of diagonal symmetry and suggest the middle position of layouts, which is used as the virtual reference axis for the spatial discrimination process. These simple and familiar relationships enhance human cognition, resulting in better proprioception.

  7. Along thumb flexion direction: Design layouts that constrain the degree of thumb movement. Buttons in the area along the thumb axis can be discriminated more easily using the hinge of the thumb. If the posture is stable and certain, spatial acuity improves. Distance discrimination from the hinge joint (flexion) gives strong spatial precision as opposed to angular discrimination from the knuckle joint. Thus, motion along the thumb flexion direction provides better touch accuracy than lateral motion. The horizontal, vertical, and diagonal layouts from the bottom right corner in a square grid area are examples of one-dimensional alignment in the thumb flexion direction.

8 Conclusions

This study examined the role of interface configuration in eyes-free interaction with touchscreens. A total of 15 layouts with different presentation modes, alignments, spacing patterns, and interaction techniques were investigated in two experiments. The results showed that the alignment and mode of presentation affect visual perception and spatial integration. Spatial memory of interface layouts is evoked in the mind and works in harmony with proprioception. Thus, a person can recognize and harness the fixed spatial position of a target and reach it instinctively with less visual attention, enabling eyes-free touchscreen interaction. Interface configurations affect finger gestures and, in turn, performance accuracy. Under single-handed thumb interaction in portrait mode, the horizontal layout and the vertical line drawing layout offer the best performance, followed by the vertical and diagonal layouts. The accuracy of the buttons within their own layouts was mostly similar. Furthermore, according to participants’ feedback, the horizontal and vertical right layouts as well as the diagonal line layout, which require the natural thumb posture orientation from the common anchor point close to the palm, were widely accepted as easy to interact with. Finally, an interface design framework was proposed for eyes-free touchscreen interaction. To enhance the performance accuracy that arises from proprioception and spatial memory, the eyes-free interface should be configured in a structured pattern with evenly spaced buttons, presented in a unified frame, set with horizontal alignment, and allow middle segmentation. The interface elements should be positioned along the thumb flexion direction, in the area that provides symmetry in a square, and in proximity to the device frame within a comfortable thumb range.

With eyes-free interface layouts, users can input an interface control under a reduced level of visual attention. Our findings can inspire new applications such as the design of the shortcut menu layout on a mobile, and the interface layout for operation control on any touchscreen application. In the future, a touchscreen mobile could be used as a universal peripheral control device. While users are interacting indirectly on a touchscreen, they need not switch their attention.

In future research, we plan to investigate the minimum threshold or appropriate button size and spacing size for eyes-free interfaces as well as the practicalities of interface applications in various contexts (e.g. multitasking). Future work should also test the effect of preferred learning style (the cognition and interaction mode) on task performance.