1 Introduction

Touchscreens are becoming increasingly prevalent across a wide range of devices, including smartphones, computers, vehicles, kiosks, printers, and home appliances. Although these interfaces are intuitive and easy to use, the absence of tactile feedback presents critical accessibility concerns for the 285 million visually impaired people worldwide [1]. In addition, the location of interaction controls on touchscreens varies, which makes it impossible for blind users to learn or memorize how to navigate them. With the exponential growth of the mobile device market, touchscreen-based interaction has quickly penetrated areas ranging from education and government to enterprise. This could significantly reduce the independence of, and equal opportunities for, the visually impaired community.

To improve accessibility on touchscreens, current Human-Computer Interaction (HCI) research focuses on gesture designs and vibration or auditory feedback. However, touchscreen technology itself may advance in the very near future with physical cues and pressure sensors. Tactus [2] provides a tactile solution by enabling application-controlled transparent physical buttons that dynamically rise from the touch surface on demand. Neonode [3] introduces Z-force to allow accurate detection of different pressure levels on the touchscreen. Unfortunately, little is known about how these changes would affect blind users’ touchscreen experiences.

This paper discusses findings from usability evaluations of simple accessibility solutions on touchscreens, including a simulated “touch and press” interaction design that is not yet available on commercial products. The gesture designs selected in this investigation met the following criteria: (1) one-finger gestures (to enable one-handed thumb operation); (2) no prior knowledge or training required; and (3) support for common interaction needs such as navigation, selection, and text entry.

Data from this study showed that “touch and press” had a 0 % error rate and was faster than the other gestures. This result encourages further investment in technological evolution to make touchscreens more inclusive, rather than focusing on complex gesture designs. In addition, participants’ feedback and behavior patterns in the study confirmed a number of design guidelines, some of which were already known to the accessibility community.

2 Related Work

A large body of research has been carried out in the last decade to improve the accessibility of touch interfaces. These innovative techniques can be categorized into two areas: (1) gesture-based interaction for input or selection, and (2) screen overlays with tactile feedback such as texture or vibration, and auditory feedback such as earcons or speech. These approaches alleviate the accessibility concern, but not without problems.

VoiceOver is an accessibility feature introduced on the iPhone in 2009. Users receive auditory feedback by gliding a finger over the screen content and use a split-tap (i.e., touching an item with one finger while tapping the screen with another finger) to confirm selection. A single-finger swipe moves the readout focus to the next item, which matches blind users’ mental model of navigation. But as Vidal and Lefevre report in their comparison study [4], users find it difficult to discover and accurately reproduce certain VoiceOver gestures.

Several Braille-based typing techniques have been adapted to touchscreens to support eyes-free data entry (e.g., BrailleTouch [5], PerkInput [6], and TypeInBraille [7]). Although these methods have significantly improved typing efficiency and accuracy, they have limited applicability, as fewer than 10 percent of legally blind people in the United States can use Braille [8]. Meanwhile, it is difficult to generalize these accessible text entry methods to information browsing tasks on touchscreens.

To evaluate the effectiveness of gesture designs, Oliveira et al. [9] compared four methods: QWERTY, MultiTap, NavTouch, and BrailleType. They conclude that blind users with low spatial skills are unlikely to perform well with QWERTY and MultiTap, while NavTouch and MultiTap are more demanding in terms of memory and attention. Multi-touch interaction such as split-tap is ineffective for users with low pressure sensitivity. Multi-touch gestures were not chosen in this investigation because they require two-handed operation for mobile users.

Various haptic overlay solutions have also been explored to compensate for the absence of tactile feedback on touchscreens. Touchplates [10] uses inexpensive plastic guides placed on the touchscreen that can be recognized by the underlying software application. This approach is low-cost and easy to adopt, but it is unlikely that users will carry a set of static Touchplates for real-world use.

MudPad [11] uses electromagnets combined with an overlay of magnetorheological fluid to create instant, multi-point perceivable feedback, from different levels of surface softness to dynamically changeable textures. Similarly, Thermal signs [12] use Peltier micropumps as temperature dots to render simple graphics, introducing dynamic tactile feedback to touchscreen interaction. Usability evaluations are not yet available for these methods, and further investigation is needed to examine their effectiveness, efficiency, and development cost.

Compared to texture- or temperature-based touch feedback, physical buttons provide more intuitive and effective tactile cues. To create physical buttons that can be dynamically controlled, Harrison and Hudson propose a technique [13] to raise, recess, or remove pre-defined buttons on touch interfaces. Following this approach, the Tactus Tactile Layer [2] uses dynamic microfluidic pressure to raise buttons on demand. Research in these areas has shed light on a promising future in which physical buttons on touchscreens can be shaped and located on the fly.

In summary, gesture interaction allows fast but less accurate input. It brings new challenges to the blind community, as 82 % of blind people are 50 years or older [1] and many have cognitive or motor disabilities [14]. It is difficult for them to remember, distinguish, and accurately reproduce the required gestures like sighted users do [15]. Tactile feedback helps, but static or predefined physical overlays suffer from inflexibility. Assistive reading software offers auditory feedback about the visual contents, yet a great deal of navigational information is often lost in the readout.

3 Experiment Design

The main purpose of this study was to understand the key attributes that make accessibility solutions effective, efficient, and easy to use. It aimed to answer the following questions: (1) What needs and concerns do blind users have regarding the assistive technologies available in their daily lives? (2) How do gesture designs affect blind users’ task performance and satisfaction on touch interfaces? (3) How can auditory feedback in assistive solutions be improved? A three-phase research study was carried out to answer these questions:

Phase I. One-on-one interviews with visually impaired participants to identify their needs and concerns with touch based interactions.

Phase II. Examination of five methods of gesture-based touchscreen interaction via blind participants’ performance and perception.

Phase III. Evaluation of prototypes developed from the Phase II findings, with additional auditory feedback and confirmation of selection, to assess perceived effectiveness and efficiency with blind participants.

Twelve participants were recruited in this three-phase study (see Table 1).

Table 1. Participant information

In Phase I, one-on-one semi-structured interviews were carried out with each participant to identify their needs and concerns regarding touch-based interaction. All interviews focused on understanding how their everyday activities were supported (or limited) by technologies. The interviews covered questions such as which of their devices required touchscreen interaction, what assistive tools or technologies were available to them, and what needs or concerns they had about accessibility improvements on these devices.

In Phase II, five interaction methods (see Table 2) were selected to investigate users’ task performance (speed and accuracy) and satisfaction.

Table 2. Gestures examined in phase II

In this within-subject experiment, each participant completed 15 tasks, 3 for each gesture method. For each task, they were asked to start from a pre-defined screen position (one of the 16 circled spots in Fig. 1) and navigate to a pre-defined target (one of the 9 buttons in Fig. 1) as quickly and accurately as possible. Task assignments were randomized to reduce learning effects.

Fig. 1. Illustration of prototype used in phase II

All tasks were performed on a 15” touchscreen interface. Performing a defined gesture started the voice readout immediately, and the current readout stopped as soon as a new readout was initiated. Every voice readout indicated what the current target was and how to select it, e.g., “Tap to select [target]” or “Press down to select [target]”. Participants’ preferences on readout speed were measured, but the voice readout was played at 156 words per minute (wpm) in this study for consistency.
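As an illustration of this readout behavior (a minimal sketch, not the study's actual prototype code), the snippet below shows how a new gesture event could interrupt the current readout and announce the target together with the selection hint for the active method; the speak and stop_speech callables stand in for whichever text-to-speech engine is assumed.

```python
# Minimal sketch of the Phase II readout behavior described above.
# `speak` and `stop_speech` are placeholders for a TTS engine; the hint
# strings mirror the readout wording used in the study.
SELECT_HINT = {
    "touch_press": "Press down to select",
    "touch_lift": "Lift to select",
    "touch_tap": "Tap to select",
    "touch_double_tap": "Double tap to select",
}

def on_focus_changed(target_label, method, speak, stop_speech, rate_wpm=156):
    """Called whenever a gesture moves the readout focus to a new target."""
    stop_speech()                                   # interrupt any ongoing readout
    hint = SELECT_HINT.get(method, "Tap to select")
    speak(f"{hint} {target_label}", rate=rate_wpm)  # e.g. "Press down to select Copy"
```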

Because pressure-sensing was not available on the touchscreen used in this study, the “press down” gesture was simulated via a Wizard of Oz approach – the moderator activated the selection when the participant performed an explicit press-down gesture.

The five methods were evaluated via the following dependent variables: (1) task completion time, measured as the time elapsed from navigation start to target selection; (2) error rate, measured as the number of incorrect selections divided by the number of tasks; and (3) perceived ease of use, (4) perceived learnability, and (5) perceived satisfaction, each a subjective rating on a 7-point Likert Scale.
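The two objective measures could be derived from trial logs as in the short sketch below; the trial fields are hypothetical and do not reflect the study's actual logging format.

```python
# Illustrative computation of task completion time and error rate,
# assuming each trial is logged with a navigation start time, a
# selection time, and whether the correct target was selected.
from statistics import mean

def summarize_trials(trials):
    completion_times = [t["selected_at"] - t["nav_start"] for t in trials]
    error_rate = sum(1 for t in trials if not t["correct"]) / len(trials)
    return mean(completion_times), error_rate

# Example: two trials, one incorrect selection -> error rate 0.5
demo = [{"nav_start": 0.0, "selected_at": 21.3, "correct": True},
        {"nav_start": 0.0, "selected_at": 64.8, "correct": False}]
print(summarize_trials(demo))
```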

In Phase III, the investigation focused on the effectiveness of auditory feedback. A prototype was developed to simulate the printer interaction experience because: (1) it offers a range of task complexity levels; and (2) no participant had seen or used this interface previously. The gesture design chosen for this prototype was based on the testing results from Phase II.

The evaluation in Phase III included four tasks (as shown in Fig. 2): Copy, Fax, Search, and Email. All tasks started from the Home screen, and participants were asked to follow the voice feedback to complete each task. The blue arrows in Fig. 2 mark the correct routes to complete each task. Incorrect navigation paths were also available in this prototype, with corresponding voice feedback, as marked with black arrows in Fig. 2. Each task was coupled with one of the four prototypes (Table 3), presented in a balanced Latin Square order to avoid learning effects.
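A balanced Latin Square of order four, as used for the task/prototype ordering above, can be generated as sketched below (a generic construction, not the authors' tooling); each row gives one presentation order, and every condition precedes and follows every other condition equally often.

```python
# Generic balanced Latin Square construction for an even number of
# conditions; rows are presentation orders assigned across participants.
def balanced_latin_square(n):
    first, lo, hi = [], 0, n - 1
    for k in range(n):
        first.append(lo if k % 2 == 0 else hi)
        lo, hi = (lo + 1, hi) if k % 2 == 0 else (lo, hi - 1)
    return [[(c + r) % n for c in first] for r in range(n)]

orders = balanced_latin_square(4)
# orders[0] == [0, 3, 1, 2]; each condition appears once per position,
# and each ordered pair of adjacent conditions occurs exactly once.
```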

Fig. 2. Illustration of prototype used in phase III

Table 3. Voice feedback used in phase III

Fig. 3. Task completion time in phase III (in seconds)

The following variables were measured: (1) task completion time; and, on a 7-point Likert Scale, participants’ pre-task expectation and post-task perception ratings of (2) ease of use, (3) error likelihood, (4) effectiveness and (5) efficiency of the speech prompt, (6) effectiveness and (7) efficiency of the confirmation, and (8) satisfaction.

4 Results

4.1 Phase I – Interview

The outcome of the interviews confirmed the increasing accessibility challenges introduced by touchscreen interaction. The 12 blind participants expressed strong demand for the following accessibility improvements:

  • Equal opportunities to access information and technologies as sighted people. Equal opportunity in education and employment has always been a major concern in the visually impaired community. Many felt the situation had been worsened by recent technological advancements that require visually demanding interaction.

  • To use mainstream devices via effective yet inexpensive assistive technologies. Although specialized devices or applications are more user-friendly, they are often much more expensive. Some mentioned that accessibility features such as VoiceOver are also needed by sighted users, for example when driving. Accessibility solutions should therefore be available on mass-market devices.

  • Adjustable speed for screen readout to optimize efficiency. For blind users, auditory output is the main channel for receiving information, and many have developed the ability to comprehend speech at much faster rates. Data from the Phase II experiment show that participants’ preferred speech rate averaged 256 wpm (ranging from 187 to 421 wpm). Being able to personalize the speech rate would most likely improve the efficiency of their information processing (a minimal configuration sketch follows this list).

  • Auditory feedback on the touch interfaces of home appliances and office devices. Participants emphasized their need for independence and privacy. One totally blind participant asked a sighted friend to place Braille tags on the flat touch panels of all her home appliances so that she could use them without additional help. Inclusive designs are in demand for office devices (e.g., printers) as well, to allow blind users to live and work independently.

  • Simple and intuitive touch gestures that are easy to discover, remember, and use. Similar to the findings reported by Kane et al. [15], participants suggested that gestures designed for blind users should (1) rely on physical orientation references such as screen edges and corners, and (2) reduce requirements on accuracy, speed, and complexity. Successful gesture designs should work for blind users across a wide range of ages, cognitive and motor capabilities, and educational levels.
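As one way to realize the adjustable readout speed requested above, the sketch below uses the open-source pyttsx3 library (an assumed choice; the study does not specify an engine), whose rate property is expressed in words per minute.

```python
# Minimal sketch of user-adjustable speech rate with pyttsx3 (assumed
# library, not the study's prototype). 256 wpm is the mean preferred
# rate reported in Phase II; a real design would expose this as a setting.
import pyttsx3

engine = pyttsx3.init()
engine.setProperty("rate", 256)          # speech rate in words per minute
engine.say("Press down to select Copy")
engine.runAndWait()
```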

4.2 Phase II – Gesture Design

Quantitative data collected in this research were analyzed with one-way Analysis of Variance (ANOVA). Findings are discussed in the following sections.
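For reference, a one-way ANOVA over per-method completion times can be run as sketched below with scipy (an assumed tool; the paper does not name its analysis software).

```python
# Sketch of the one-way ANOVA reported below: completion times grouped
# by gesture method. Returns the F statistic and p value.
from scipy import stats

def gesture_anova(times_by_method):
    """times_by_method: dict mapping each gesture method to a list of
    task completion times in seconds."""
    return stats.f_oneway(*times_by_method.values())
```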

4.2.1 Task Completion Time

Gestures had a significant impact on completion time (F4,175 = 9.60, p < .001), see Table 4. However, the speed difference among the touch-for-feedback gestures (touch-press, touch-lift, touch-tap, and touch-double tap) was not significant (F3,140 = 0.19, p = .902). The main contributors to task completion time were (1) the navigation gesture (touch vs. tap), where Meantouch = 27.56 s and Meantap = 71.27 s (F1,178 = 38.63, p < .001); and (2) the selection gesture (non-double-tap vs. double-tap), where Meannon-d-tap = 27.46 s and Meand-tap = 49.58 s (F1,178 = 13.09, p < .001).

Table 4. Task completion time (in seconds)

Participants’ vision status also affected their task efficiency. In general, legally blind participants were able to complete tasks faster than totally blind participants: MeanLegalBlind = 20.61 s, MeanTotalBlind = 52 s (F1,178 = 29.86, p < .001). A two-way ANOVA reports a significant interaction between Gesture and Vision (F4,170 = 4.88, p = .001).
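The Gesture x Vision interaction reported above corresponds to a two-way ANOVA, sketched below with statsmodels (again an assumed tool), where the data frame holds one row per trial.

```python
# Sketch of a two-way ANOVA with interaction (Gesture x Vision),
# assuming a pandas DataFrame `df` with columns 'time', 'gesture', 'vision'.
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

def gesture_by_vision_anova(df):
    model = smf.ols("time ~ C(gesture) * C(vision)", data=df).fit()
    return anova_lm(model, typ=2)   # F and p for main effects and interaction
```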

4.2.2 Error Rates

The performance data showed that error rate was significantly affected by gesture (F4,175 = 6.21, p < .001), see Table 5. The error rate with tap-for-feedback was about six times higher than with touch-for-feedback: MeanTouch = 6.94 %, MeanTap = 50.00 % (F1,178 = 22.46, p < .001). Double tapping for selection also had a much higher error rate than the other selection gestures: Meannon-d-tap = 6.48 %, Meand-tap = 29.17 % (F1,178 = 8.71, p = .004). Error rate was not significantly affected by Vision (F1,178 = 2.10, p = .149).

Table 5. Error rate

4.2.3 Subjective Ratings

A 7-point Likert Scale was used to collect perception ratings (1 lowest, 7 highest), and participants were also asked to rank their preference among the five methods (1 lowest, 5 highest) after completing all tasks. Significant differences were found in all subjective ratings (see Table 6) except perceived learnability (F4,55 = 1.62, p = .182).

Table 6. Subjective ratings and overall ranking

Gestures had an impact on perceived Ease of Use (F4,55 = 5.02, p = .002). Specifically, the ratings were heavily influenced by whether one could touch or had to tap for speech feedback: Meantouch = 5.9, Meantap = 3.8 (F1,58 = 20.02, p < .001). Totally blind participants gave lower ratings, MeanLegalBlind = 6.1, MeanTotalBlind = 4.9 (F1,58 = 8.72, p = .005), whereas younger participants seemed to have higher expectations and thus gave lower ratings on Ease of Use, Mean18-34yr = 4.0, Mean35-54yr = 6.3, Mean55-74yr = 5.8 (F2,57 = 11.94, p < .001).

Although participants commented that all five methods were fairly easy to learn, a closer look revealed that the perceived Learnability was impacted by age: Mean18-34yr = 5.0, Mean35-54yr = 7.0, Mean55-74yr = 6.4 (F2,57 = 20.09, p < .001), and vision: MeanLegalBlind = 6.6, MeanTotalBlind = 5.9 (F1,58 = 4.4, p = .040). Being able to touch for feedback also made the gesture easier to learn, Meantouch = 6.4, Meantap = 5.5 (F1,58 = 6.37, p = .014).

Similar to the ratings on Ease of Use, Satisfaction ratings were impacted by gesture (F4,55 = 2.80, p = .035), vision (F1,58 = 8.71, p = .005), age group (F2,57 = 10.41, p < .001), and whether participants could touch for speech feedback (F1,58 = 11.60, p = .001).

The overall ranking of Method 5 (tap for feedback, double-tap for selection) was significantly lower than that of the other four methods (F4,55 = 3.41, p = .015), as participants were frustrated by the tap-for-feedback gesture (F1,58 = 11.89, p = .001).

4.2.4 Discussion

In debriefing, participants explained why tap-for-feedback was particularly difficult for blind users: (a) it had no point of reference – for totally blind users, tapping to find a target was like “taking a stab in the dark”; (b) it was very easy to miss the target – sometimes they tapped on a target but moved away too quickly and missed the voice feedback; (c) continuous tapping on the same target was registered as a double-tap, which resulted in a selection error rather than voice feedback; and (d) several participants found the target quickly, but their selection was slightly off target, so they had to spend more time re-finding and selecting it. In addition, individuals double-tapped at various speeds, which made it difficult for the system to distinguish a slow double-tap from two quick single-taps.
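The double-tap ambiguity described above comes down to a timing threshold, as in the hypothetical sketch below: any fixed window will misclassify users whose natural double-tap is slower than the threshold.

```python
# Hypothetical tap classifier illustrating the ambiguity above.
# The 350 ms window is illustrative, not a value from the study.
DOUBLE_TAP_WINDOW_S = 0.35

def classify_taps(t_first, t_second, same_target):
    gap = t_second - t_first
    if same_target and gap <= DOUBLE_TAP_WINDOW_S:
        return "double_tap"        # registered as a selection
    return "two_single_taps"       # each tap triggers voice feedback only
```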

Data from the Phase II experiment indicate:

Touch-for-feedback allowed users to find a target significantly faster and more accurately than tap-for-feedback. This gesture would have worked even more effectively if there had been a tactile difference between target and non-target areas.

The selection gestures examined in this study had their pros and cons:

  • Despite users’ familiarity with them, tapping or double tapping to select a target resulted in great frustration and higher error rates, as both required the user to tap accurately on the target once it was found. Split-tapping may alleviate this concern, but it is not applicable to one-handed thumb operation.

  • Lift-to-select was easy to use and learn, but users could easily make mistakes if they accidentally lifted their finger off the wrong target. For the typically compact layouts on mobile touchscreens, this gesture can be stressful because the user must keep a finger on the screen to avoid unintentional selections.

  • Press-down-to-select was the fastest and most accurate method, but most participants were not yet accustomed to this new gesture. Because the prototype did not offer tactile feedback, a few participants were uncertain how hard they had to press; on average, participants’ preferred force was 0.45 newtons for Touch and 5.50 newtons for Press (see the sketch below). The strength of this gesture will likely be reinforced by emerging technologies [2, 13] that offer instant, dynamic physical feedback on touchscreens.

Auditory feedback must be prompt in assistive designs for blind users. When users rely on auditory information to “see” the touch interface, any delay in feedback will lead them to either wander away or perform their selection again, which may result in errors.
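As referenced in the press-down bullet above, a pressure-enabled touchscreen could separate a light exploratory touch from a deliberate press with a simple force threshold. The sketch below uses the average preferred forces reported in this study, but the threshold choice and the surrounding API are assumptions, not part of the prototype.

```python
# Hypothetical force-based classifier for touch vs. press-down, using
# the average preferred forces reported above (0.45 N touch, 5.50 N press).
TOUCH_FORCE_N = 0.45
PRESS_FORCE_N = 5.50
PRESS_THRESHOLD_N = (TOUCH_FORCE_N + PRESS_FORCE_N) / 2   # midpoint heuristic

def classify_contact(force_newtons):
    if force_newtons >= PRESS_THRESHOLD_N:
        return "press_down"    # select the target under the finger
    return "touch"             # navigate and read out the target
```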

4.3 Phase III – Auditory Feedback Design

In Phase II, Method 1 (Touch-Press) outperformed the other gesture designs in efficiency and accuracy; therefore, it was chosen for the prototype used in Phase III.

4.3.1 Performance

Unsurprisingly, participants’ task completion time was significantly impacted by the nature of the tasks (F3,44 = 4.37, p = .009), see Fig. 3. However, the auditory feedback manipulations in this study did not influence performance as expected: neither whether the gesture was indicated in the prompts (F1,46 = 0.60, p = .443) nor whether speech confirmation was used (F1,46 = 0.32, p = .573) had a significant effect.

4.3.2 Perception

Participants’ subjective ratings are summarized in Fig. 4.

Fig. 4. Subjective ratings in phase III (on a 7-point Likert scale)

The perception ratings did not differ significantly across the four treatments, but it is interesting that, after trying out the prototypes, participants found mentioning the gesture in the prompt less helpful than they had expected. In contrast, speech confirmation was perceived as more helpful for avoiding errors, since earcons could be misunderstood. Overall, participants were able to use the audio cues to complete the tasks, but the interaction experience was not as satisfactory as they had expected.

4.3.3 Discussion

The tasks in Phase III were selected based on their complexity levels (Level 1: easiest, Level 4: hardest), defined by the steps required to complete each task:

  • Level 1. From Home to Copy screen, change copy number to 3, click on “Copy It”.

  • Level 2. From Home to Fax screen, enter the 10-digit number, click on “Fax It”.

  • Level 3. From Home to Search screen, enter the search query, click on “Search”.

  • Level 4. From Home to Email screen, go to Recipient screen to enter the email address, return to the Email screen, click on “Send”.

However, most participants were able to complete Fax faster than Copy. Because they were familiar with the standard numpad layout on the Fax screen, many could quickly enter any number once the “5” key was located. The layout of the Copy screen was foreign to all participants, and it took them a while to find where to enter the copy number, which was located at the bottom center of the screen. This finding indicates that familiarity with the screen layout critically affects blind users’ task performance. Participants’ behavior also suggested that key action controls should be located near physical reference points, such as any corner of the touchscreen.

For the Search and Email tasks, participants did not find the auditory feedback helpful when navigating a soft QWERTY keyboard. The challenges included:

  • It was difficult to distinguish letters that are phonetically similar and located close together on the keyboard (e.g., C, V, and B). Context was needed, such as “B as in Bob” (see the sketch following this list).

  • Error correction was painful. For example, to correct “vill report”, participants had to erase all entries rather than highlight and change “v” to “b”. Immediate confirmation of what was entered reduces correction effort.

  • All participants understood the layout of a QWERTY keyboard, but the motor memory of typing with both hands did not help when typing with one finger.
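As referenced in the first bullet above, adding phonetic context to the key readout is straightforward; the word list below is illustrative and not taken from the study.

```python
# Hypothetical phonetic-context readout ("B as in Bob") for letters that
# sound alike and sit close together on a soft QWERTY keyboard.
PHONETIC = {"b": "Bob", "c": "Charlie", "d": "David", "e": "Edward",
            "p": "Peter", "t": "Tom", "v": "Victor", "z": "Zebra"}

def key_readout(letter):
    word = PHONETIC.get(letter.lower())
    return f"{letter.upper()} as in {word}" if word else letter.upper()

# key_readout("v") -> "V as in Victor"
```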

In summary, for text entry on touchscreens, blind users would expect more sophisticated methods such as speech recognition or an attached keyboard.

It is interesting that legally blind participants in the study completed tasks faster when the gesture was not mentioned (MeanSayGesture = 320 s, MeanNoGesture = 241 s, F1,22 = 0.95, p = .339) and when an earcon was used (MeanSpeech = 326 s, MeanEarcon = 236 s, F1,22 = 1.25, p = .276), although neither difference was significant. All participants believed that if the gesture designs were consistent on a device, the gesture indication would only be necessary initially, or in a tutorial. But opinions were split about the use of earcons: some liked them because they were efficient and easy to understand, while others had concerns because (1) “it only makes sense to tech savvy users”; (2) users with hearing disabilities may have difficulty hearing the “click”; and (3) they were less effective in situations where explicit confirmation was desired (e.g., “Email was sent successfully.”).

5 Conclusions

This paper has reported findings from a three-phase research study investigating the needs and concerns around accessibility improvement, as well as how gesture designs and auditory feedback designs affect blind users’ touchscreen experience. The investigation confirmed the following guidelines for improving accessible designs on touchscreens:

  1. Avoid gestures that require precision (e.g., “tap on the target”). Otherwise, allow error tolerance by making the touch area for selection larger than the touch area for audio cues.

  2. For soft numpad or keyboard designs, use the standard layout and button sizes similar to those of a physical numpad or keyboard to match blind users’ motor memory.

  3. Use consistent gesture designs and offer tutorials for users to quickly familiarize themselves with the accessibility modes.

  4. Use gesture design and auditory feedback to support navigation and selection, but automatically switch to speech recognition or other specialized typing methods for text entry.

  5. Enable separate accessibility modes for low-vision vs. no-vision users. Totally blind users rely on prompt auditory feedback about the screen target they are touching, whereas low-vision users benefit more from features such as adjustable color contrast and zooming.

  6. Provide adjustable speed and volume of speech output to satisfy individual needs.

  7. Place critical interface elements (e.g., Home, Back, etc.) near physical reference points such as the screen corners or edges for easier orientation and faster access.

One limitation of this research is that all gestures required on-target selection, which resulted in poor performance and high frustration. Future experiments are under development to compare performance and perception among Touch-and-press, Touch and split-tap (as in VoiceOver on the iPhone), and Touch and double-tap-anywhere (as in TalkBack on Android). It would be interesting to see whether the inconvenience of multi-touch gestures is an acceptable tradeoff for speed and accuracy.

Whereas many research studies focus on gesture designs for accessible solutions on touchscreens, this paper reports positive findings on Touch-and-press interaction. The results encourage further investment in the evolution of touchscreen technology. By meeting user needs with new technological capabilities, we can deliver intuitive and inclusive designs and offer equal opportunities to people with visual disabilities.