1 Introduction

Haptic feedback is commonly used in mobile phones, but so far the feedback has required physical contact with a device or the use of wearable actuators. Ultrasound tactile actuation [7] is a new approach for providing tactile feedback. It removes the limitation of contact and creates true mid-air haptic sensations.

Parameters such as duration, rhythm, and intensity can be used to encode information into a tactile sensation. Yet, it is unclear how to combine them so that stimuli are easy to identify. Transducer arrays can rapidly update the focal point location to create movement and shapes, which opens up new possibilities as to how much and what kind of information can be encoded into haptic feedback. However, this also raises open questions about how best to design haptic cues to convey information.

To better understand the technical and human limitations of ultrasonic haptic information transfer, we conducted a study to evaluate six different haptic feedback stimuli in terms of how quickly and accurately they could be identified and how well they would work as mobile phone notifications (e.g., receiving a phone call or a notification).

2 Related Work

Mid-air haptics can be produced with a number of technologies that provide tactile feedback for touchless interaction and can render tactile sensations into interactive 3D spaces. Technologies such as pressurized air jets [16] or air vortex rings [18] can provide strong but rough feedback with some inherent time lag. Lasers [8, 11] or electric arcs [17] can be used for very precise short-range feedback.

Focused airborne acoustic air pressure produced by ultrasonic phased arrays [2, 6, 7] is particularly good at generating a range of tactile stimuli on the user’s palm or fingertips. This technology allows for fast rendering of single points or multiple simultaneous ones for patterns such as volumetric shapes [9].

These inaudible sound waves (typically 40 kHz [7] or 70 kHz [6]) can be focused into a single location in space. As the human hand cannot feel vibrations at 40 kHz, the emitted ultrasound is modulated at the focal point to a frequency of around 200 Hz. This frequency is detectable by mechanoreceptors sensitive to vibration and pressure [5]. At a focal point, the acoustic pressure becomes strong enough to slightly indent the human skin and stimulate the Pacinian corpuscles, thus generating a touch sensation.
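The relationship between the carrier and the modulation frequency can be sketched as simple amplitude modulation. This is an illustration only: the sinusoidal envelope and the function names are assumptions, and the actual drive scheme of any particular array is not described here.

```python
import math

CARRIER_HZ = 40_000    # ultrasonic carrier frequency mentioned in the text
MODULATION_HZ = 200    # modulation frequency mentioned in the text


def modulated_sample(t):
    """Amplitude-modulated carrier at time t (seconds), within [-1, 1].

    Assumption: a sinusoidal envelope scaled to the range 0..1.
    """
    envelope = 0.5 * (1.0 + math.sin(2.0 * math.pi * MODULATION_HZ * t))
    carrier = math.sin(2.0 * math.pi * CARRIER_HZ * t)
    return envelope * carrier


def carrier_cycles_per_envelope_period():
    """Number of carrier cycles that fit into one modulation period."""
    return CARRIER_HZ / MODULATION_HZ
```

Under these assumptions, each 5 ms modulation period contains 200 carrier cycles, so the skin responds to the slow envelope rather than to the carrier itself.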

For the most common hardware types, rendering multiple focal points simultaneously makes each point feel weaker, as fewer transducers are used for each point. The temporal resolution of touch perception is only a few milliseconds [10]. Hence, a single fast-moving point, rendered through spatiotemporal modulation [4], can be used to create the sensation of an entire shape being rendered at once.
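The idea of spatiotemporal modulation can be sketched as a single focal point that repeatedly traces the outline of a shape. The function below is a minimal illustration, not the rendering code of any device: the name `square_focal_point`, the default side length, and the default loop rate `loop_hz` are all hypothetical.

```python
def square_focal_point(t, side_cm=3.0, loop_hz=100.0):
    """Return the (x, y) position in cm of a focal point tracing a square.

    The square is centred on the origin. The point starts at the lower-left
    corner and runs counterclockwise in an x-right, y-up frame, completing
    loop_hz loops per second (a hypothetical rate).
    """
    half = side_cm / 2.0
    phase = (t * loop_hz) % 1.0               # fraction of the loop done, 0..1
    edge = int(phase * 4.0)                   # which of the four edges, 0..3
    along = (phase * 4.0 - edge) * side_cm    # distance travelled on that edge
    if edge == 0:
        return (-half + along, -half)         # bottom edge, moving right
    if edge == 1:
        return (half, -half + along)          # right edge, moving up
    if edge == 2:
        return (half - along, half)           # top edge, moving left
    return (-half, half - along)              # left edge, moving down
```

With a sufficiently high loop rate the individual positions are no longer resolved by touch, and the outline is felt as one shape; with a low rate the movement of the point itself becomes perceivable.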

Tactons [1], or tactile icons, have been used to communicate messages non-visually to users through ultrasonic actuation [3] and have been used with immaterial [15] and virtual screens [14].

Previous work on identifiability of mid-air haptic shapes [13] found that users are better at identifying shapes with a single focal point or shapes organized in straight lines compared to circular shapes. However, there has been little research into the range of parameters that can be used in ultrasound haptics to find out what makes the most effective tactile cues. Therefore, we designed a study to find out what parameters can make the stimuli more identifiable.

3 Methods

Sixteen voluntary participants (11 male), aged 20 to 42 years (median age 29 years, SD 6.07) with normal sense of touch by their own report, took part in the study. Seven of them had no previous experience using ultrasonic haptics and the rest had experienced it once or multiple times.

3.1 Pattern Design

Preliminary testing indicated that stimuli using variation in duration and rhythm resulted in more reliable identification of six different haptic stimuli than variations in shape. Pretesting also suggested that a 3 \(\times \) 3 cm stimulation area provided stronger and clearer stimulus perception than larger areas and was therefore selected for the experiment.

Table 1. Haptic stimuli parameters used in the experiment.

Fig. 1. Rendered shapes and repetitions. In stimuli 1 to 4, the focal point movement can be perceived, as shown by the arrow. Stimuli 5 and 6 use spatiotemporal modulation, making the entire shape concurrently perceivable.

Based on the above, the following six stimuli were designed. Each stimulus consisted of a rhythm formed by one, two, or three pulses, followed by a 300 ms break. Stimuli 1 to 3 had a 400 ms pulse duration, stimulus 4 had a single long pulse, and stimuli 5 and 6 had short, 100 ms pulses. Stimuli 5 and 6 used spatiotemporal modulation, in which the focal point was rapidly and repeatedly updated to create the sensation of a single tactile shape.

Stimulus 4 had a circular shape, while the other stimuli had a square shape. Stimulus 4 was also rendered counterclockwise, whereas the others were rendered clockwise, to see whether the change in direction could be reliably noticed. The stimuli used in this experiment are described in Table 1 and visualized in Fig. 1.
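The timing of the six stimuli can be sketched as a small schedule generator. Several details are assumptions rather than reported parameters: the break is taken to follow every pulse, the pulse length of stimulus 4 is taken as the 1100 ms mentioned in the Discussion, and the names `Stimulus`, `STIMULI`, and `schedule` are hypothetical.

```python
from dataclasses import dataclass

BREAK_MS = 300  # break length given in the text


@dataclass(frozen=True)
class Stimulus:
    pulses: int     # number of pulses in the rhythm
    pulse_ms: int   # duration of one pulse in milliseconds


STIMULI = {
    1: Stimulus(pulses=1, pulse_ms=400),
    2: Stimulus(pulses=2, pulse_ms=400),
    3: Stimulus(pulses=3, pulse_ms=400),
    4: Stimulus(pulses=1, pulse_ms=1100),  # assumption: value from Discussion
    5: Stimulus(pulses=2, pulse_ms=100),
    6: Stimulus(pulses=3, pulse_ms=100),
}


def schedule(stimulus):
    """Return (start_ms, end_ms) per pulse, assuming a break after each pulse."""
    intervals = []
    start = 0
    for _ in range(stimulus.pulses):
        intervals.append((start, start + stimulus.pulse_ms))
        start += stimulus.pulse_ms + BREAK_MS
    return intervals
```

Under these assumptions, `schedule(STIMULI[3])` places the three pulses at 0 to 400 ms, 700 to 1100 ms, and 1400 to 1800 ms, while stimulus 6 has the same rhythm with 100 ms pulses.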

3.2 Procedure

The UltraHaptics UHEV1 ultrasonic transducer array, which has 256 transducers operating at 40 kHz, was used (see Fig. 1) to deliver haptic feedback.

Participants rested the wrist of their dominant hand comfortably on a foam pad, with the palm facing down about 10 cm above the centre of the array. All participants used their right hand. They were instructed to keep their hand in the same position throughout the experiment. A keypad for input was placed near their left hand. No visual feedback was provided.

The experiment started with a written and verbal introduction and instructions, followed by a short training period in which the participants experienced each stimulus in a fixed order (1 to 6), repeated four times. They were instructed to try to remember the order number of each stimulus. In the experiment, the same stimuli were presented in random order, with each occurring three times (for a total of 18 stimuli). The participants wore headphones playing white noise to mask any audible noise from the array.

After each stimulus, a 500 ms audio sample was played as a cue for action. The participants were instructed to identify the stimulus by pressing a corresponding numeric key on the keypad in front of them as fast and accurately as possible.

Response times from the onset of the audio cue to the key press, as well as the key selected, were automatically logged in the database. Timing started after the stimulus had ended so that the stimulus duration would not affect the identification times. After an answer was given the experiment proceeded to the next stimulus and continued until all the stimuli were presented.

After the identification task, the participants were given the same stimuli again and rated them on scales of valence (−4, unpleasant to 4, pleasant) and arousal (−4, calming to 4, arousing). They also rated the functionality of the stimuli as potential mobile phone notifications and incoming call notifications on a scale from 0 = does not suit at all to 7 = suits very well. At the end of the experiment, they were asked to freely describe their perceptions of the stimuli to see whether the change in shape or drawing direction was detected.

The identification times were analyzed using one-way repeated measures analysis of variance (ANOVA). Bonferroni corrected t-tests at the .05 level were used for pairwise post hoc comparisons of the 15 pairs. The number of incorrect identifications, the valence and arousal ratings, and the stimulus suitability for mobile phone notifications ratings were first analyzed with Friedman tests and Bonferroni corrected Wilcoxon Signed Ranks tests were used for post hoc comparisons.
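The figure of 15 pairs follows from comparing every stimulus with every other one, and the Bonferroni correction scales with that count. The sketch below illustrates the arithmetic only; the function names are hypothetical, and it is an assumption that the correction was applied by multiplying p-values rather than by dividing the significance level, although the two are equivalent.

```python
from itertools import combinations


def pairwise_comparisons(n_conditions):
    """List every unordered pair of conditions, numbered from 1."""
    return list(combinations(range(1, n_conditions + 1), 2))


def bonferroni_adjust(p_value, n_comparisons):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_comparisons)


pairs = pairwise_comparisons(6)        # six stimuli give 15 pairs
per_test_alpha = 0.05 / len(pairs)     # equivalent per-comparison threshold
```

An uncorrected p-value is thus compared against 0.05 / 15, or equivalently multiplied by 15 and compared against 0.05.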

4 Results

One participant had trouble sensing any of the stimuli. Therefore, this participant's data were removed from the data set.

The ANOVA for the identification times showed a statistically significant effect of the haptic stimulus (F(5,75) = 5.824, p < 0.01). Post hoc tests showed that identification times differed statistically significantly between stimuli 1 and 3 (MD = 949.193, p = 0.019), 3 and 5 (MD = 1235.229, p = 0.003), and 3 and 6 (MD = 551.813, p = 0.029). Comparing just the duration, stimuli 2 and 3 were on average 500 ms and 700 ms quicker to identify than stimuli 5 and 6, respectively (see Fig. 2).

Fig. 2. Left: identification time distributions in ms from the start of the audio cue. Right: percentage of incorrect identifications for each stimulus.

The Friedman test for the number of incorrect identifications showed a statistically significant effect of the haptic stimulus (\(\chi^{2} = 18.673\), p = 0.02). Post hoc pairwise comparisons were not statistically significant.

Figure 2 shows the percentage of incorrect identifications made for each stimulus. Accuracy rate across all stimuli was 66%.

Three participants were able to identify each stimulus without errors. Stimulus 5 had a 118% greater error rate than stimulus 2, and stimulus 6 had a 150% greater error rate than stimulus 3; these were the pairs with the same rhythm. Stimulus 3 was the quickest and most accurate to identify. There were large differences in accuracy across the tested stimuli. When a square shape was drawn at a lower speed for one, two, or three repetitions, the accuracy was 78%. With stimulus 3 the accuracy rose to 83%, with 11 participants making no errors across the 48 identifications.
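The relative increases can be related back to absolute error rates with a short calculation. The sketch below uses only the rounded figures reported above, so the result is approximate; the function name `apply_relative_increase` is hypothetical.

```python
def apply_relative_increase(base_rate, increase_percent):
    """Return the rate obtained by raising base_rate by increase_percent."""
    return base_rate * (1.0 + increase_percent / 100.0)


# Stimulus 3 accuracy of 83% corresponds to an error rate of about 17%.
stimulus_3_error = 1.0 - 0.83

# A 150% greater error rate for stimulus 6 then gives roughly 42.5%.
stimulus_6_error = apply_relative_increase(stimulus_3_error, 150.0)
```

Taken at face value, the rounded percentages therefore imply that stimulus 6 was misidentified in a little over two fifths of the trials.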

The confusion matrix for the stimuli is shown in Fig. 3. Correct identifications are on the main diagonal and the rhythmically similar options have a solid border. 100% correct identification would give a score of 48. Stimulus 3 is the best performing cue, with 5 the worst.

Stimulus 4 was often mistaken for stimulus 1, but not the other way around. Stimuli 5 and 6 were as likely to be confused with each other as with the stimuli that had the same rhythm, stimuli 2 and 3.
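Per-stimulus accuracy and one-directional confusions of this kind can be read off a confusion matrix as sketched below. The layout follows the figure (presented stimulus on the horizontal axis, response on the vertical axis), but the three-stimulus matrix `toy` is hypothetical and does not contain values from the experiment.

```python
def per_stimulus_accuracy(matrix):
    """Accuracy per presented stimulus.

    matrix[response][presented] holds the number of times the presented
    stimulus was identified as the given response (0-based indices).
    """
    size = len(matrix)
    accuracies = []
    for presented in range(size):
        total = sum(matrix[response][presented] for response in range(size))
        accuracies.append(matrix[presented][presented] / total)
    return accuracies


def confusion_gap(matrix, a, b):
    """How many more times a was mistaken for b than b was mistaken for a."""
    return matrix[b][a] - matrix[a][b]


# Hypothetical counts for three stimuli, ten presentations each.
toy = [
    [8, 1, 4],
    [1, 9, 0],
    [1, 0, 6],
]
```

In the toy data the third stimulus is mistaken for the first four times, while the reverse happens once, which is the kind of asymmetry described for stimuli 4 and 1.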

Ratings of stimulus suitability for mobile phone notifications (Fig. 3) showed a significant effect (\(\chi^{2} = 28.114\), p < 0.001). Pairwise comparisons showed a significant difference between stimuli 3 and 5 (Z = 3.307, p < 0.01), and between stimuli 3 and 6 (Z = 3.311, p < 0.01). Stimulus 6 seems well suited for a general notification.

Fig. 3. Left: confusion matrix for the stimuli. Stimuli (horizontal axis) and what they were identified as (vertical axis). Right: stimulus rating distribution on suitability for mobile phone notifications on a scale of 0 (does not suit at all) to 7 (suits very well).

The ratings of arousal and of stimulus suitability as a notification of an incoming phone call (Fig. 3) showed a statistically significant effect of the haptic stimulus (\(\chi^{2} = 13.577\), p = 0.019, and \(\chi^{2} = 14.071\), p = 0.015, respectively). Post hoc pairwise comparisons were not statistically significant.

No statistically significant effect of stimulus was found on valence (\(\chi^{2} = 1.484\), p = 0.915). Overall, participants regarded all stimuli as quite pleasant, with an average rating of 1.41 (SD 1.83).

Participants’ descriptions of the sensations did not accurately capture the differences in shape and direction, and the shapes were often incorrectly described as, for example, lines, crosses, vortexes, or just random pokes. For the square-shaped stimuli, there were 26 answers suggesting a square shape, a line, or a cross and 34 answers suggesting an oval, a vortex, a point, or a fluttering of points. For the circular stimulus, there were 4 answers suggesting an oval, an infinity sign, or a circle and 7 answers suggesting a linear movement or a fluttering of points. This suggests that shape and direction are not good options for cue design in this case.

5 Discussion

The results of the study showed that stimulus 3 (see Table 1) was the fastest and most accurate to identify. It consisted of three 400 ms pulses forming a square pattern rendered at a slow rate. As the stimuli rendered at a lower rate were identified better than those rendered at a faster rate, it seems that duration had a greater effect than rhythm. However, as the experiment varied multiple parameters at once, the effect of a single parameter cannot be isolated.

The tested stimuli formed pairs of similarities. Stimuli 1 and 4 consisted of a single pulse, stimuli 2 and 5 had two pulses, and stimuli 3 and 6 had three pulses. In stimuli 1 to 3 the shape was drawn at a slower pace than in stimuli 5 and 6 so that the shape was completed in 400 ms.

The average identification time and the error rate of stimuli 1 to 3 decreased as the number of pulses increased. A similar trend was seen for stimuli 5 and 6, in that adding the third pulse seemed to decrease both the identification time and the number of errors. Nevertheless, there were considerably more errors for stimuli 4 to 6 than for stimuli 1 to 3. This is probably due to the very short duration of the pulse, the use of spatiotemporal modulation, or both.

The results suggest that the longer duration of the stimulus led to a quicker and more accurate identification when the rhythm was two or three repetitions.

Palovuori et al. [12] reported that a 200 ms burst of ultrasound was an ‘unmistakable’ stimulus being functionally equivalent to a physical button click. However, in our experiment shortening the duration of the stimulus to 100 ms resulted in more incorrect identifications than prolonging the duration to 400 ms.

Stimulus 4, which had the longest duration to complete one shape, had the highest number of errors and took a long time to identify. Increasing stimulus duration to 1100 ms did not make identification easier. However, it is unclear whether the shape drawn or its drawing direction affected the identification. Considering that the participants could not reliably identify any shape or direction, we assume that these had very little effect, but further study is required.

Rutten et al. reported a 44% identification rate when the participants were provided with visual representations of the stimuli [13]. Our stimuli formed pairs of similarity; thus, identification by chance is 50%. However, some stimuli were identified considerably more accurately. Our results seem to indicate that identifying any ultrasonic haptic shape is difficult, but that stimuli can be identified with the right parameters, of which rendering time is a key factor.

6 Conclusions and Future Work

The present work reported an experiment that compared six different ultrasound haptic stimuli to see how fast and accurately they could be identified. Statistically significant differences were found for identification times, number of incorrect identifications and subjective ratings. With a rhythm of three repetitions, a 400 ms pulse duration made the cue quicker and easier to identify than with a pulse duration of 100 ms.

Further, when a short pulse duration was used and the shape was drawn rapidly and repeatedly, the identification time and the number of errors increased. This could be due either to the pulse duration or to the use of spatiotemporal modulation.

Subtle changes in shape or drawing direction seem to go unnoticed by the users. Building a vocabulary based on direction or complex shapes therefore does not appear to be a viable method of information encoding. These results may help in designing easy-to-identify ultrasonic stimuli.