Novel hands-free interaction techniques based on the software switch approach for computer access with head movements

Esiyok, Cagdas; Askin, Ayhan; Tosun, Aliye; Albayrak, Sahin

doi:10.1007/s10209-020-00748-1


Long Paper
Open access
Published: 08 July 2020

Volume 20, pages 617–631, (2021)

Universal Access in the Information Society


Cagdas Esiyok ORCID: orcid.org/0000-0003-0229-0948¹,
Ayhan Askin²,
Aliye Tosun² &
…
Sahin Albayrak¹


Abstract

Head-operated computer accessibility tools (CATs) are useful for people with complete head control. For people with only reduced head control, however, computer access is very challenging, since they depend on a single head-gesture such as a head nod or a head tilt to interact with a computer. Any new interaction technique based on a single head-gesture can therefore help to develop better CATs that enhance users’ self-sufficiency and quality of life. In this study, we propose two novel interaction techniques, HeadCam and HeadGyro. Both are based on our software switch approach and can serve like traditional switches: they recognize head movements via a standard camera or the gyroscope sensor of a smartphone and translate them into virtual switch presses. A usability study with 36 participants (18 motor-impaired, 18 able-bodied) was conducted to collect both objective and subjective evaluation data. While the HeadGyro software switch exhibited slightly higher performance than HeadCam on each objective evaluation metric, HeadCam was rated better in the subjective evaluation. All participants agreed that the proposed interaction techniques are promising solutions for computer access.


1 Introduction

According to the World Report on Disability [1], published in 2011, an estimated one billion people live with some form of disability. Moreover, about 2% of the world population (between 110 and 190 million people) have severe disabilities in functioning. People with motor impairments resulting from, for example, amyotrophic lateral sclerosis, carpal tunnel syndrome, spinal cord injury or degenerative diseases require assistive technology solutions to lead a more independent life. CATs are considered among the most efficient of these solutions, since they enable hands-free computer access. They are generally based on human–computer interaction (HCI) techniques in which the mouse cursor is controlled by the user’s head, which presumes complete head control. For people with only reduced head movement (i.e., those who cannot operate the mouse cursor by moving the head), however, computer access becomes very challenging, since they have to interact with the computer through a single head-gesture such as a head nod or a head tilt.

Computers have become indispensable tools in our increasingly digitalized world. Unfortunately, most people with only minimal head movement cannot benefit from the services computers offer, since current solutions make it difficult for them to interact with a computer. The World Report on Disability [1] also reveals that 80% of people with disabilities live in low- and middle-income countries, which means that the majority of people with only minimal head movement might not be able to afford most hands-free HCI solutions [2,3,4], since these generally depend on expensive devices. Although the aim of universal access is to give everyone equal opportunity and access to a service or product, regardless of physical disability, by reducing barriers, the high cost of most current solutions creates a new, financial barrier for the majority of the target group. Furthermore, according to International Labour Organisation (ILO) statistics [5] published in 2007, an estimated 470 million of the world’s working-age people live with a disability. Although many jobs, such as software coding, depend only on computer usage, millions of working-age people with disabilities are excluded from the labor force, which increases the Gross Domestic Product (GDP) lost worldwide. They also lack a paid job that would make them feel more independent by supporting themselves financially. Any new HCI technique based on a single head-gesture can therefore play an important role in developing better CATs that enable these people to operate a computer and lead a more inclusive and barrier-free life.

In our effort to find a way for people with only reduced head control to interact with a computer through a single head-gesture, we began with a review of the current head-operated solutions, presented in the Related Works section. We noticed that the majority of interaction techniques require complete head control. In other words, only a limited number of solutions support single head-gesture access for people with reduced head movement. From our literature review, we identified two major problems of current head-operated interaction techniques that support single head-gesture access: (1) the requirement of dedicated devices, and (2) limited compatibility with switch-accessible interfaces. To overcome these problems, we employed our software switch approach, the first examples of which were presented in the study by Esiyok et al. [6].

Following the principles of the software switch approach, we propose two novel interaction techniques, HeadCam and HeadGyro. Both interaction techniques, the major problems of the current solutions, and our software switch approach are explained in detail in the Software Switches section. In a nutshell, both techniques can serve like traditional switches by recognizing head movements via a standard camera or the gyroscope sensor of a smartphone and translating them into virtual switch presses. Furthermore, they do not require a dedicated device and are compatible with most switch-accessible interfaces. As low-cost alternatives, they can replace expensive traditional head switches for computer access. They are also capable of recognizing the motion of other body parts, such as the user’s shoulder or leg, which makes them quite flexible switches. In this way, a different physical gesture can easily be targeted when the user becomes tired. Moreover, unlike physical switches, neither proposed software switch requires physical strength to be activated; HeadGyro in particular can detect even a minimal head movement and transform it into an emulated switch press.

A usability study with 36 participants (18 motor-impaired, 18 able-bodied) was conducted to collect objective and subjective evaluation data. The SITbench 1.0 [7] benchmark was employed for the objective evaluation, and a System Usability Scale (SUS) [8] questionnaire was applied for the subjective evaluation. While HeadGyro showed slightly higher performance than HeadCam on each objective evaluation metric, HeadCam was rated better than HeadGyro in the subjective evaluation. All participants agreed that the idea of controlling a computer via a single head-gesture without requiring any dedicated device sounded very promising.

Given that the majority of the current solutions require expensive dedicated devices, and that 80% of people with disabilities live in low- and middle-income countries [1], the proposed software switches are expected to have a considerable impact. Currently, they are the only options for people with reduced head control (i.e., those who have to use a switch-based system for computer access) who cannot afford a dedicated device. In addition, since many jobs, such as software coding, depend only on computer usage, any tool for computer access helps these people to participate in the labor force, which would reduce the GDP lost globally. Those who can hold a paid job will also feel more independent by supporting themselves financially. Software switches can also be employed as alternative inputs for multi-modal HCIs beyond assistive technology. Since the HeadGyro software switch is not affected by external factors like light or wind, it could also be employed for outdoor activities (e.g., operating a wheelchair).

This paper proceeds with the Related Works section to summarize the current head-operated interaction techniques for computer access. In the Software Switches section, we identify the common problems of current interaction techniques and introduce our software switch approach with two software switches called HeadGyro and HeadCam proposed within this paper. Subsequently, we evaluate both interaction techniques by presenting objective and subjective evaluation results of our usability study in the Evaluation section. Finally, we conclude and discuss our study in the Conclusion and Discussion section.

2 Related works

In this section, we review, from a broader perspective, the current head-operated HCI solutions that provide alternative means of computer access. We separate them into two main groups according to whether they support single head-gesture access.

2.1 Head-operated interaction techniques without a single head-gesture access support

Interaction techniques in this group require a complete head control ability for hands-free computer access. In principle, they translate the users’ head movements into mouse cursor movements in several ways:

One of the most popular techniques is wearing inertial sensors, such as a gyroscope or an accelerometer, on the head (via a helmet or a cap) to control a mouse pointer [9,10,11,12,13,14,15,16,17,18,19]. These inertial sensor-based systems are mostly combined with a different sensor or switch to perform mouse clicks (e.g., head movements are detected by inertial sensors to control the mouse pointer, and mouse clicks are performed with a puff switch). Another sensor-based solution, Headmaster Plus [20], which was evaluated in the work by LoPresti et al. [21], consists of ultrasonic sensors. Briefly, the user wears a headset with three ultrasonic sensors that wait for an ultrasonic signal from a stationary transmitter on the user’s computer. In this way, the ultrasonic sensors determine the orientation of the user’s head, which is converted into mouse pointer coordinates.

Using a head pointer (in principle, a head-worn stick) is another solution, which lets users control, press or touch any target [22] with the head, although this method is rarely preferred nowadays. Similarly, head-operated joysticks are alternative tools that enable users to move the mouse cursor on the screen [23].

On the other hand, a specific part of the user’s face (e.g., the tip of the nose) or the user’s whole head can be tracked by a standard camera in order to transform head movements into mouse cursor movements on a computer screen [24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44]. Mouse click tasks, such as left or right click, are generally performed with the dwelling method (i.e., the user holds the mouse cursor steady for a given amount of time to perform the click tasks) or with multi-modal approaches by means of other gestures like eye-blinks or tooth-clicks.

In addition to the above-mentioned approaches, head movements can also be followed by special camera-based systems to control a mouse cursor. In such systems, the user wears small reflective dots on his/her head/face or an infrared LED (light-emitting diode) which is placed on a helmet or a pair of glasses. These reflective dots are illuminated by an infrared or near infrared light source, and then a standard camera [45,46,47] or an infrared camera [48] tracks the position of target signals (coming from reflective dots or an infrared LED) for mouse cursor pointing. On the other hand, RGB-D cameras as new vision sensor technologies are also able to do 3D mapping of head position to control mouse pointer [49].

2.2 Head-operated interaction techniques with a single head-gesture access support

For those with only reduced head control, only a limited number of solutions support single head-gesture access. A common technique is to use a traditional button switch with a scanning interface, where a head switch is mounted close to the user’s head so that the user can hit it by tilting the head (or by any other head movement) [50, 51]. In addition to traditional hardware switches, there are just a few software-based solutions [52, 53] that perform mouse clicks with a single head-gesture. In these software-based solutions, users first navigate the mouse cursor to the desired location by vision-based head tracking, and mouse clicks are then emulated from the users’ head-gestures as an alternative to the dwelling method.

3 Software switches

This section begins with a subsection explaining how we address the identified problems of current interaction techniques by applying our software switch approach. We then introduce the common user interface of both proposed software switches. Afterward, the HeadCam and HeadGyro software switches are explained in turn.

3.1 The software switch approach

Our software switch approach has two principles: an interaction technique based on our software switch approach (1) should not require any dedicated devices, and (2) should be configurable to be compatible with switch-accessible interfaces. By following these principles, we proposed two interaction techniques within this study. Detected major problems of the single head-gesture compatible interaction techniques (in Sect. 2.2) and proposed solutions based on our software switch approach are presented below:

1. Requirement of Dedicated Devices: The majority of current solutions for computer access depend on dedicated devices, which might be hard to afford for people living in low- and middle-income countries [2,3,4]. The high cost of dedicated devices creates a new financial barrier: however efficient a new solution is, it is of little use to these people unless they can afford it. Therefore, as the first principle of our software switch approach, interaction techniques for people with reduced head control should not require any dedicated device beyond standard computer peripherals like a microphone or a camera. As the only reasonable exception, we decided to exclude smartphones from the list of dedicated devices, because the total number of smartphones (3.2 billion in 2019 [54]) has surpassed the total number of computers worldwide in recent years [55], which makes them easy to access even for people in low-income countries. Besides, unlike dedicated devices, which are produced with a specific aim, smartphones provide a variety of services to their users. To sum up, while software-based solutions [52, 53] do not require any dedicated devices, traditional button switches are dedicated devices beyond standard computer peripherals. As low-cost solutions, the HeadCam and HeadGyro software switches are based on a standard camera and the gyroscope sensor of a smartphone, respectively;
2. Compatibility with Switch-accessible Interfaces: The majority of current solutions reported in the literature are compatible only with a specific switch-accessible interface. To make this clear, the mechanism of a scanning-based interface and the standardization problem should first be understood. In principle, unlike direct selection (such as typing on a keyboard), a scanning interface highlights items one by one on the computer screen, and the user activates the switch when the desired item is highlighted. Between the switch-accessible interface and the switch, there is a switch adapter, a dedicated device that transforms switch activation signals into meaningful keyboard presses or mouse clicks. Following a switch activation, the switch adapter emulates a specific keyboard character or a mouse click event (depending on the manufacturer of the switch interface) and sends it to the computer in order to communicate with the switch-accessible interface. The main problem is that there is no commonly agreed standard for the communication between switches and switch-accessible interfaces: while some switch-accessible interfaces expect to receive a specific keyboard character like space, others expect to receive a mouse click. This standardization problem is partially solved by switch driver software, which lets users assign the specific character or mouse click that the target switch-accessible interface expects to follow a switch activation. However, such switch driver software is compatible with only a limited number of switch adapters of specific brands, which makes it a partial solution to the standardization problem. In other words, each switch adapter requires its own switch driver software.
Although current software-based solutions [52, 53], which emulate mouse clicks, support a single head-gesture and do not require any dedicated device, they are compatible only with switch-accessible interfaces that can be controlled with a mouse click as the switch input signal. To the best of our knowledge, there is no complete solution to this standardization problem in the literature. Both interaction techniques proposed in this study can be configured to generate any expected keyboard character or mouse click, which makes them compatible with most switch-accessible interfaces. They provide a better solution to the standardization problem than the current one, which requires purchasing both a switch driver and a traditional switch. In other words, they can both detect a head-gesture, like a traditional switch, and allow users to assign the expected keyboard characters or mouse clicks that will be sent to the switch-accessible interface, like a switch driver.
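
As an illustration only, the configurability described in the second principle could be modeled as a lookup table from recognized gestures to the input event that the target interface expects. The gesture names, the `SwitchBinding` type and the `translate` function are hypothetical and are not taken from HeadCam or HeadGyro; the sketch returns a description of the event instead of injecting it into the operating system.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SwitchBinding:
    """What a switch-accessible interface expects to receive (hypothetical type)."""
    kind: str   # "key" or "mouse"
    value: str  # e.g. "space", "enter", "left"


# Hypothetical user configuration: one binding per recognized gesture.
BINDINGS: Dict[str, SwitchBinding] = {
    "roll_left": SwitchBinding("key", "space"),
    "roll_right": SwitchBinding("mouse", "left"),
}


def translate(gesture: str, pressed: bool,
              bindings: Dict[str, SwitchBinding] = BINDINGS
              ) -> Optional[Tuple[str, str, str]]:
    """Map a switch press or release to the event the interface expects."""
    binding = bindings.get(gesture)
    if binding is None:
        return None  # gesture is not assigned to any switch
    return (binding.kind, binding.value, "down" if pressed else "up")
```

Changing the table is then enough to target an interface that expects a different key or a mouse click, which is the role a switch driver plays for a hardware switch.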

3.2 The user interface

We designed the interface shown in Fig. 1a, which was employed for both software switches. Gamification techniques were applied to make the software switches more engaging and fun. Fig. 1a shows the initial state of the interface, where the user has a stable head position. The interface includes three dynamic game elements: (1) the earth, (2) the left and (3) the right red border lines. All three elements are controlled by the user’s head movements, called pitch, yaw and roll (Fig. 1b). The sensitivity with which the game elements respond can be set according to the user’s head control capability: the higher the sensitivity level, the slower and smaller the head movement needed to move the game elements. The mission of the game is to save the earth from the gravity of a black hole by moving these three game elements until the earth intersects with the red border lines. Switch press and switch release are emulated according to this intersection. In other words, a switch press is emulated as soon as the earth intersects with the red border lines and lasts until the intersection ends, at which point a switch release is emulated. The intersection (i.e., switch press) is followed by visual or auditory feedback to the user. In order to calibrate the earth’s position, we simulated a gravity function that constantly pulls the earth toward the black hole. The gravity function is inactive during an intersection (i.e., switch press) and is reactivated once the intersection is over (i.e., switch release). In this way, if the user keeps the head stable for a while when there is no intersection, gravity eventually pulls the earth back to its initial position (i.e., to the center). As illustrated in Fig. 2, each of the six head-gestures (i.e., rotational movements of the head) results in a different intersection state. While pitch (Fig. 2a) and yaw (Fig. 2b) movements control the earth’s position, roll movements (Fig. 2c) control the position of the right and the left red border lines.
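
The press, release and gravity rules described above can be sketched as a small state machine. This is a simplified one-dimensional model under stated assumptions, not the authors' implementation: the class name `SwitchGame`, the numeric defaults, and the choice to move only the earth (rather than also moving the border lines on roll) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SwitchGame:
    """Toy 1-D model of the earth and one pair of border lines (hypothetical values)."""
    border: float = 1.0       # distance of each red border line from the center
    gravity: float = 0.05     # pull toward the center per update while released
    sensitivity: float = 1.0  # higher value: a smaller head movement is enough
    earth: float = 0.0        # earth position; 0.0 is the center (black hole)
    pressed: bool = False     # True while the earth intersects a border line

    def update(self, head_delta: float) -> Optional[str]:
        """Advance one frame and return 'press', 'release', or None."""
        self.earth += self.sensitivity * head_delta
        # The earth cannot pass through a border line.
        self.earth = max(-self.border, min(self.border, self.earth))
        touching = abs(self.earth) >= self.border

        event = None
        if touching and not self.pressed:
            event = "press"
        elif not touching and self.pressed:
            event = "release"
        self.pressed = touching

        if not touching:
            # Gravity acts only outside an intersection and stops at the center.
            if abs(self.earth) <= self.gravity:
                self.earth = 0.0
            elif self.earth > 0:
                self.earth -= self.gravity
            else:
                self.earth += self.gravity
        return event
```

Feeding the model two movements toward a border, a pause, and a movement back yields one press followed by one release, and a run of zero movements afterwards returns the earth to the center.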

3.3 HeadCam

HeadCam is based on a real-time video motion tracking algorithm similar to that in the study by Esiyok et al. [6]. In principle, the user’s head is tracked by a built-in camera or a standard webcam, and the roll movements of the user’s head (Fig. 2c) captured by the camera are translated into an emulated switch press. In the configuration step, before launching the HeadCam application, the user assigns the color of the tracked object through an RGB (red, green, blue) sphere with a specified radius for Euclidean color filtering. The HeadCam algorithm is listed step by step below:

1. Video frames are captured by a camera at a frame rate of 15 frames per second and a frame size of $320\times 240$ pixels (Fig. 3a);
2. Euclidean color filtering is applied to each video frame (Fig. 3b). It filters out the colors outside the RGB sphere whose center and radius were assigned in the configuration step. In other words, it keeps the pixels within the specified color sphere and fills the remaining pixels with black;
3. Following Euclidean color filtering, video frames are converted to gray-scale images (Fig. 3c);
4. All objects in the video frames are detected through the Connected Component Labeling (CCL) method, which groups together pixels belonging to the same connected component and treats each component as a separate object. Following object detection, a rectangle is drawn around the edges of each object (Fig. 3d);
5. If more than one object is detected, the largest object (i.e., the one whose rectangle has the largest area) is chosen (Fig. 3e);
6. The center point of the rectangle of the largest object is tracked in real time on the frame (Fig. 3f);
7. Every motion of the largest object (i.e., of the center point of its rectangle) is transformed into the motion of the right or left red border lines, as depicted in Fig. 2c;
8. Once the earth intersects with the red border lines, a switch press is emulated.

An image processing library called AForge.NET was employed for filtering (Euclidean color filtering) and object detection (CCL). HeadCam is compatible with Windows-based operating systems and was developed with the .NET Framework 4.5. HeadCam can easily recognize two roll movements of the user’s head (right and left head tilts), which allows it to provide two switch inputs to switch-accessible interfaces.
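
The filtering and object-detection steps of the HeadCam algorithm can be sketched in plain Python on a frame stored as nested lists of RGB tuples. This is not the AForge.NET code used in the study: the function names are hypothetical, the sketch assumes 4-connectivity for the connected components, and it produces a binary mask directly instead of a separate gray-scale conversion.

```python
from collections import deque


def euclidean_filter(frame, center, radius):
    """Binary mask: 1 where the RGB pixel lies inside the color sphere, else 0."""
    r2 = radius * radius
    return [[1 if sum((p - c) ** 2 for p, c in zip(px, center)) <= r2 else 0
             for px in row] for row in frame]


def bounding_boxes(mask):
    """Label 4-connected components; return one (x0, y0, x1, y1) box per object."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:  # breadth-first flood fill of one component
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def track_center(frame, center, radius):
    """Center of the largest bounding box, or None if no pixel matches the color."""
    boxes = bounding_boxes(euclidean_filter(frame, center, radius))
    if not boxes:
        return None
    x0, y0, x1, y1 = max(boxes, key=lambda b: (b[2] - b[0] + 1) * (b[3] - b[1] + 1))
    return ((x0 + x1) / 2, (y0 + y1) / 2)
```

Comparing the center returned for consecutive frames gives the horizontal displacement that would drive the red border lines.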

3.4 HeadGyro

The HeadGyro interaction technique employs the 3-axis gyroscope data of a smartphone placed on the user’s head to convert the rotational movements of the head into emulated switch presses. The smartphone can be placed on the user’s head in several ways. For example, the user can wear a cap to which the smartphone is attached, or a modified belt holding the smartphone, as can be seen in Fig. 4. The gyroscope is an inertial sensor mainly used to measure the angular velocity of the sensor in inertial space. In other words, it measures the rate of change of the sensor’s orientation. Today, inertial sensors like gyroscopes are based on microelectromechanical system (MEMS) technology. They are frequently employed in modern smartphones since they are small, cheap and light, and consume little power. In spite of these advantages, MEMS-based sensors can suffer from noise caused by electromagnetic interference and semiconductor thermal noise, which affects the accuracy of the measured angular velocity. To reduce this noise while meeting the real-time requirements, we chose the Kalman filter, which is frequently used for gyroscope data in the literature [56,57,58,59]. We also developed a mobile application for the Android operating system, which communicates with the computer over a wireless local area network (WLAN), to stream the 3-axis gyroscope data to the computer. As can be seen in Fig. 1, roll, pitch, and yaw movements are represented by the angular velocity around the X, Y, and Z axes of the coordinate system, respectively. The algorithm behind HeadGyro is briefly described step by step below:

1. Real-time angular velocity data from the smartphone’s 3-axis gyroscope sensor is obtained by our Android application;
2. The Android application streams this gyroscope data, which holds three angular velocity measurements, one per axis (X, Y, Z), wirelessly to the HeadGyro software switch running on the computer;
3. The Kalman filter is applied to each channel (X, Y, Z) to reduce the noise, as shown in Fig. 5;
4. HeadGyro recognizes every motion of the user’s head from the filtered angular velocity measurements of the three axes and converts these measurements into the motion of the game elements, as illustrated in Fig. 2. For example, if the angular velocity around the z-axis is positive, the earth moves to the left, while it moves to the right if the measured angular velocity is negative;
5. Once the earth intersects with the red border lines, a switch press is emulated.

For the Kalman filter, we employed the MathNet.Filtering library. Like the HeadCam software switch, HeadGyro is compatible with Windows-based operating systems and was developed with the .NET Framework 4.5. It can provide up to six switch inputs for switch-accessible interfaces, since HeadGyro can easily detect all six rotational head movements.
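
The per-channel filtering step can be illustrated with a minimal scalar Kalman filter, one instance per gyroscope axis. This is a sketch under stated assumptions rather than the MathNet.Filtering code used in the study: it assumes a constant-state model, and the variances, the gain, the time step and the sign convention (left is negative x) are hypothetical.

```python
class ScalarKalman:
    """One-dimensional Kalman filter with a constant-state model."""

    def __init__(self, process_var=1e-3, measurement_var=1e-1):
        self.q = process_var      # how fast the true rate may drift (assumed)
        self.r = measurement_var  # sensor noise variance (assumed)
        self.x = 0.0              # current estimate of the angular velocity
        self.p = 1.0              # variance of the estimate

    def update(self, z):
        """Fuse one noisy measurement z and return the new estimate."""
        self.p += self.q                # predict: uncertainty grows
        k = self.p / (self.p + self.r)  # Kalman gain
        self.x += k * (z - self.x)      # correct toward the measurement
        self.p *= 1.0 - k               # uncertainty shrinks
        return self.x


def filter_sample(filters, sample):
    """Filter one (x, y, z) gyroscope sample with one filter per channel."""
    return tuple(f.update(v) for f, v in zip(filters, sample))


def earth_shift(yaw_rate, gain=1.0, dt=0.02):
    """Positive z-axis rate moves the earth left (negative x), as in the text."""
    return -gain * yaw_rate * dt
```

A typical loop would create three `ScalarKalman` objects once, call `filter_sample` for every streamed sample, and pass the filtered z-axis value to `earth_shift`.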

4 Evaluation

A usability study was conducted to collect objective and subjective data. In this section, we first introduce the characteristics of the participants. Then, we present the apparatus used in the study. Afterward, we briefly explain the SITbench 1.0 and the procedure applied during the evaluation of HeadCam and HeadGyro. Finally, we conclude the section with our experimental findings.

4.1 Participants

Following approval by the Ethics Committee of the Izmir Katip Celebi University (Turkey) on 10.10.2018 (decision number: 332), the usability study was conducted at the Medical Faculty of the same university. All participants gave their informed consent before taking part in the study. Consent for publication of human images in this article was also obtained. A total of 36 participants (18 females, 18 males) took part in the evaluation of the proposed systems. The disability group (DG) comprised 18 participants with motor disabilities, aged between 18 and 68, while the control group (CG) comprised 18 people without disabilities (12 females, 6 males), aged between 18 and 59.

Table 1 summarizes the age statistics of all participants by group, and Table 2 summarizes their main characteristics. As an inclusion criterion, all voluntary participants in DG had several difficulties controlling their heads and thus could not operate a computer in conventional ways (i.e., with a mouse and a keyboard). They were all under medical treatment for motor disabilities at the time of the experiments. The voluntary participants of CG were generally companions of DG participants or staff working at the Physical Medicine and Rehabilitation Department. All participants met the following inclusion criteria: they were able to (1) find a target on the screen; (2) follow a moving target; (3) maintain gaze on a stable target; (4) stay focused on the tests during the experiments. All participants in DG had difficulties controlling their hands, and five participants in DG had reduced head control. We also applied the mini-mental state examination (MMSE), a 30-point questionnaire for cognitive assessment, to verify that the participants had the cognitive ability to complete our tests.

Table 1 Age statistics of the participants according to groups


Table 2 Main characteristics of the participants


4.2 Apparatus

A laptop computer (Lenovo G505S; CPU: AMD A8-4500M 1.9 GHz; RAM: 6 GB DDR3; screen: 15.6-inch LCD; OS: Windows 10 64-bit; resolution: $1600 \times 900$), an integrated camera (max digital video resolution: $1280\times 720$; image sensor type: 0.3 MP CMOS), and a smartphone with a gyroscope sensor (Sony Xperia XZ1 Compact; CPU: Qualcomm Snapdragon 835; RAM: 4 GB; OS: Android Oreo 8.0) were employed for the experiments.

4.3 The SITbench 1.0 benchmark

We used the SITbench 1.0 [7] benchmark which helps researchers to evaluate switch-based systems objectively. By means of this tool, objective evaluation data can be collected and saved automatically with standardized tests. To this end, we employed the Tie-Smiley Matching Game (TSMG) and Hungry Frog Game (HFG) tests of the SITbench 1.0.

4.3.1 TSMG

Briefly, TSMG is a switch-accessible interface based on the automatic linear scanning method, in which each smiley is highlighted in turn for a given scan time. It includes five different templates. As can be seen in Fig. 6, the scanning array of each template consists of 26 smileys in total; the count and order of red and yellow smileys differ between templates. The mission of the game is to match each smiley with a tie of the same color (i.e., red to red, yellow to yellow). To achieve this, the user makes an indirect selection by activating the switch only when the highlighted smiley is a red one. A click sound is also played as an auditory prompt once a target red smiley is highlighted. Figure 6 shows a sample view after the user completed a trial. The confusion matrix variables, namely true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN), are counted automatically. The SITbench 1.0 then calculates the performance metrics accuracy, precision, recall and false-positive rate according to the following formulas:

$$ {\text{accuracy}} = \frac{{{\text{TP}} + {\text{TN}}}}{{{\text{TP}} + {\text{TN}} + {\text{FP}} + {\text{FN}}}} $$

(1)

$$ {\text{precision}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FP}}}} $$

(2)

$$ {\text{recall}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FN}}}} $$

(3)

$$ {\text{false}}\;{\text{positive}}\;{\text{rate}} = \frac{{{\text{FP}}}}{{{\text{FP}} + {\text{TN}}}} $$

(4)
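
As a worked illustration of Eqs. (1) to (4), the following sketch tallies the confusion matrix for one scan and computes the four metrics. It is not part of the SITbench 1.0; the function names, the boolean encoding of a trial (one entry per highlighted smiley) and the choice to return NaN for an empty denominator are assumptions.

```python
def tally(targets, presses):
    """Count (TP, FP, FN, TN) for one scan.

    targets[i] is True if the i-th highlighted smiley is red;
    presses[i] is True if the switch was activated while it was highlighted.
    """
    tp = fp = fn = tn = 0
    for is_target, pressed in zip(targets, presses):
        if is_target and pressed:
            tp += 1
        elif is_target:
            fn += 1
        elif pressed:
            fp += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def _ratio(numerator, denominator):
    """Division that yields NaN instead of raising on an empty denominator."""
    return numerator / denominator if denominator else float("nan")


def metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and false-positive rate, Eqs. (1)-(4)."""
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "precision": _ratio(tp, tp + fp),
        "recall": _ratio(tp, tp + fn),
        "false_positive_rate": _ratio(fp, fp + tn),
    }
```

For example, a scan of six smileys with three red targets, of which two are selected, plus one wrong press on a yellow smiley gives TP = 2, FP = 1, FN = 1 and TN = 2.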

4.3.2 HFG

HFG is the other single-switch-accessible test of the SITbench 1.0 (Fig. 7). In a nutshell, a trial includes ten tasks, and in each task (a) the user does not move until a fly appears on the screen, (b) the user activates the switch as fast as possible once the fly has appeared, and (c) a frog eats the fly when the switch is activated. After the ten tasks of a trial are completed, the SITbench 1.0 automatically measures the following six evaluation metrics: (1) average press time of all ten tasks, i.e., the average time from when the fly appears to when the switch is pressed; (2) average release time of all ten tasks, i.e., the average time from when the switch is pressed until it is released; (3) the fastest press time within ten tasks; (4) the slowest press time within ten tasks; (5) the fastest release time within ten tasks; (6) the slowest release time within ten tasks. HFG includes five different scenarios, which differ in their waiting times (i.e., the time between when the user starts to wait for a fly and when the fly appears on the screen).
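
The six HFG metrics follow directly from three timestamps per task. The sketch below shows one way to derive them; it is not the SITbench 1.0 code, and the tuple layout (fly appearance, switch press, switch release, all in milliseconds) and the dictionary keys are assumptions.

```python
from statistics import mean


def hfg_metrics(tasks):
    """Compute the six HFG metrics for one trial.

    tasks is a list of (fly_ms, press_ms, release_ms) timestamps, one per task.
    """
    press_times = [press - fly for fly, press, _ in tasks]
    release_times = [release - press for _, press, release in tasks]
    return {
        "avg_press": mean(press_times),
        "avg_release": mean(release_times),
        "fastest_press": min(press_times),
        "slowest_press": max(press_times),
        "fastest_release": min(release_times),
        "slowest_release": max(release_times),
    }
```

Note that the fastest time is the minimum and the slowest time is the maximum of the measured durations.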

4.4 The SUS questionnaire

The SUS questionnaire [8], which is an industry standard, consists of ten statements rated on a five-point Likert scale, as can be seen in Table 3. Scale values range from 1 to 5 (1 = strongly disagree, 2 = disagree, 3 = neither agree nor disagree, 4 = agree, 5 = strongly agree). A SUS score (ranging from 0 to 100) is calculated from the scale values of the statements as follows: (1) the score contributions of the statements are summed, where the score contribution is the scale value minus 1 for statements 1, 3, 5, 7, and 9, and 5 minus the scale value for statements 2, 4, 6, 8, and 10; (2) the sum of the score contributions is multiplied by 2.5 to obtain the SUS score.
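The scoring rule described above can be written compactly as the following sketch. The function name `sus_score` and the input convention (a list of ten scale values in statement order) are assumptions made for illustration.

```python
def sus_score(scale_values):
    """Return the SUS score (0-100) for ten scale values in statement order."""
    if len(scale_values) != 10:
        raise ValueError("SUS requires exactly ten scale values")
    if any(value not in (1, 2, 3, 4, 5) for value in scale_values):
        raise ValueError("each scale value must be an integer from 1 to 5")
    total = 0
    for statement, value in enumerate(scale_values, start=1):
        if statement % 2 == 1:
            # Statements 1, 3, 5, 7, 9: scale value minus 1.
            total += value - 1
        else:
            # Statements 2, 4, 6, 8, 10: 5 minus the scale value.
            total += 5 - value
    return total * 2.5
```

For example, a respondent who strongly agrees with every odd-numbered statement and strongly disagrees with every even-numbered one contributes 4 points per statement, giving 40 × 2.5 = 100, whereas a neutral answer (3) to every statement gives 20 × 2.5 = 50.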

4.5 Procedure

At the beginning, the participants were informed about the test verbally. Then, we ensured that the participants and devices were positioned properly. After proper positioning, we let them practice the tests (in a counterbalanced order) under our guidance until they felt confident to start. Afterwards, we applied two tests of the SITbench 1.0 to collect objective data: (1) TSMG: each software switch was tested by each participant ($n=36$) with the first three templates of TSMG, where the scan time was 1000 milliseconds; (2) HFG: each software switch was tested by each participant ($n=36$) with the first three scenarios of HFG.

We applied the tests in a counterbalanced order to avoid learning and repetition effects. To prevent mental or physical fatigue, we allowed the participants to rest for up to 5 min between the experiments. For each participant, it took 15–30 min to complete the experiments, including breaks. We did not observe any fatigue at any stage of the experiments. At the end of the SITbench 1.0 experiments, we also administered the SUS questionnaire to the participants for quantitative subjective evaluation. In addition, we collected qualitative subjective data via our observations and the participants’ responses to open-ended questions about the two software switches proposed within this study.

Table 3 SUS questionnaire statements with average scale values through participant groups

4.6 Objective data based results

As can be seen in Fig. 8, according to the results of the TSMG experiments, HeadGyro demonstrated slightly better performance than HeadCam in all performance evaluation metrics (accuracy, precision, recall, and false-positive rate). In terms of accuracy, the mean value of HeadGyro ($m=0.938$) was greater than that of HeadCam ($m=0.904$), and the difference between the means was statistically significant (p < 0.05) according to Student’s t-test. For precision, HeadGyro ($m=0.921$) exhibited better performance than HeadCam ($m=0.872$), and there was a significant difference between the means (p < 0.05). Regarding recall, HeadGyro ($m=0.910$) again outperformed HeadCam ($m=0.863$), with a significant difference between the means (p < 0.05). For false-positive rate, where lower values are better, HeadGyro ($m=0.048$) outperformed HeadCam ($m=0.077$), and the difference between the means was significant (p < 0.05).

Figure 9 presents the mean values of each software switch for TSMG depending on the participant groups (Mix, DG, CG). CG members performed better than DG members with both software switches according to the mean values of the accuracy, precision, and recall metrics. For false-positive rate, DG had higher values than CG with both software switches, which means that DG members made false selections more frequently than CG members. Student’s t-tests were applied for both interaction techniques across all evaluation metrics to check whether there is a significant difference between the performance of DG members and CG members. The difference between the means of DG and CG was not significant for any metric.

Likewise, HeadGyro showed better performance than HeadCam for all HFG evaluation metrics (average press time, average release time, the fastest press time, the slowest press time, the fastest release time, and the slowest release time), as shown in Fig. 10. The mean values and p-values of both interaction techniques in the HFG experiments are presented in Table 4. According to the p-values from the Student’s t-tests over all participants, there is a statistically significant difference between the means of HeadGyro and HeadCam for all evaluation metrics.

Table 4 Mean values of HeadGyro and HeadCam through HFG evaluation metrics (average press time, the fastest press time, the slowest press time, average release time, the fastest release time, and the slowest release time) for all participants

4.7 Subjective data based results

The results of the SUS questionnaire, as quantitative subjective data, are listed in Table 3. The average scale values acquired from all participants are reported for HeadGyro and HeadCam across the participant groups (mix, DG, and CG). For the mix group, the average SUS scores are 85.0 and 87.9 for HeadGyro and HeadCam, respectively. In DG, the average SUS score is 85.3 for HeadGyro, while it is 87.8 for HeadCam. In CG, the average SUS score is 84.7 for HeadGyro and 88.0 for HeadCam. According to the SUS adjective rating scale [60], all SUS scores can be considered excellent. After the experiments, all participants agreed that both proposed interaction techniques are promising solutions for computer access tasks. They also declared that they were looking forward to using both software switches to control a computer. Regarding the experiments with the SITbench 1.0, five participants stated that they would perform better if the scanning speed of the TSMG test were set to a slower value, while four participants suggested increasing the size of the smileys. All participants were pleased with the visual and auditory feedback provided during the tests once the switch is activated or the target appears. While 31 of the participants declared that they would prefer to use HeadCam for computer access, 5 of them chose HeadGyro as their favorite software switch. They all agreed that gamification techniques made the software switches more engaging. None of the participants experienced any fatigue during the tests.

5 Conclusion and discussion

Hands-free computer access via head movements is already challenging in comparison to conventional ways, and it becomes even more challenging for people with limited head control, since these users are obliged to interact with a computer through a single head-gesture such as a head nod or a head tilt. Moreover, the high cost of the dedicated devices employed by the majority of current head-operated HCI solutions creates a further barrier, although the aim of universal access is to break down barriers and enable equal opportunity and access for people with disabilities.

Alternative computer access methods can provide several useful services for people with motor disabilities in every part of life, such as communication and education. Any new interaction technique enabling computer access with minimal head movements will help to enhance the quality of life and the self-sufficiency of people who have only reduced head control. Therefore, we proposed two novel interaction techniques, namely HeadGyro and HeadCam, which depend on the gyroscope sensor of a smartphone and a standard camera, respectively. Both interaction techniques are based on our software switch approach, which addresses the following problems of current single head-gesture based interaction techniques: (1) the requirement of dedicated devices and (2) compatibility with switch-accessible interfaces. In accordance with the two principles of our software switch approach, the HeadGyro and HeadCam software switches (1) do not require any dedicated devices and (2) are configurable to be compatible with switch-accessible interfaces. In a nutshell, both software switches can serve like traditional switches by recognizing head movements via a standard camera or a gyroscope sensor of a smartphone and transforming them into virtual switch presses.

According to the evaluation data of the usability study conducted with 36 participants (18 motor-impaired, 18 able-bodied), HeadGyro showed slightly better performance than HeadCam in objective evaluation, while HeadCam was rated better than HeadGyro in subjective evaluation. Furthermore, 31 of the participants declared that they would prefer to use HeadCam for computer access, while 5 of them selected HeadGyro. Based on our observations, head control ability is the key factor behind this preference: (1) those who have complete head control (31 participants) preferred HeadCam, while those with reduced head control (5 participants) preferred HeadGyro since it is more sensitive and thus capable of recognizing tiny head movements; (2) those with complete head control can easily activate the software switch via a standard camera. As expected, the participants found wearing a smartphone on the head unnecessary as long as their head control capability remains unimpaired or their head movements can be detected by HeadCam. However, HeadGyro can be advantageous if (a) the users cannot move their head enough to be recognized by a camera, or (b) external factors (e.g., low/high light or any moving object behind the user) cannot be tolerated by camera-based tracking. As can be concluded from the results of the objective evaluation, HeadGyro is more sensitive than HeadCam.

Both software switches can serve as the only low-cost options for people with limited head control who cannot afford systems that depend on high-cost dedicated devices. Beyond head motions, the proposed software switches are flexible enough to recognize other body motions and transform them into emulated switch presses. This flexibility also permits the user to change the targeted body motion once the user becomes tired. In addition, the proposed software switches can be employed by multi-modal systems as new input techniques beyond the assistive technology area (e.g., as a new input for a computer video game). As another application domain, the HeadGyro software switch might be preferred during outdoor activities, since it is quite robust against external factors like low light, high noise, and air conditions. As future work, any other physical gesture that is well controlled by the user can be targeted to evaluate the efficiency and usability of the proposed interaction techniques. Both software switches can also be employed by a single-switch accessible CAT to assess their performance in a real-life scenario.

References

World report on disability. https://web.archive.org/web/20191110112641/https://www.who.int/disabilities/world_report/2011/report.pdf. Accessed 24 Nov 2019
Li, W., Sellers, C.: Improving assistive technology economics for people with disabilities: harnessing the voluntary and education sectors. In: 2009 IEEE Toronto International Conference Science and Technology for Humanity (TIC-STH), pp. 789–794. IEEE (2009)
Borg, J., Östergren, P.O.: Users’ perspectives on the provision of assistive technologies in bangladesh: awareness, providers, costs and barriers. Disabil. Rehabilit. Assist. Technol. 10(4), 301–308 (2015)
de Witte, L., Steel, E., Gupta, S., Ramos, V.D., Roentgen, U.: Assistive technology provision: towards an international framework for assuring availability and accessibility of affordable high-quality assistive technology. Disabil. Rehabil. Assist. Technol. 13(5), 467–472 (2018)
International labour organization. https://web.archive.org/web/20180425014312/http://www.ilo.org/wcmsp5/groups/public/---dgreports/---dcomm/documents/publication/wcms_087707.pdf. Accessed 24 Nov 2019
Esiyok, C., Askin, A., Tosun, A., Albayrak, S.: Software switches: novel hands-free interaction techniques for quadriplegics based on respiration-machine interaction. Univ. Access Inf. Soc. (2019). https://doi.org/10.1007/s10209-019-00645-2
Esiyok, C., Albayrak, S.: Sitbench 1.0: a novel switch-based interaction technique benchmark. J. Healthcare Eng. (2019). https://doi.org/10.1155/2019/5075163
Brooke, J., et al.: Sus-a quick and dirty usability scale. Usabil. Eval. Ind. 189(194), 4–7 (1996)
Wook Kim, Y., Hyun Cho, J.: A novel development of head-set type computer mouse using gyro sensors for the handicapped. In: 2nd Annual International IEEE-EMBS Special Topic Conference on Microtechnologies in Medicine and Biology. Proceedings (Cat. No. 02EX578), pp. 356–359. IEEE (2002)
Antunes, R.A., Palma, L.B., Duarte-Ramos, H., Gil, P.: Intelligent hci device for assistive technology. In: Doctoral Conference on Computing, Electrical and Industrial Systems, pp. 157–168. Springer (2019)
Sancha-Ros, S., García-Garaluz, E.: Computer access and alternative and augmentative communication (aac) for people with disabilities: a multi-modal hardware and software solution. In: International Work-Conference on Artificial Neural Networks, pp. 605–610. Springer (2015)
Quha zono. https://web.archive.org/web/20200707154711/https://www.quha.com/products-2/zono/. Accessed 7 July 2020
Rosas-Cholula, G., Ramirez-Cortes, J., Alarcon-Aquino, V., Gomez-Gil, P., Rangel-Magdaleno, J., Reyes-Garcia, C.: Gyroscope-driven mouse pointer with an emotiv® eeg headset and data analysis based on empirical mode decomposition. Sensors 13(8), 10561–10583 (2013)
Gerdtman, C., Bäcklund, Y., Lindén, M.: A gyro sensor based computer mouse with a usb interface: a technical aid for motor-disabled people. Technol. Disabil. 24(2), 117–127 (2012)
Zhang, T., Li, L., Yan, H.: The hci method for upper limb disabilities based on emg and gyros. In: 2014 IEEE 13th International Workshop on Advanced Motion Control (AMC), pp. 434–439. IEEE (2014)
Castillo, A., Cortez, G., Diaz, D., Espíritu, R., Ilisastigui, K., O’Bard, B., George, K.: Hands free mouse. In: 2016 IEEE 13th International Conference on Wearable and Implantable Body Sensor Networks (BSN), pp. 109–114. IEEE (2016)
Sim, N., Gavriel, C., Abbott, W.W., Faisal, A.A.: The head mouse—head gaze estimation in-the-wild with low-cost inertial sensors for bmi use. In: 2013 6th International IEEE/EMBS Conference on Neural Engineering (NER), pp. 735–738. IEEE (2013)
Raya, R., Roa, J., Rocon, E., Ceres, R., Pons, J.L.: Wearable inertial mouse for children with physical and cognitive impairments. Sens. Actuators A Phys. 162(2), 248–259 (2010)
Blackmon, F.R., Weeks, M.: Target acquisition by a hands-free wireless tilt mouse. In: 2009 IEEE International Conference on Systems, Man and Cybernetics, pp. 33–38. IEEE (2009)
Headmaster plus. http://web.archive.org/web/20191125001741/https://abledata.acl.gov/product/headmaster-plus-model-hm-2p/. Accessed 11 Dec 2019
LoPresti, E.F., Brienza, D.M., Angelo, J.: Head-operated computer controls: effect of control method on performance for subjects with and without disability. Interact. Comput. 14(4), 359–377 (2002)
Andres, R.O., Hartung, K.J.: Prediction of head movement time using fitts’ law. Hum. Factors 31(6), 703–713 (1989)
Jouse2. https://web.archive.org/web/20191112144721/http://teknyka.com/Jouse2/. Accessed 24 Nov 2019
Kjeldsen, R.: Improvements in vision-based pointer control. In: Proceedings of the 8th International ACM SIGACCESS Conference on Computers and Accessibility, pp. 189–196. ACM (2006)
Tu, J., Huang, T., Tao, H.: Face as mouse through visual face tracking. In: The 2nd Canadian Conference on Computer and Robot Vision (CRV’05), pp. 339–346. IEEE (2005)
Javanovic, R., MacKenzie, I.S.: Markermouse: mouse cursor control using a head-mounted marker. In: International Conference on Computers for Handicapped Persons, pp. 49–56. Springer (2010)
Toyama, K.: Look, ma-no hands! hands-free cursor control with real-time 3d face tracking. PUI98 (1998)
Gorodnichy, D.O., Malik, S., Roth, G.: Nouse -use your nose as a mouse- a new technology for hands-free games and interfaces. In: Proceedings of Internationl Conference on Vision Interface (VI’2002), pp. 354–361 (2002)
Morris, T., Chauhan, V.: Facial feature tracking for cursor control. J. Netw. Comput. Appl. 29(1), 62–80 (2006)
Kumar, R., Kumar, A.: Black pearl: an alternative for mouse and keyboard. J. Graph. Vis. Image Process. 8, 1–6 (2008)
El-Afifi, L., Karaki, M., Korban, J., al Alaoui, M.A.: Hands-free interface-a fast and accurate tracking procedure for real time human computer interaction. In: Proceedings of the Fourth IEEE International Symposium on Signal Processing and Information Technology, pp. 517–520. IEEE (2004)
Pallejà, T., Soler, E.R., Teixidó, M., Tresanchez, M., Del Viso, A.F., Sánchez, C.R., Palacin, J.: Using the optical flow to implement a relative virtual mouse controlled by head movements. J. UCS 14(19), 3127–3141 (2008)
Su, M.C., Su, S.Y., Chen, G.D.: A low-cost vision-based human-computer interface for people with severe disabilities. Biomed. Eng. Appl. Basis Commun. 17(06), 284–292 (2005)
Lin, Y.P., Chao, Y.P., Lin, C.C., Chen, J.H.: Webcam mouse using face and eye tracking in various illumination environments. In: 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference, pp. 3738–3741. IEEE (2006)
Varona, J., Manresa-Yee, C., Perales, F.J.: Hands-free vision-based interface for computer accessibility. J. Netw. Comput. Appl. 31(4), 357–374 (2008)
Betke, M., Gips, J., Fleming, P.: The camera mouse: visual tracking of body features to provide computer access for people with severe disabilities. IEEE Trans. Neural Syst. Rehabilit. Eng. 10(1), 1–10 (2002)
Song, Y., Luo, Y., Lin, J.: Detection of movements of head and mouth to provide computer access for disabled. In: 2011 International Conference on Technologies and Applications of Artificial Intelligence, pp. 223–226. IEEE (2011)
Perini, E., Soria, S., Prati, A., Cucchiara, R.: Facemouse: A human-computer interface for tetraplegic people. In: European Conference on Computer Vision, pp. 99–108. Springer (2006)
Alhamzawi, H.A.: Control mouse cursor by head movement: development and implementation. Appl. Med. Inform. 40 (2018)
Kim, H., Ryu, D.: Computer control by tracking head movements for the disabled. In: International Conference on Computers for Handicapped Persons, pp. 709–715. Springer (2006)
Strumiłło, P., Pajor, T.: A vision-based head movement tracking system for human-computer interfacing. In: 2012 Joint Conference New Trends In Audio & Video and Signal Processing: Algorithms, Architectures, Arrangements and Applications (NTAV/SPA), pp. 143–147. IEEE (2012)
Kim, S., Park, M., Anumas, S., Yoo, J.: Head mouse system based on gyro-and opto-sensors. In: 2010 3rd International Conference on Biomedical Engineering and Informatics, vol. 4, pp. 1503–1506. IEEE (2010)
Manresa-Yee, C., Varona, J., Perales, F.J.: Towards hands-free interfaces based on real-time robust facial gesture recognition. In: International Conference on Articulated Motion and Deformable Objects, pp. 504–513. Springer (2006)
Loewenich, F., Maire, F.: Hands-free mouse-pointer manipulation using motion-tracking and speech recognition. In: Proceedings of the 19th Australasian Conference on Computer-Human Interaction: Entertaining User Interfaces, pp. 295–302. ACM (2007)
Headmouse nano. https://web.archive.org/web/20191006105058/https://www.orin.com/access/headmouse/. Accessed 24 Nov 2019
Walsh, E., Daems, W., Steckel, J.: An optical head-pose tracking sensor for pointing devices using ir-led based markers and a low-cost camera. In: 2015 IEEE Sensors, pp. 1–4. IEEE (2015)
Trackerpro. https://web.archive.org/web/20160731023751/https://www.ablenetinc.com/trackerpro. Accessed 24 Nov 2019
Smartnav. https://web.archive.org/web/20190607100209/http://www.naturalpoint.com/smartnav/. Accessed 24 Nov 2019
Li, S., Ngan, K.N., Sheng, L.: A head pose tracking system using rgb-d camera. In: International Conference on Computer Vision Systems, pp. 153–162. Springer (2013)
Head switch. http://web.archive.org/web/20191125002431/https://enablingdevices.com/product/head-switch/. Accessed 11 Dec 2019
Easy flex dual ultimate switch. http://web.archive.org/web/20191114040944/https://enablingdevices.com/product/easy-flex-dual-ultimate-switch/. Accessed 24 Nov 2019
Gorodnichy, D., Dubrofsky, E., Mohammad, A.: Working with a computer hands-free using the nouse perceptual vision interface. In: International Workshop on Video Processing and Recognition (VideoRec’07) (2007)
Fu, Y., Huang, T.S.: Hmouse: head tracking driven virtual computer mouse. In: 2007 IEEE Workshop on Applications of Computer Vision (WACV’07), Austin, TX, p. 30. IEEE (2007). https://doi.org/10.1109/WACV.2007.29
Statista: Smartphone users worldwide 2016-2021. https://web.archive.org/web/20191120021523/https://www.statista.com/statistics/330695/number-of-smartphone-users-worldwide/. Accessed 24 Nov 2019
Bröhl, C., Rasche, P., Jablonski, J., Theis, S., Wille, M., Mertens, A.: Desktop pc, tablet pc, or smartphone? an analysis of use preferences in daily activities for different technology generations of a worldwide sample. In: International Conference on Human Aspects of IT for the Aged Population, pp. 3–20. Springer (2018)
Cao, H., Lv, H., Sun, Q.: Model design based on mems gyroscope random error. In: 2015 IEEE International Conference on Information and Automation, pp. 2176–2181. IEEE (2015)
Ruan, X.g., Yu, M.m.: Modeling research of mems gyro drift based on kalman filter. In: The 26th Chinese Control and Decision Conference (2014 CCDC), pp. 2949–2952. IEEE (2014)
Kownacki, C.: Optimization approach to adapt kalman filters for the real-time application of accelerometer and gyroscope signals’ filtering. Digital Signal Process. 21(1), 131–140 (2011)
Xue, L., Jiang, C.Y., Chang, H.L., Yang, Y., Qin, W., Yuan, W.Z.: A novel kalman filter for combining outputs of mems gyroscope array. Measurement 45(4), 745–754 (2012)
Bangor, A., Kortum, P., Miller, J.: Determining what individual sus scores mean: adding an adjective rating scale. J. Usabil. Stud. 4(3), 114–123 (2009)

Acknowledgements

Open Access funding provided by Projekt DEAL. We would like to thank Dr. Brijnesh Jain and Dr. Fikret Sivrikaya for their valuable suggestions. The authors are grateful to the participants for their valuable time. The first author holds the Ministry of National Education Scholarship of the Turkish Republic.

Author information

Authors and Affiliations

Distributed Artificial Intelligence Laboratory, Technische Universität Berlin, Berlin, Germany
Cagdas Esiyok & Sahin Albayrak
Physical Medicine and Rehabilitation Department, Izmir Katip Celebi University, Izmir, Turkey
Ayhan Askin & Aliye Tosun

Corresponding author

Correspondence to Cagdas Esiyok.

Ethics declarations

Conflicts of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Esiyok, C., Askin, A., Tosun, A. et al. Novel hands-free interaction techniques based on the software switch approach for computer access with head movements. Univ Access Inf Soc 20, 617–631 (2021). https://doi.org/10.1007/s10209-020-00748-1

Published: 08 July 2020
Issue Date: August 2021
DOI: https://doi.org/10.1007/s10209-020-00748-1
