1 Introduction

There are several ways to hold a tablet PC: (1) resting it on the palm, (2) grasping one end of it, (3) supporting it with the arm, and (4) grasping both ends of it. For a tablet PC with a display of over 12 in., (1) is dangerous. (2) is possible, but since the wrist gets tired, it is limited to a very short time. (3) is feasible. However, Maniwa et al. [1] reported that fatigue of the arm is proportional to the screen size, based on an experiment in which characters were entered for 5 min with the right hand on a tablet PC held with the left hand. From this report, it is expected that the arm becomes fatigued quickly.

Grasping both ends of a tablet PC is suitable for holding a large tablet for a long time. However, in this holding style it is impossible to touch the center area of the screen. This causes a serious problem for character input with a software keyboard. To solve this problem, we have developed a character input system in which characters and control codes are entered by gestures made with the thumbs.

2 Related Studies

Several software keyboards have been developed that use the edge region of the screen. Flick keyboards and multi-tap keyboards are placed on the lower half of the screen on smartphones, but on tablet PCs they are placed at the left or right end of the screen. With the flick keyboards [2, 3] included with Windows or iOS, keys can also be displayed on both sides.

A.I.type tablet keyboard [4] and iPad keyboard [5] support the option of displaying a QWERTY keyboard split into two parts at both ends of the screen. Using this option, you can touch any key without taking your hands off the tablet. However, to touch the keys placed at the center of the QWERTY keyboard, you need to extend the thumb to the maximum. For this reason, the QAZ keyboard [19], a vertical QWERTY keyboard rotated 90° from the usual orientation, has also been proposed. In the arc mode of the Microsoft Word Flow Keyboard [6], a keyboard curved in an arc shape is displayed on the screen so that you can touch each key by swinging and bending or extending the thumb.

KALQ keyboard [16] is a split keyboard with a key arrangement different from QWERTY. Since the number of keys allocated to each side is almost equal to that of the flick keyboard, it can be operated while gripping a tablet. A screen keyboard simulating the tagtype keyboard [7, 8] was also developed. With this keyboard, you input a character by touching two keys at the same time, one on the right side and the other on the left side. These keyboards can be operated with the thumbs while grasping both edges of a tablet.

On all the keyboards described above, 10 or more keys must be assigned to the area reachable by the thumb. Since each key must be small, touch-typing is difficult. The viewing angle of the diagonal of a 13 in. screen is about 37° when the distance from the eye to the screen is 50 cm. On the other hand, the central visual field of a human is 20°. This means that the entire screen of a tablet PC cannot be kept inside the central visual field at once. Therefore, unlike on smartphones, it is necessary to input characters while alternately looking at the fingertip and the entered characters. This often causes eyestrain.
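The viewing angle quoted above can be checked with basic trigonometry. The following is a minimal sketch, assuming the eye lies on the line perpendicular to the center of the screen, so that a diagonal of length d seen from distance D subtends an angle of 2·atan(d / 2D). The function name is illustrative and not part of the system described in this paper.

```python
import math

INCH_TO_CM = 2.54

def viewing_angle_deg(diagonal_in: float, distance_cm: float) -> float:
    """Angle subtended by the screen diagonal, in degrees.

    Assumption: the eye is on the perpendicular through the screen center.
    """
    half_diagonal_cm = diagonal_in * INCH_TO_CM / 2.0
    return math.degrees(2.0 * math.atan(half_diagonal_cm / distance_cm))

# 13 in. diagonal viewed from 50 cm, as in the text.
angle = viewing_angle_deg(13.0, 50.0)
print(f"{angle:.1f} degrees")
```

Under this assumption the result rounds to 37°, which is well above the 20° central visual field.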

Stroke gestures do not require following the position of the fingertips with the eyes. In the Minimal Device Independent Text Input Method (MDITIM) [9], a character is defined by a combination of horizontal and vertical strokes. For this reason, you can enter characters without looking at your fingertips. In LURD-Writer [10] and H4-Writer [11], characters are defined as a sequence of four regions (top, bottom, right, and left) through which the fingertip passes. The four regions are arranged around the square field where the gestures are performed. In these methods, since the region is selected by the position of the fingertip, the holding position of the tablet PC must be fixed to avoid an increase in erroneous operations. Fast character input is possible after the sequences have been learned. However, since the sequences are defined with priority on speed, long training is required to learn them.

Each gesture of EdgeWrite [12, 13] is defined as a sequence similar to the strokes of a character. However, each stroke of the sequence must connect two of the four corners of a square input area. In addition, the sequence must be entered continuously. Therefore, the shape of the sequence differs from the shape of the original character and is unnatural. A stroke can also be specified by its moving direction. However, since 8 directions must be distinguished, accurate motion of the fingertip is required. For Japanese character input, the user must learn all the sequences assigned to the 46 Katakana characters [18].

Some eyes-free input methods have been developed for smartphones. For example, with Fukatsu’s No-look Flick [14], you can enter a Hiragana character with a pair of flick actions, each of which is performed in one of three areas. When you grasp a smartphone in the normal way, you can touch anywhere on the screen with your thumb. Since the edge of the screen can be recognized by the feel of the hand, you can touch within the required area. This method can be applied to a tablet PC. However, an additional mechanism is then necessary that either restricts the holding position or shifts the touch detection area according to the holding position. This is because, on a tablet PC grasped with both hands, the touchable area is a limited part of the screen and moves depending on where the tablet is grasped.

Aoki’s Drag&Flick [15] does not use position, so the additional mechanism is not required. However, since the drag selects one of eight directions, accurate motion is required. Since the flick is performed consecutively from the position where the drag ended, the moving distance becomes longer if their directions are the same. Conversely, in order to execute the motion within the reach of the thumb, each stroke of the motion must be shortened. As a result, input errors increase owing to misrecognized directions and extra or missing strokes.

3 Proposed Method

3.1 User Interface

In our system, a user inputs characters by stroke gestures made with the thumbs while grasping both the left and right ends of a large tablet PC. Japanese and English characters are entered with the dominant hand. The area for characters is placed at the right end by default. It is the inside of the vertically elongated rectangle drawn with the white line in Fig. 1. Since the position at which the tablet PC is gripped varies from person to person, the input area is made large in the vertical direction. In the horizontal direction, it covers the area that the tip of the thumb can reach while grasping the tablet. The area is transparent except for the borderline so as not to hide the display behind it. The borderline can also be made invisible as an option. The inside of the rectangle with a black edge at the left end is the area for control characters, where control codes for text editing are entered. The two input areas can be interchanged by modifying the system setting table.

Fig. 1. User interface of the system. The inside of the white rectangular frame at the right edge is the area for entering characters. The black frame at the left end is for control codes. The white squares at the upper right and the upper left are the input guides.

The icons behind the input areas are visible, but they cannot be selected. This is because every operation in the input areas is recognized as input to the character input system. To improve usability, the input areas automatically shrink if nothing is entered for a fixed number of seconds. When a fingertip touches a shrunk area, it immediately returns to its original size. Accordingly, character input can be started from the shrunk area in the same way as from the original input area. In addition, a shrunk area can be temporarily hidden by double-tapping it. This state is canceled by tapping the shrunk area on the opposite side, whereupon the input areas return to the normal size. With these functions, any position on the display can be touched while the character input system is running.
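The shrinking and hiding behavior described above can be read as a small state machine. The sketch below is one possible reading, not the actual implementation: the state names and event names are hypothetical, and the events are assumed to be already classified (for example, how a double tap is distinguished from a single touch is not modeled).

```python
from enum import Enum, auto

class AreaState(Enum):
    NORMAL = auto()   # full-size input area
    SHRUNK = auto()   # shrunk after a period without input
    HIDDEN = auto()   # temporarily hidden by a double tap

# (current state, event) -> next state.
# Event names are hypothetical labels for the behavior in the text.
TRANSITIONS = {
    (AreaState.NORMAL, "idle_timeout"): AreaState.SHRUNK,
    (AreaState.SHRUNK, "touch"): AreaState.NORMAL,
    (AreaState.SHRUNK, "double_tap"): AreaState.HIDDEN,
    (AreaState.HIDDEN, "opposite_tap"): AreaState.NORMAL,
}

def next_state(state: AreaState, event: str) -> AreaState:
    """Return the next state; unlisted pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)

# Walk through one cycle: shrink, hide, then restore from the other side.
state = AreaState.NORMAL
for event in ("idle_timeout", "double_tap", "opposite_tap"):
    state = next_state(state, event)
    print(event, "->", state.name)
```

Keeping the transitions in a table makes it easy to see that a hidden area ignores direct touches, which is what allows the icons behind it to be selected.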

The white rectangles at the top of each input area are the input guides. They are provided to support beginners and to indicate the current stage of character input. At each stage of the input operation, each guide displays the characters or codes that can be selected by moving the fingertip in each direction. The details are described in Sect. 3.4.

3.2 Design of Gestures

As shown in Fig. 2(a), one stroke is defined as a motion over a certain distance in one of four directions: up, down, left, or right. Since a stroke is recognized only by its moving direction, it is not affected by the position of the first touch. A gesture is defined by connecting strokes; however, connecting two strokes in the same direction is prohibited. Therefore, a movement of more than the threshold distance in one direction is recognized as a single stroke regardless of how far the fingertip moves. For this reason, it is not necessary to check the position or moving distance of the fingertip visually. In addition, each gesture is designed to stay within the range of one stroke vertically and horizontally from its start position, so that all gestures are completed within the area that the tip of the thumb reaches easily.
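The stroke definition above can be illustrated with a small recognizer. This is a minimal sketch under our own assumptions, not the recognizer used in the system: touch samples are taken to be (x, y) pixel coordinates with y increasing downward, a stroke is reported when the displacement from an anchor point reaches the threshold, and the anchor is then moved to the current point. The default threshold of 30 pixels is the value reported in Sect. 4.1.

```python
def recognize_strokes(points, threshold=30.0):
    """Convert a touch trace into a list of stroke directions.

    points: sequence of (x, y) samples, y growing downward (assumption).
    Repeated detections in the same direction are merged, so a long
    movement in one direction still counts as a single stroke.
    """
    strokes = []
    if not points:
        return strokes
    ax, ay = points[0]                      # anchor of the current stroke
    for x, y in points[1:]:
        dx, dy = x - ax, y - ay
        if max(abs(dx), abs(dy)) < threshold:
            continue                        # not far enough yet
        if abs(dx) >= abs(dy):
            direction = "right" if dx > 0 else "left"
        else:
            direction = "down" if dy > 0 else "up"
        if not strokes or strokes[-1] != direction:
            strokes.append(direction)       # same direction is not repeated
        ax, ay = x, y                       # restart from the current point
    return strokes

# A thumb moving 100 px up and then 40 px to the left, sampled every 10 px.
trace = [(100, 200 - 10 * i) for i in range(11)]
trace += [(100 - 10 * i, 100) for i in range(1, 5)]
print(recognize_strokes(trace))  # ['up', 'left']
```

Because only displacements are used, shifting the whole trace to another part of the input area gives the same result, and a thumb resting on the screen produces no stroke at all.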

Fig. 2. Design of gestures. Each gesture is defined by connecting strokes of different orientations. Gestures are confined to the one-stroke area centered on the starting position.

3.3 Assignment of Characters

Entering one character consists of two steps: first a group of characters is selected with a stroke gesture, and then a member of that group is selected by a tap or a flick. In Hiragana (the Japanese cursive syllabary) mode, each row of the Japanese syllabary table is allocated to one group, because this table is very familiar to Japanese speakers. As shown in Fig. 2(b), there are 12 two-stroke gestures, and they are allocated to the 10 rows of the syllabary table in clockwise order from the left. The allocation is shown in the left table of Fig. 3. By adding one stroke to the gesture for a Hiragana group, as shown in the middle table of Fig. 3, the row of characters with the voiced sound mark or the semi-voiced sound mark is selected. The small characters are selected in the same way.
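The count of 12 two-stroke gestures follows from the design rules in Sect. 3.2: the first stroke can go in any of 4 directions, and the second in any of the 3 remaining directions. The sketch below enumerates the gestures under two rules. The first rule (no repeated direction) is stated in the text; the second is our reading of the "one stroke area" constraint, modeled as staying inside a 3 × 3 grid centered on the start position.

```python
from itertools import product

# Unit displacement of each stroke; y grows downward (assumption).
STEP = {"left": (-1, 0), "up": (0, -1), "right": (1, 0), "down": (0, 1)}

def is_valid(gesture):
    """Check the two design rules for a tuple of stroke directions."""
    # Rule 1: consecutive strokes must not share a direction.
    if any(a == b for a, b in zip(gesture, gesture[1:])):
        return False
    # Rule 2 (our reading): stay within one stroke of the start position.
    x = y = 0
    for direction in gesture:
        dx, dy = STEP[direction]
        x, y = x + dx, y + dy
        if abs(x) > 1 or abs(y) > 1:
            return False
    return True

def gestures(n_strokes):
    """All valid gestures made of exactly n_strokes strokes."""
    return [g for g in product(STEP, repeat=n_strokes) if is_valid(g)]

print(len(gestures(2)))  # 12
```

With 12 two-stroke gestures available, the 10 rows of the syllabary table can each receive a distinct gesture.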

Fig. 3. Gestures of characters. These are entered by the dominant hand in two steps. First select a group of characters with a stroke gesture, and then select a member of that group by a tap or a flick.

There is no grouping of alphanumeric characters that is well known to Japanese people. The assignment of alphabet letters to phone keys has been inherited by the flick keyboard. However, this grouping is difficult to memorize, because one number and three or four letters are combined and assigned to one key. In the alphanumeric mode of our system, numbers and letters are separated. Numbers are divided into the groups 1–5 and 6–0, which are assigned to gestures whose first stroke goes to the left. The letters are divided into a first half and a second half: an upward first stroke is assigned to the first half, and a downward first stroke to the second half. Each half is then divided into three groups in alphabetical order, and a left, return, or right stroke is added for the respective group. As a result, the gestures shown in the right table of Fig. 3 are assigned to the alphanumeric characters.

The assignment of the letters is straightforward, but group selection mistakes were expected to occur more often than in Hiragana input. To reduce the cost of correcting a group selection error, an operation for moving to the previous or next group was added to the control codes. This operation is performed in the opposite input area.

The control codes are entered in the input area opposite the character input area. Each control code gesture is performed in one step by the thumb of the non-dominant hand. The gesture assignment of the control codes is shown in Fig. 4. Entering a control code is prohibited between the first and second gestures of character input. Instead, operations that assist character input are available. A right or left flick cancels the group selection. In the Hiragana mode, if the selected row has a corresponding row with the voiced or semi-voiced sound mark, you can switch to that row. In the alphanumeric mode, you can shift to the previous or next group.

Fig. 4. Gestures of control codes for text editing. They are entered by the non-dominant hand in one step. “” is used to switch the character code between 1 byte and 2 bytes. “” is used to convert Hiragana to Kanji. “” are normal Hiragana characters. “” and “” are the characters obtained by attaching the voiced sound mark or the semi-voiced sound mark to “”, respectively.

Nothing is assigned to a tap in group selection or control code input. Therefore, no operation is entered when you rest your thumb on the screen without moving it. This makes it possible to pinch the tablet PC with one hand while entering characters with the other hand, so you do not have to worry about dropping the tablet PC even if you move your thumb vigorously. This setting is also useful when you temporarily release one hand to touch the screen.

3.4 Input Guide

The input guide is provided to assist beginners. Its image is divided into 4 sectors and a center square. Each sector displays the characters that can be selected by moving the thumb in that direction. Figure 5(a) is the initial guide of the alphanumeric mode, in which all selectable groups are shown. When an up stroke is performed, the guide changes to Fig. 5(b): the three groups written in the top sector of the initial guide are now shown in the left, down, and right sectors. Since connecting two strokes in the same direction is prohibited, the up sector is empty. When a left stroke is added, the guide changes to Fig. 5(c), and the A to E group is written in the center square. Releasing the thumb from the screen while a group is displayed in the center square selects that group.

Fig. 5. Display of the guide in the alphanumeric mode. The display changes every time a stroke is recognized. Each sector shows the characters selectable by moving the thumb in its direction. The group or character written in the center area is selected by releasing the thumb.

After that, the guide changes to Fig. 5(d), which is the image for member selection. This guide indicates that the character written in the center square is entered by a tap and the character written in each sector is entered by a flick in that direction. Even if you do not remember the gesture of a character, you can enter it by moving the thumb in the directions shown in the guide, once you understand the rules of the guide.

In Fig. 5(c), all 4 sectors are empty. However, if there are groups that can be selected by adding strokes, they are shown in the corresponding sectors. For example, when you select the “” row in the Hiragana mode, “” and “” are displayed in the sectors as shown in Fig. 6.

Fig. 6. Guide of the Hiragana mode. This figure shows the transition of the guide until the “” row is selected. Since the “” row can be changed to both the voiced row and the semi-voiced row, they are shown in the bottom sector and the left sector of the right image, respectively.

4 Experiment and Discussion

4.1 Input Speed of Beginners

The input speed of 6 university students who had never used this system was measured. They are familiar with the flick method because they use smartphones every day. They are also familiar with PCs, but not with tablets.

This experiment was conducted using an ASUS Eee Slate B121 tablet. Its specifications are Windows 7 Professional 32 bit, Intel Core i5-470UM 1.33 GHz, 4 GB memory, a 12.1 in. display of 1,280 × 800 dots, and a weight of 1.1 kg. The width of the input area is 220 pixels, and both the height and width of the input guide are 200 pixels. The distance required to recognize a stroke is 30 pixels.

The subject sits relaxed on a chair and holds both ends of the tablet PC with the hands. There is no additional restriction on the position or shape of the hands holding the tablet. The window displaying the task words is located at the top center of the screen. The entered characters are shown in the window just under the task window. The input guides are displayed at the upper left and upper right corners of the screen.

The experiment was carried out as follows.

  1. Listen to the 5 min description.
  2. Enter the words of the first task.
  3. Rest for 5 min.
  4. Practice inputting characters for 10 min.
  5. Enter the words of the second task.
  6. Repeat steps 3 to 5 until the fifth task is completed.

In the 5 min description, the subjects were taught how to input characters and how to use the input guide. Before this, they had no knowledge of the system. The subjects were instructed to enter characters as accurately as possible and to correct input errors when they found them.

In one task, five words are entered. The words are displayed in Hiragana one by one in the task window. A subject enters the Hiragana characters with the system and then finishes the word by entering the Enter code. Immediately afterwards, the next word is displayed. When all 5 words have been entered, the task is completed.

The input time of a word is the difference between the time at which the Enter code was entered and the time at which the task word was first displayed. It also includes the time spent correcting errors. The input time of a task is the sum of the input times of its 5 words. The input speed in characters per minute (CPM) is calculated by dividing the number of characters entered during the task by the input time of the task in minutes.
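The CPM calculation described above can be written out as follows. This is a minimal sketch: the function name and the tuple layout are our own, and the timestamps in the example are made-up values chosen only to make the arithmetic easy to follow, not measured data.

```python
def task_cpm(words):
    """Input speed of one task in characters per minute.

    words: list of (n_chars, shown_at_s, entered_at_s) tuples, where
    shown_at_s is when the word appeared and entered_at_s is when the
    Enter code was received, both in seconds.
    """
    total_chars = sum(n for n, _, _ in words)
    total_seconds = sum(end - start for _, start, end in words)
    return total_chars / (total_seconds / 60.0)

# Hypothetical task: 5 words, 20 characters, 60 s in total.
example_task = [
    (4, 0.0, 12.0),
    (3, 12.0, 21.0),
    (5, 21.0, 36.0),
    (4, 36.0, 48.0),
    (4, 48.0, 60.0),
]
print(task_cpm(example_task))  # 20.0
```

Since the per-word times include error correction, a subject who makes and fixes many errors gets a lower CPM even if the final text is correct.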

The words used in the tasks were chosen from the “everyone’s Japanese words” list published on the website “Japanese teacher’s teaching” [17]. Words of 3 to 6 letters in Hiragana notation were picked, and words with the same Hiragana notation were merged into one, giving 360 words. They include voiced sound letters, semi-voiced sound letters, small letters, and the prolonged sound symbol. The words are sorted in random order for each subject, and each successive five words from the top are presented as one task.
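The word selection and task construction described above can be sketched as below. The function names are hypothetical, the word list is a placeholder rather than the actual list from [17], and the length of a reading is assumed to be its number of characters as counted by `len`.

```python
import random

def select_words(readings, min_len=3, max_len=6):
    """Keep readings of 3 to 6 letters and merge identical readings."""
    kept = [r for r in readings if min_len <= len(r) <= max_len]
    return list(dict.fromkeys(kept))  # drops duplicates, keeps order

def make_tasks(words, n_tasks=5, words_per_task=5, seed=None):
    """Shuffle the words for one subject and cut the top into tasks."""
    needed = n_tasks * words_per_task
    if len(words) < needed:
        raise ValueError("not enough words for the requested tasks")
    shuffled = list(words)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i:i + words_per_task]
            for i in range(0, needed, words_per_task)]

# Placeholder readings standing in for the 360-word list.
pool = select_words([f"w{i:03d}" for i in range(360)])
tasks = make_tasks(pool, seed=1)
print(len(tasks), [len(t) for t in tasks])
```

Using a separate seed per subject gives each subject a different but reproducible word order.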

In the 10 min practice, subjects freely enter words with the system. Training tasks are not offered. Nothing is displayed in the task window during the practice.

The input speed of each of the 6 subjects and their average are shown in Fig. 7. In the third task, after 20 min of practice, the input speed reached 18.5 [CPM]. The speed in the last 2 tasks is about 20 [CPM]. The average input speed is low at the beginning, but it is stable from the third task onward, and the standard deviation decreases. Therefore, it is considered that the subjects understood how to use the input system within 20 min.

Fig. 7. Input speed of the 6 beginners and their average. The average is almost constant from the third task onward, at nearly 20 [CPM].

Input errors are evaluated with the Total Error Rate (TER) [20]. It is calculated by dividing the number of incorrectly entered characters, including those corrected during the task, by the number of characters in the task. The average over the 6 subjects is shown in Fig. 8.
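The error rate as worded above can be computed as in the following sketch. It follows the definition given in this section (errors divided by the number of characters in the task); the function name and the example counts are hypothetical and are not taken from the experiment.

```python
def total_error_rate(corrected_errors, uncorrected_errors, task_chars):
    """Total error rate of one task, as defined in this section.

    corrected_errors:   wrong characters that were fixed during the task
    uncorrected_errors: wrong characters left in the final text
    task_chars:         number of characters in the task words
    """
    if task_chars <= 0:
        raise ValueError("task_chars must be positive")
    return (corrected_errors + uncorrected_errors) / task_chars

# Hypothetical task of 20 characters with 3 corrected and 1 remaining error.
ter = total_error_rate(3, 1, 20)
print(f"{ter:.0%}")  # 20%
```

Counting corrected errors as well means that the measure reflects how often the wrong gesture was made, not only how clean the final text is.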

Fig. 8. The average of the total error rate of the beginners.

In the first and second tasks, 18% of the characters were entered incorrectly, but almost all of these errors were corrected. The input speed in the second task is higher than in the first task, probably because the fingertip moved faster. In the third and fourth tasks, the input speed increased as the error rate decreased. In the last task, the error rate increased, but the input speed hardly changed. The reason is probably that the subjects, having become accustomed to the operations, moved the fingertip too fast to operate accurately. The fact that the movements of the fingertip became rough suggests that the subjects had gotten used to the input method.

4.2 Input Speed of a Skillful Person

The input speed of a person skilled in the system was measured. The experimental procedure is similar to that for the beginners, but the description step and the practice steps are omitted. The input speed was measured with the hands of the skilled person hidden: the tablet is operated under a table, so the subject cannot see his fingertips. However, since the tablet display is mirrored on a monitor placed on the table, he can see what is displayed on the tablet. The average over the 5 tasks was 57.9 [CPM] (see Table 1). Since the standard deviation is 4.74, the input speed is stable. The error rate over all tasks was 5.9%. This result shows that a user skilled in the system can enter characters without looking at the fingertips.

Table 1. The input speed of Hiragana characters of a skillful person when his hands are hidden. SD in the table means the standard deviation.

4.3 Speed of Beginners When Entering Alphabets

The input speed of alphabet words was compared with that of Hiragana words. There were 6 test subjects, all beginners but different people from those in the previous experiment. The experiment for Hiragana and that for the alphabet were conducted about one week apart in order to reduce the influence of memorizing the operation method. Which of the two was done first was determined randomly for each subject. Each experiment begins with a 3 min practice. Then 5 tasks of 5 words each are performed with a 3 min break between tasks.

The alphabet words were selected from the English words that Japanese junior-high students must learn. Each word has 3 to 5 letters, and all words were entered in lowercase. The Hiragana words were selected from the word set in Sect. 4.1. However, words including voiced sounds, semi-voiced sounds, or small letters were excluded so that the number of strokes in the gestures would be equal for the English words and the Hiragana words.

The results are shown in Table 2. Since the input speed depends on the person, the ratio of the alphabet input speed to the Hiragana input speed was computed for each subject. The ratio is given in the second column from the right of the table. For all subjects except B, the ratio exceeds 90%. The average over the 6 subjects is about 95%, and the standard deviation of the ratio is 3.7%. Alphabet input is thus slightly slower than Hiragana input. However, for the 4 subjects other than B and D, the difference is not statistically significant: the hypothesis that the Hiragana and alphabet input speeds are equal was not rejected by the t test at a significance level of 5%.

Table 2. Comparison of the input speeds of Hiragana and alphabet. Each number labeled “average” is the average input speed over the 5 tasks. SD means the standard deviation of the input speed over the 5 tasks.

The error rate was small for many subjects in this experiment, as shown in Table 3. No clear difference appeared between Hiragana input and alphabet input. Redoing of the character group selection occurred in both Hiragana and alphabet input. However, as we expected, 3 of the 6 subjects made more group selection mistakes in alphabet input than in Hiragana input. Because 25 of the 27 mistakes in alphabet input were corrected with the group shift, its implementation was effective.

Table 3. The error rate is the average over the 5 tasks. Redoing of the group selection is not counted as an input error because the mistake is repaired before a character is entered. The number in parentheses in the right column is the number of times a mistake was corrected by the group shift.

5 Conclusion

We developed a character input method for tablet PCs. The method is operated with the thumbs while grasping both ends of the tablet. One character is entered in two steps: first, a group of 5 characters is selected by a stroke gesture, and then a member of the group is selected by a flick or a tap. Since each gesture is designed to be completed inside the one-stroke area, it can be performed within the range that the fingertip can reach without releasing the hand.

A user can input a character by moving the thumb according to the guide display, even without having memorized the gesture of the character. In the experiment, beginners understood how to use this input system within 20 min. They entered characters at about 20 [CPM] on average after 30 min of practice. The input speed of the skilled person was about 58 [CPM] without watching the fingertips. The input speed of alphabet words was almost the same as that of Hiragana words. The operation for shifting alphabet groups was used almost every time a group selection error occurred.