1 Background

In recent years, the number of skilled experts in Japan’s traditional performing arts has been decreasing because of the aging society and the year 2012 problem, i.e., the concurrent retirement of the postwar baby-boom generation. The falling birth rate is also reducing the number of heirs to the traditional performing arts. These trends raise the concern that experts’ skills in the Japanese traditional performing arts will decline and disappear.

Skills encompass tacit knowledge acquired through experience or intuition. Tacit knowledge is difficult to express or convey in written form, such as characters or figures. When learning skills, an heir (novice) is required to “hear, see, and learn” them without being explicitly taught by an expert. Skill acquisition begins when a novice copies the postures and actions of an expert. Skills are thus handed down through bodily learning experience: a novice repeatedly copies the actions of an expert and acquires the skill without even noticing. Since experts have obtained their skills through bodily experience, it is difficult for them to express such skills as explicit knowledge in the form of characters and figures. This makes it difficult to hand down skills to novices and is considered to be a cause of the loss of skills. Another problem is that handing down skills takes a long time because the skills must be copied repeatedly.

As a solution to this problem, many studies have tried to convert skills into explicit knowledge so that they can be preserved and handed down. The idea is to convert individuals’ tacit knowledge into clearly expressed explicit knowledge. However, it is also difficult to extract skills that exist as tacit knowledge and convert them into explicit knowledge.

2 Related Work

Ando et al. [1] created a performance teaching aid using motion capture and virtual space. They proposed a virtual-reality method using 3-D space, building on “performance observation” and “watching video teaching aids,” which have been the basic methods in conventional performance teaching. Specifically, they created a teaching aid based on motion-capture technology that allows learners to observe an action from a desired angle. They used a motion capture system to record actions for the teaching aid, calculated 3-D joint coordinates, and generated a human model. This human model made it possible to see the difference in actions between a novice and an expert from various angles. The study showed that the teaching aid was useful for performance teaching because the motion-capture technology made it possible to compare the actions of a novice and an expert.

Yamaguchi et al. [2] compared the Japanese drum playing skills of an expert and a novice. They compared the arm actions that the expert and the novice made when playing the Japanese drum. For the arm actions, they filmed the movements of the head, shoulder, elbow, wrist, and waist on the right side of each test subject, together with the movement of the drumstick, and analyzed them using image processing. In addition, they used surface electromyography to measure the muscle activity of the anterior deltoid, biceps, triceps, extensor carpi radialis, and extensor carpi ulnaris on the right side of each test subject.

The analysis showed that the expert had large angle variations at the joints from the shoulder to the wrist, whereas the novice had small ones. The expert’s drumstick swing was also fast. The analysis thus revealed that drumming actions differed between the expert and the novice.

For the expert, muscle activity was detected in all the measured muscles. By contrast, for the novice, only the upper arm muscles were active. The study thus showed that the upper arm actions of the expert and the novice differed when playing the Japanese drum and that the two had different skill levels.

These studies started by recording the actions of a novice and an expert with various sensors. The researchers then analyzed the recorded data, using their knowledge of the subject and various analysis tools, to extract the skills of the expert. However, researchers cannot extract all skills in this way. They can extract only the part of the skills on which they focus.

3 Research Target and Proposal

Japanese drums are broadly classified into the following three types: the most common ones, called long trunk drums (imperial drums), whose trunk is made of a hollowed-out log; those with a relatively lightweight trunk, called tub drums; and tight drums, which produce a high-pitched sound. For the present study, we chose the tight drum, which plays the role of keeping the overall rhythm. Since tight drums are responsible for the overall beat, their players are required to have the highest skills, and there is a large skill difference between experts and novices.

If the movements of experts and novices in traditional performing arts can be recorded and movements common to many experts can be discovered, those common movements can be regarded as skills in the traditional performing arts. We aim at automated skill extraction so that skills can be extracted without relying on researchers.

The aim is to extract only experts’ “skills,” without their “individual habits,” by collecting data on the postures and actions of expert Japanese drum players and then mining the collected data. To discover the skills of the experts, we also obtain data on the postures and actions of novices and compare them with the experts’ data. As a stage before automated skill extraction, the present study examines whether it is possible to extract skills from limited data.

4 Pre-survey

At the beginning of the present study, we asked three Japanese drum organizations to cooperate in a pre-survey to collect information on Japanese drums, including the basics. The survey items were “difference between experts and novices,” “time required for mastering Japanese drum playing,” and “first practice for novices.”

4.1 Difference Between Experts and Novices

To judge whether a player is a novice, experts check “how to hold the drumstick,” “how to swing the drumstick,” and the “posture for playing the drum.” The basic way to hold the drumstick is to hold it without strain and to grip it tightly only right before it hits the drumhead. The way to swing the drumstick is to swing it down without strain, surrendering to gravity. Snapping the wrist like a whip right before the drumstick hits the drumhead produces a loud sound. The posture for playing the drum is to keep the feet apart at about 1.5 times the shoulder width and to keep the waist stable when hitting the drumhead. In addition, we learned that players consciously keep the upper body upright with a straight back.

4.2 Time for Mastering Japanese Drum Playing

The time varies depending on the frequency of practice and the aptitude of the player. We obtained an answer stating that at least five years are required when practicing twice per week. In the present study, players with over five years of playing experience are defined as experts.

4.3 First Practice for Novices

We were told that the main practice method is to teach novices the aforementioned three items orally, namely “how to hold the drumstick,” “how to swing the drumstick,” and the “posture for playing the drum,” and then to have them actually play the drum while their playing is corrected.

5 Overview of Recording System

The system recorded “how to hold the drumstick,” “how to swing the drumstick,” and the “posture for playing the drum,” which we learned about in the interviews with experts. With reference to the studies of skill extraction using motion-capture technology by Ando et al. and Fujimoto et al., we used an acceleration sensor, a pressure sensor, a motion capture sensor, and other devices to record the experts’ drum beating actions. These sensors made it possible to accurately record their actions [3].

5.1 Recording Method and Measurement Sites

The postures and actions of the experts were measured with a motion capture sensor in the form of 3-D spatial coordinate data. Obtaining a person’s 3-D spatial coordinates allows accurate positional relations, such as depth, to be grasped. This was considered to improve skill acquisition. Skeletal data of each person was generated from the 3-D spatial coordinate data extracted with the motion capture sensor.

In addition, the speed of drumstick swings and the grip strength were measured with the acceleration and pressure sensors, since they cannot be obtained from the motion capture sensor alone.

As data on the action that an expert makes when beating the drum, we extracted the angle of the raised arm, the movements of the wrist and elbow joints, and the speed of the drumstick swing. These parameters were chosen with reference to the study of Yamaguchi et al., who stated that they showed a significant difference between the expert and the novice. Japanese drum instructors also stated that these parameters were the most likely to show a difference between experts and novices. We further measured lower body postures and extracted from them the standing position and the waist position relative to the drum. Although Yamaguchi et al. analyzed muscle activity with an electromyograph, we did not use one because muscle activity is considered to be reflected in the actions.

5.2 Motion Capture Sensor

We adopted Microsoft’s motion capture sensor “Kinect” to obtain the 3-D spatial data of actions. This sensor calculates depth information with a near-infrared camera to estimate the 3-D coordinates of the joints of the body. Kinect can build skeletal information on the basis of the 3-D coordinates and track a skeletal structure in real time. It thus makes it possible to obtain a person’s 3-D joint data easily and in real time without using markers.

Kinect for Windows SDK was adopted as the driver for Kinect. This driver allows obtaining video from the video camera, obtaining depth images, and tracking skeletal information. It offers sufficient performance for the present study: a frame rate of 30 fps; a maximum depth image resolution of 640 × 480 pixels; a capture distance for persons of 0.8 m to 4.0 m; and simultaneous tracking of the skeletal data of six persons.

5.3 Acceleration Sensor

A triaxial acceleration sensor (KXM52-1050) was used to measure the speed of drumstick swings. This sensor can measure accelerations and inclinations along the x, y, and z axes. It detects accelerations between -2 G and +2 G.

5.4 Pressure Sensor

A pressure sensor (FSR406) was used to measure grip strength. This is a 43.69 mm square sensor. It is attached to the grip of the drumstick so that the grip strength can be measured. The acceleration and pressure sensors are controlled by an Arduino.

6 Evaluation Methodology

The figure shows how the apparatus was set up. Kinect was placed 2.25 m in front of the player, 45 degrees to the left, at a height of 0.8 m above the ground. The video camera filmed the player’s beating action. The player played the Japanese drum with a drumstick to which the acceleration and pressure sensors were attached. We instructed the player to play the drum so that the acceleration sensor attached to the drumstick faced upward. The frame rate for video recording was 30 fps and the sampling rate for the sensors was 30 Hz.

We recorded the playing actions of 22 players (experts) with over five years of experience from four organizations, including the three organizations that cooperated with us in the pre-survey. We also recorded the playing actions of 22 novices after orally teaching them “how to hold the drumstick,” “how to swing the drumstick,” and the “posture for playing the drum.” We asked both experts and novices to play the drum at a tempo of 60 BPM for 30 s. We adopted a frame rate of 30 fps for the pressure, acceleration, and motion capture sensors. The obtained data were saved separately as experts’ data and novices’ data.

6.1 Analysis Method

After adjusting the data, we visualized and analyzed them. We then formed hypotheses on the basis of this analysis and conducted a detailed analysis.

6.2 Adjustment of Data

Although the actions of the players were recorded with the sensors, part of the data was missing. For example, the motion capture sensor cannot record joints when occlusion occurs. In addition, the sampling rate was unstable, at 26 Hz ± 4 Hz, due to various causes.

Therefore, we filled in the missing values and converted the data so that the sampling rate became 30 Hz (30 samples per second), using the following procedure.

  1. Calculating the difference between data “n” and “n + 1” (n ≥ 0) with respect to each pair
  2. Finding a pair of “n” and “n + 1” having the largest difference
  3. Inserting an average between the found data “n” and “n + 1”
  4. Repeating 1 to 3 above until the number of data per second reaches 30

We applied this procedure to every one-second interval so that each interval contained 30 samples.
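The four steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in the study: the function name `fill_to_target` is hypothetical, the "difference" in step 1 is assumed to be the Euclidean distance between consecutive sample values, and each one-second interval is assumed to hold at least two samples.

```python
import numpy as np

def fill_to_target(samples, target=30):
    """Insert midpoints at the largest gap until `target` samples exist.

    `samples` holds the values recorded in one second, either scalars
    (shape (n,)) or vectors such as 3-D joint coordinates (shape (n, d)).
    Hypothetical helper; the distance measure is an assumption.
    """
    data = np.asarray(samples, dtype=float)
    was_1d = data.ndim == 1
    if was_1d:
        data = data[:, None]  # treat scalars as 1-D vectors
    if len(data) < 2:
        raise ValueError("need at least two samples to interpolate")
    while len(data) < target:
        # Step 1: difference between each consecutive pair
        gaps = np.linalg.norm(np.diff(data, axis=0), axis=1)
        # Step 2: pair (n, n + 1) with the largest difference
        n = int(np.argmax(gaps))
        # Step 3: insert the average between samples n and n + 1
        midpoint = (data[n] + data[n + 1]) / 2.0
        data = np.insert(data, n + 1, midpoint, axis=0)
        # Step 4: the loop repeats until `target` samples are reached
    return data[:, 0] if was_1d else data
```

For example, `fill_to_target([0.0, 1.0, 2.0, 10.0], target=6)` first inserts 6.0 between 2.0 and 10.0 and then 4.0 between 2.0 and 6.0, giving 0, 1, 2, 4, 6, 10. An interval that already holds `target` samples or more is returned unchanged.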

7 Result

7.1 Visualization of Data

On the basis of the adjusted data, we plotted the 3-D coordinates of the joints. In Fig. 1, the X, Y, and Z axes represent left-right position, height, and depth, respectively. Depth is also shown by color, with the far side in red and the near side in blue. Although the pre-survey suggested that skills could be extracted from postures, the figure does not show significant postural characteristics.

Fig. 1. 3-D plotting of the joints (Left: Expert, Right: Novice) (Color figure online)

3-D plotting of the joints throughout the body showed the most significant differences in wrist movements. The graph of right wrist joints (Fig. 2) shows the following facts.

Fig. 2. 3-D plotting of the right wrist joints (Left: Expert, Right: Novice)

  1. Experts bend the wrist up and down at different speeds.
  2. Experts’ wrist strokes depict an elliptical shape.

We examined each fact to see whether these characteristics apply to only some of the experts or to all of them.

7.2 Characteristics of Wrist Joints

Here we examine the difference between the upward and downward bending movements of the wrists. “Wrist joint speeds,” “grip strengths,” and “Y-axis accelerations” were used in the examination. All of these values were normalized so that the maximum value is 1. The graph of wrist joint speeds (Fig. 3) shows that experts maintained high speeds, whereas the novices’ speeds lacked regularity.

Fig. 3. Right wrist joint speeds (Above: Expert, Below: Novice)
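The text does not state how the wrist joint speeds were derived from the recorded coordinates. One plausible way, shown here only as a sketch, is to take the distance travelled between consecutive frames at the 30 Hz sampling rate and then scale the result so that the maximum value is 1. The function name `normalized_speed` and the finite-difference approach are assumptions.

```python
import numpy as np

def normalized_speed(positions, rate_hz=30.0):
    """Per-frame joint speed, scaled so that the maximum value is 1.

    `positions` is an (n, 3) array of 3-D joint coordinates sampled at
    `rate_hz`. Hypothetical helper: finite differences are an assumption.
    """
    pos = np.asarray(positions, dtype=float)
    # Distance moved between consecutive frames
    step = np.linalg.norm(np.diff(pos, axis=0), axis=1)
    # Distance per frame times frames per second gives speed
    speed = step * rate_hz
    peak = speed.max()
    # Avoid dividing by zero when the joint never moves
    return speed / peak if peak > 0 else speed
```

The same maximum-value scaling can be applied to the grip strength and Y-axis acceleration series so that the three curves can be overlaid on a common axis.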

To examine when experts speed up, we overlaid the grip strength data on the wrist joint speeds. Fig. 4 shows the wrist joint speeds with the grip strength data overlaid. This graph shows that experts increased their grip strength when the wrist joint speed increased. The graph also shows that they held the drumstick tightly for only a short period and held it loosely at other times.

Fig. 4. Wrist joint speeds and grip strengths (Above: Expert, Below: Novice) (Color figure online)

Lastly, to examine the relation among the timing at which the drumstick hits the drumhead, the speed, and the grip strength, we also overlaid the Y-axis accelerations. Fig. 5 shows that there were moments, occurring at regular intervals, when the direction of acceleration changed. This seems to be because the drumstick hit the drumhead and bounced. The graph shows that the speed and the grip strength were high when the drumstick hit the drumhead. Moreover, the speed was maintained even after the drumstick hit the drumhead. This shows that experts used the rebound to shift to the next posture.

Fig. 5. Wrist joint speeds, grip strengths and Y-axis accelerations (Above: Expert, Below: Novice) (Color figure online)
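The moments when the direction of acceleration changes can be located programmatically as sign changes in the Y-axis acceleration series. The sketch below is only an illustration of that idea: the function name `sign_change_indices` is hypothetical, and it does not by itself distinguish a hit on the drumhead from the reversal at the top of a swing.

```python
import numpy as np

def sign_change_indices(accel_y):
    """Indices i where the acceleration sign flips between i and i + 1.

    Hypothetical helper. Samples that are exactly zero are not counted
    as a flip, because their product with a neighbor is zero.
    """
    a = np.asarray(accel_y, dtype=float)
    signs = np.sign(a)
    # A negative product means the neighboring samples have opposite signs
    return np.where(signs[:-1] * signs[1:] < 0)[0]
```

The spacing between flips, `np.diff(sign_change_indices(accel_y))`, gives a rough check of regularity: a steady player should produce nearly constant intervals.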

From these findings, we conclude that the “capability of controlling power to swing the drumstick” is one of the skills of experts.

7.3 Examination of Trajectories of Wrist Joints

We adopted the method of least squares to examine whether the trajectories of the wrist joint movements had an elliptical shape. If the wrist trajectories can be expressed by approximate ellipses, we can conclude that the wrist joints traced elliptic trajectories. Fig. 6 shows the result obtained by the method of least squares. Red indicates the recorded wrist joint trajectories and blue indicates the approximate ellipses.

Fig. 6. Plotting of wrist trajectories and approximate ellipses (Above: Expert, Below: Novice) (Color figure online)

Fig. 6 shows that the experts’ wrist joint movements follow the approximate ellipses, whereas the novices’ movements deviate from them.

We further fitted approximate ellipses to all the subjects’ data. As a result, the wrist joint trajectories of 21 out of 22 experts were elliptic. This shows that experts’ right wrist joints move along elliptic trajectories.
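The exact least-squares formulation is not given in the text. As a minimal sketch of one common approach, the code below assumes that the wrist trajectory has been projected onto a 2-D plane, centers the points, and fits a general conic A x² + B x y + C y² + D x + E y = 1 by linear least squares. The fitted conic is an ellipse when B² − 4AC < 0. The function name `fit_ellipse` and the use of the RMS residual as a goodness-of-fit measure are assumptions.

```python
import numpy as np

def fit_ellipse(points_2d):
    """Least-squares conic fit to 2-D points (hypothetical helper).

    Returns the coefficients (A, B, C, D, E) of
    A x^2 + B x y + C y^2 + D x + E y = 1 in centered coordinates,
    a flag telling whether the conic is an ellipse, and the RMS residual.
    """
    pts = np.asarray(points_2d, dtype=float)
    # Center the data so that the origin lies inside the trajectory
    x, y = (pts - pts.mean(axis=0)).T
    design = np.column_stack([x * x, x * y, y * y, x, y])
    coeffs, *_ = np.linalg.lstsq(design, np.ones_like(x), rcond=None)
    a, b, c = coeffs[:3]
    # Discriminant test: negative means the conic is an ellipse
    is_ellipse = bool(b * b - 4.0 * a * c < 0)
    # Small residual means the points lie close to the fitted curve
    rms = float(np.sqrt(np.mean((design @ coeffs - 1.0) ** 2)))
    return coeffs, is_ellipse, rms
```

A trajectory could then be judged elliptic when `is_ellipse` is true and `rms` falls below a chosen threshold; the threshold itself would have to be set from the data and is not specified in the text.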

7.4 Examination of Approximate Ellipses in Consideration of Body Dimensions

Although we found that the experts’ wrist joint trajectories were close to ellipses, their sizes varied among experts. Under the assumption that the size of the ellipse depends on the body dimensions of the expert, we estimated the ellipses from those dimensions.

From the arm lengths, we obtained three regression expressions, for the major axis, the minor axis, and the inclination, with a multiple correlation coefficient of 0.7. This shows that body dimensions and the parameters of the ellipses were highly correlated.
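The form of the regression expressions is not given in the text. As a sketch only, the code below assumes an ordinary linear regression with an intercept, fitted by least squares, and reports the multiple correlation coefficient R as the square root of the coefficient of determination. The function name `fit_linear` is hypothetical, and no data from the study is reproduced here.

```python
import numpy as np

def fit_linear(predictors, target):
    """Linear least-squares regression with an intercept.

    `predictors` is an (n,) or (n, k) array (e.g. arm lengths) and
    `target` an (n,) array (e.g. major axis of the fitted ellipse).
    Returns (beta, r): beta[0] is the intercept, r the multiple
    correlation coefficient. Hypothetical helper.
    """
    X = np.asarray(predictors, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    y = np.asarray(target, dtype=float)
    # Column of ones for the intercept, then the predictors
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    pred = design @ beta
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    # R is the square root of R^2, clipped at zero for numerical safety
    r = float(np.sqrt(max(0.0, 1.0 - ss_res / ss_tot)))
    return beta, r
```

Calling the function once per ellipse parameter (major axis, minor axis, inclination) would yield three regression expressions of the kind described above.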

8 Conclusion

We established a system for recording the actions of players for the purpose of extracting and handing down Japanese drum playing skills. The recorded data allowed us to discover characteristics of the experts. In addition, we found that recording the positions of the joints, the grip strength, and the acceleration of the drumstick allowed Japanese drum playing skills to be extracted.

By comparing the experts’ movements with those of the novices, we also found skills common to the experts. Further, we were able to create a model of automated skill extraction from experts’ movements and body dimensions.

9 Future Work

Since the accuracy of the automated ellipse extraction model is still somewhat insufficient, it needs to be improved. Beyond automated ellipse extraction, we are also considering adding other movement parameters, such as holding timing and holding strength, in order to automatically extract many other skills.