1 Introduction

We propose a method for estimating the frictional force between the surface of an object and a human finger using thermal video recorded by an infrared thermographic camera. The frictional force is estimated using machine learning, which accounts for the temperature of the surface changing because of the frictional heat present when a human strokes an object. Because the surface temperature of an object is measurable with a high spatial resolution using a non-contact method, such as an infrared thermographic camera, remotely estimating the force acting on various surfaces is possible.

Specialized equipment is typically used to measure physical quantities related to the tactile sense. For example, a strain gauge sensor and a durometer are used for force and hardness measurements, respectively. Recently, methods have been proposed for estimating tactile information remotely based on visible images. Previous studies [3, 4, 7, 8] have proposed methods for estimating the applied pressing force from images of a fingernail, using the change in fingernail color caused by the change in blood flow when a finger presses an object. Unlike conventional methods that use measuring instruments, these image-based methods only require cameras and are expected to be applicable in various situations [6].

Previous studies have used thermal images instead of visible images to estimate tactile events. For example, Dunn et al. proposed a method of classifying the strength of pressure as strong or weak using a random forest based on thermal images [2]. Several studies have also attempted to extract tactile histories from thermal images [1, 5].

Methods based on an infrared thermographic camera have the following advantages over those based on general cameras: (1) Because the strength of the friction force appears as a temperature change caused by frictional heat, an infrared thermographic camera measures a physical quantity closely related to the force. (2) Sequential information is usable as spatial data because the touch history remains as a temperature change. Hence, even when no superficial change is observable in a visible image, the touch history can be captured in a thermal image.

This study proposes a new technique for measuring tactile information remotely in a non-contact manner, that is, a method that estimates frictional force by regression using machine learning and thermal images. To verify the suitability of this method for general use, we examined its accuracy within a simple environment. Unlike previous methods, ours estimates frictional force as a continuous quantity, not as a classification of force strength. This method can improve the means of information input. Moreover, the proposed method has the potential to measure force where conventional tactile sensors are difficult to install. For example, if we use a contact-type sensor to measure the force between touching skin, the tactile experience may change. With a non-contact method, however, we can measure the force without such interference.

2 Proposed Method

2.1 Problem Definition

Our method estimates frictional force from a thermal video that records the temperature change of a surface when it is touched by a human. We use the system shown in Fig. 1 to measure the training and validation data for machine learning. Each material is fixed to a force sensor that measures the frictional force when a human strokes it. As long as the contacted object does not deform, a force sensor placed underneath the object can measure the frictional force exerted on the target surface. The thermographic camera is set above the target such that the lens and surface are parallel. The subjects draw a straight line from left to right on the surface while changing the applied force. The thermographic camera and force sensor simultaneously measure the temperature of the surface and the frictional force, respectively.

Fig. 1. An overview of the system. When subjects touch each material fixed on a force sensor, a thermographic camera and force sensor measure the temperature of the surface and frictional force, respectively.

2.2 Making Dataset

The model takes the measured thermal image sequence as input and outputs the force at the corresponding time; a training dataset that associates thermal images with force is therefore necessary.

To begin, we preprocess each measured recording. We crop the thermal videos using a fixed window to retain only the area around the straight line traced by the subjects. Because the spatial resolution of the thermographic camera we used is \(640\times 480\), the input thermal video was cut to the range of \(\pm 64\) pixels from the straight line; thus, a \(640\times 128\) region was used. Next, we downsample the force data so that its number of frames matches that of the thermal video. The force sensor measures at 120 fps, and the thermographic camera at 30 fps. Thus, the force sensor data are averaged every 4 frames.
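The two preprocessing steps above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the video is a NumPy array of shape (frames, 480, 640), that the traced line is horizontal at a known row (`line_row` is a hypothetical parameter), and that the force samples start exactly at the first video frame.

```python
import numpy as np

CROP_HALF_HEIGHT = 64   # pixels kept on each side of the traced line
FORCE_PER_FRAME = 4     # 120 force samples/s divided by 30 video frames/s


def crop_around_line(video, line_row):
    """Keep the rows within +/-64 px of the traced line in every frame.

    video: array of shape (frames, 480, 640); line_row: row index of the line.
    """
    top = line_row - CROP_HALF_HEIGHT
    bottom = line_row + CROP_HALF_HEIGHT
    if top < 0 or bottom > video.shape[1]:
        raise ValueError("crop window leaves the image")
    return video[:, top:bottom, :]


def average_force_per_frame(force):
    """Average every 4 consecutive force samples into one value per video frame."""
    force = np.asarray(force, dtype=float)
    n_frames = len(force) // FORCE_PER_FRAME       # drop an incomplete tail
    trimmed = force[: n_frames * FORCE_PER_FRAME]
    return trimmed.reshape(n_frames, FORCE_PER_FRAME).mean(axis=1)


# Dummy data: 6 video frames and the matching 24 force samples.
video = np.zeros((6, 480, 640), dtype=np.float32)
force = np.arange(24, dtype=float)
cropped = crop_around_line(video, line_row=240)
force_per_frame = average_force_per_frame(force)
print(cropped.shape, force_per_frame.shape)
```

The cropped frames have 128 rows and 640 columns, matching the \(640\times 128\) region described above.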

We then extract the periods in which subjects touch objects. These are periods in which the vertical force is larger than the threshold of 0.1 N for longer than 6 frames (200 ms). We denote each period as \(I_n\) (see Fig. 2). In addition, each period \(I_n\) is divided every 6 frames (200 ms), and each is used as a segment \(S_i\). We generate input and output data from each segment, as shown in Fig. 3. The input data are the six differential images in each segment obtained by subtracting the thermal image of the previous frame, and the corresponding label data are defined as the average value of the frictional force in the segment. The use of the subtracted images between frames reduces the effect of variations in the initial temperature of the surface. We show some examples of input and label data in Fig. 4.
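As a rough sketch of this procedure, the code below finds the contact periods, splits them into 6-frame segments, and builds one input/label pair. It assumes the force data have already been resampled to one value per video frame, ignores the constant time delay discussed later in this section, and uses hypothetical function names; leftover frames at the end of a period are simply dropped, which the text does not specify.

```python
import numpy as np

FORCE_THRESHOLD_N = 0.1   # contact threshold on the vertical force
SEGMENT_FRAMES = 6        # 200 ms at 30 fps


def contact_periods(vertical_force, min_frames=SEGMENT_FRAMES):
    """Return (start, end) frame ranges, end exclusive, of sustained contact."""
    periods, start = [], None
    for t, f in enumerate(vertical_force):
        if f > FORCE_THRESHOLD_N and start is None:
            start = t                               # contact begins
        elif f <= FORCE_THRESHOLD_N and start is not None:
            if t - start > min_frames:              # keep only long contacts
                periods.append((start, t))
            start = None
    if start is not None and len(vertical_force) - start > min_frames:
        periods.append((start, len(vertical_force)))
    return periods


def make_segments(period):
    """Split one period into consecutive 6-frame segments (remainder dropped)."""
    start, end = period
    return [(s, s + SEGMENT_FRAMES)
            for s in range(start, end - SEGMENT_FRAMES + 1, SEGMENT_FRAMES)]


def segment_sample(video, friction, segment):
    """Six frame-to-frame difference images and the mean friction label."""
    s, e = segment
    if s == 0:
        raise ValueError("segment needs a preceding frame")
    diffs = video[s:e] - video[s - 1:e - 1]         # frame minus previous frame
    return diffs, float(np.mean(friction[s:e]))


# Dummy recording: 14 frames of contact (frames 3..16) inside 20 frames.
vertical = np.array([0.0] * 3 + [0.5] * 14 + [0.0] * 3)
friction = np.linspace(0.0, 1.9, 20)
video = np.random.default_rng(0).normal(size=(20, 16, 32))   # small dummy frames
periods = contact_periods(vertical)
segments = [seg for p in periods for seg in make_segments(p)]
diffs, label = segment_sample(video, friction, segments[0])
print(periods, segments, diffs.shape, label)
```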

Fig. 2. An example of intervals \(\{I_n\}_n\). Each period is longer than 6 frames (200 ms), and the vertical force is continuously larger than 0.1 N.

Fig. 3. An overview of dataset generation. Label data are the average value of friction force in a segment, and the corresponding input data are a set of differential thermal images in a time-delayed segment.

At the moment the frictional force is measured, the finger hides the contact area, so the thermal image contains no information about the surface beneath the finger. We assume that the tracing speed changes insignificantly and, by setting a constant delay time, make the thermal image and frictional force correspond. That is, the thermal images recorded 3 frames (100 ms) later are used when the frictional force at frame \(t\) is estimated. This works because the thermal image retains the contact history, so the frictional force from 3 frames (100 ms) earlier can be estimated from the image taken after the finger has moved on.
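The constant delay can be illustrated with a small index calculation. In this sketch, a segment is represented as a hypothetical `(start, end)` pair of frame indices with `end` exclusive; segments whose delayed images would run past the end of the video are skipped, which is an assumption rather than something stated above.

```python
DELAY_FRAMES = 3   # 100 ms at 30 fps


def delayed_input_range(segment, n_frames, delay=DELAY_FRAMES):
    """Frame range of the thermal images paired with a force segment.

    The label comes from frames [start, end); the images come from
    [start + delay, end + delay). Returns None when that range would
    run past the end of the video.
    """
    start, end = segment
    lo, hi = start + delay, end + delay
    if hi > n_frames:
        return None
    return lo, hi


print(delayed_input_range((3, 9), n_frames=20))    # (6, 12)
print(delayed_input_range((15, 20), n_frames=20))  # None
```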

2.3 Machine Learning Model

We use the efficient convolutional network (ECO) model [9], an extension of a basic convolutional neural network (CNN), to estimate the frictional force from thermal images. Because the original ECO model is designed for classification, we modify the output layer for regression. That is, whereas the last layer of a classifier outputs the probability of each class, we instead output a single scalar quantity from an affine layer. Moreover, because the size of our input differs from that of the input image used in the original ECO model, we change the kernel size of the convolution layer near the input layer.
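The change to the output layer can be pictured with a small NumPy sketch. It does not reproduce ECO: the feature vector stands in for whatever the backbone produces, and `FEATURE_DIM`, the class count, and the random weights are all hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
FEATURE_DIM = 512   # hypothetical size of the backbone's pooled feature vector


def classification_head(features, weights, bias):
    """Classifier style: affine layer followed by softmax over classes."""
    logits = features @ weights + bias
    exp = np.exp(logits - logits.max(axis=-1, keepdims=True))  # stable softmax
    return exp / exp.sum(axis=-1, keepdims=True)


def regression_head(features, weights, bias):
    """Regression style: one affine layer giving a single scalar per sample."""
    return (features @ weights + bias).squeeze(-1)


features = rng.normal(size=(4, FEATURE_DIM))    # batch of 4 feature vectors
probs = classification_head(features, rng.normal(size=(FEATURE_DIM, 10)), np.zeros(10))
force = regression_head(features, rng.normal(size=(FEATURE_DIM, 1)), np.zeros(1))
print(probs.shape, force.shape)
```

The classifier returns one probability per class for each sample, whereas the regression head returns one unbounded scalar per sample, which is what a continuous force estimate requires.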

Fig. 4. These figures show examples of the input and label data used for learning: (a) small force and (b) large force. The input images represent a heat map of the line traced by the subject. Because the time resolution of the thermographic camera is not exactly 30 fps, some frames without change are green. (Color figure online)

3 Experiment

Data were obtained from six subjects (three men and three women). We experimented with the system illustrated in Fig. 1 and adopted two materials: polystyrene foam and cardboard. We used a tactile force plate (Tech-Gihan, TF-2020) and an infrared thermographic camera (Testo, testo 883) to measure the frictional force and the temperature of the object surface, respectively. The force plate had an A/D converter with 16-bit resolution and can measure up to 10 N. The camera resolution was \(640 \times 480\) px. We set the thermographic camera 30 cm above the target surface. To prevent the average surface temperature from increasing as the subjects traced the surface, a fan blew air such that the surface temperature remained constant while uncontacted.

We synchronized the thermal video captured by the thermographic camera with the force data measured by the tactile force plate as follows. First, a finger was placed on the object and, from rest, quickly swiped in the horizontal direction by momentarily applying force. Because the time resolution of the thermographic camera was 30 fps at most, there is a frame in which the finger position changes discontinuously. Meanwhile, the tactile force plate detects a horizontal impulse. We treated these two events as simultaneous and aligned the two recordings on them. Because the tactile force plate sampled at 120 fps whereas the camera recorded at 30 fps at most, the synchronization is limited by the camera frame period, and our dataset includes time synchronization errors of up to 33 ms.
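A minimal sketch of this alignment is given below. It assumes that a per-frame horizontal finger position is already available from the video (the tracking step is not described here) and that the impulse is located as the largest sample-to-sample change; both are illustrative choices, and the function names are hypothetical.

```python
VIDEO_FPS = 30     # thermographic camera frame rate (at most)
FORCE_RATE = 120   # force plate samples per second


def largest_jump_index(values):
    """Index i at which |values[i] - values[i - 1]| is largest."""
    jumps = [abs(b - a) for a, b in zip(values, values[1:])]
    return jumps.index(max(jumps)) + 1


def clock_offset_seconds(finger_x, horizontal_force):
    """Offset to add to video timestamps to match the force-plate clock."""
    t_video = largest_jump_index(finger_x) / VIDEO_FPS
    t_force = largest_jump_index(horizontal_force) / FORCE_RATE
    return t_force - t_video


# Dummy signals: the finger jumps at video frame 10, the impulse is at sample 50.
finger_x = [100] * 10 + [300] * 5
horizontal_force = [0.0] * 50 + [2.0] + [0.0] * 9
offset = clock_offset_seconds(finger_x, horizontal_force)
worst_case_ms = 1000 / VIDEO_FPS    # one camera frame period, about 33 ms
print(offset, worst_case_ms)
```

The worst-case figure is one camera frame period, \(1/30\) s, which is the 33 ms bound quoted above.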

Subjects traced a 20 cm straight line from left to right on each material while changing the applied pressing force, and we measured the frictional force and temperature change. Although the subjects were not required to use a precise stroke speed, they attempted to complete each stroke in about a second. All subjects traced each piece of material for 10 min, and the sensor output was recorded. We used 80% of the data obtained for training and 20% for testing. Three dataset patterns were used for training and testing: (a) polystyrene foam and cardboard, (b) polystyrene foam, and (c) cardboard. Data were recorded only for patterns (b) and (c); pattern (a) was formed by mixing the two.
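The three patterns and the 80/20 split can be sketched as below. How the split was drawn is not stated, so the seeded random shuffle is an assumption, as are the placeholder samples and the name `split_train_test`.

```python
import random


def split_train_test(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of the samples and cut it into train and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)      # assumed: random, reproducible
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder samples standing in for (input, label) pairs of each material.
foam = [("foam", i) for i in range(50)]
cardboard = [("cardboard", i) for i in range(50)]

# Pattern (a) is the mixture of the two single-material datasets.
patterns = {"a": foam + cardboard, "b": foam, "c": cardboard}
splits = {name: split_train_test(data) for name, data in patterns.items()}
for name, (train, test) in splits.items():
    print(name, len(train), len(test))
```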

In addition, we conducted another experiment to determine the effect of direct heat transfer between the finger and object. When a human touches an object, two factors change the surface temperature: frictional heat and heat transfer. However, the temperature change owing to heat transfer is primarily determined by the finger contact time, which is independent of the frictional force. Thus, we conducted the following experiment to determine whether the accuracy improves when the heat transfer effect is removed. Subjects wore a glove made of cotton and polyester to touch the object instead of touching it with their bare fingers. Wearing a glove creates an insulating layer that reduces the effect of heat transfer between the finger and object. We performed the same experiment as without the glove and compared the results of both experiments.

4 Results and Discussion

The estimation results for all patterns in the skin and glove experiments are shown in Fig. 5. The horizontal axis shows the actual value of the frictional force, and the vertical axis shows the estimated value. Thus, the closer a point lies to the 45\(^{\circ }\) diagonal line, the more accurately it was estimated. Table 1 lists the root mean square error (RMSE).
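For reference, the RMSE between actual and estimated forces can be computed as in the following sketch. The numbers are made up for illustration and are not taken from the experiments.

```python
import math


def rmse(actual, estimated):
    """Root mean square error between two equally long sequences."""
    if len(actual) != len(estimated) or not actual:
        raise ValueError("inputs must be non-empty and equally long")
    squared = [(e - a) ** 2 for a, e in zip(actual, estimated)]
    return math.sqrt(sum(squared) / len(squared))


actual = [1.0, 2.0, 3.0, 4.0]       # illustrative forces in newtons
estimated = [1.0, 2.0, 3.0, 2.0]    # last point is underestimated by 2 N
print(rmse(actual, estimated))      # 1.0
```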

Compared with the results without gloves, the accuracy improved when wearing gloves for all patterns except (c), in which the accuracy deteriorated. This improvement may therefore be due to the reduced heat transfer when wearing gloves. Possible reasons for the deterioration in (c) are the small friction coefficient between the cardboard and the glove and a temperature change from frictional heat that may not have been sufficiently large. Increasing the segment length may solve this. Because heat transfer is slower than finger motion, further improvements in accuracy are expected from extending the segment length and observing the heat transfer more carefully.

In our experiment, the method is limited to a fixed path, but the same method may be applied to a path of arbitrary shape. For example, if we track the position of a finger and crop a thermal image around it, the subsequent procedure is the same as that of the present method. Furthermore, by modifying the model to simultaneously estimate the dynamic friction coefficient, the frictional and normal forces can be estimated, and the spatial pressure distribution is obtainable.
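As a brief worked relation, assuming the standard Coulomb model of dynamic friction (an assumption; the friction law is not stated above), the frictional force \(F_f\), the normal force \(F_n\), and the dynamic friction coefficient \(\mu_k\) are related by

\[ F_f = \mu_k F_n \quad\Longrightarrow\quad F_n = \frac{F_f}{\mu_k}, \]

so an estimate of \(F_f\) together with an estimate of \(\mu_k\) would determine \(F_n\) at each location. The symbols \(F_f\), \(F_n\), and \(\mu_k\) are introduced here only for illustration.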

In both experiments, the results of pattern (a) are as accurate as those of patterns (b) and (c), which used only one material each. This result suggests that the proposed method generalizes across the touched materials.

Points with lower estimation accuracy, that is, those farther from the diagonal line, are primarily distributed below the line, which means that the estimates are smaller than the actual values. This result is attributable to the biased distribution of the label data used for training. Every dataset contains more data for smaller forces than for larger ones; therefore, the model tends to underestimate the force. To address this problem, we may improve our method in various ways, such as performing data augmentation to even out the distribution of the data or weighting the loss function to equalize the influence of the data.
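One possible form of such a weighted loss is sketched below, using inverse-frequency weights per force bin. This is an illustrative choice rather than a method evaluated here; the bin width of 1 N and the function names are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels, bin_width=1.0):
    """Weight each sample by 1 / (size of its force bin), scaled to mean 1."""
    bins = [int(label // bin_width) for label in labels]
    counts = Counter(bins)
    raw = [1.0 / counts[b] for b in bins]
    scale = len(raw) / sum(raw)                 # make the weights average to 1
    return [w * scale for w in raw]


def weighted_mse(actual, estimated, weights):
    """Mean squared error in which rare force ranges count more."""
    total = sum(w * (e - a) ** 2 for a, e, w in zip(actual, estimated, weights))
    return total / sum(weights)


labels = [0.2, 0.4, 0.6, 0.8, 3.5]    # four small forces, one large force
weights = inverse_frequency_weights(labels)
print(weights)                        # [0.625, 0.625, 0.625, 0.625, 2.5]
```

With these weights, the single large-force sample contributes as much to the loss as the four small-force samples combined.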

Fig. 5. Estimation results of the frictional force estimated from thermal images using machine learning: (a) polystyrene foam and cardboard, (b) polystyrene foam, and (c) cardboard.

Table 1. RMSEs of each pattern in the skin and glove experiments.

5 Conclusion

This paper proposes a method for estimating the frictional force between a finger and an object using the temperature change of the surface when a human strokes it. In an experiment with data from six subjects touching two types of materials, the method estimated the frictional force with an error of less than 10% of the data range. Because the number of materials used in the experiment is insufficient for general applications, more objects with varying heat capacities must be tested. Nevertheless, as a basic test of the proposed method, the experiment demonstrated that frictional force can be estimated from thermal images.

In addition, we confirmed that the estimation accuracy improved in many cases by reducing the effect of direct heat transfer between the finger and object.

These results show the possibility of inputting information through finer force changes, contrasting the classification of force strength demonstrated by a previous study. In addition, we showed the possibility of measuring tactile information in locations where installing tactile sensors is challenging.