Introduction

Neural decoding based on electromyography (EMG) signals has attracted many researchers [1]. It is a technology that translates the bioelectrical signals of muscles into corresponding instructions [2]. Compared with other human–computer interaction modes, neural decoding is more convenient and less constrained by the surrounding environment, giving it tremendous development potential in the medical, entertainment and military fields [3, 4].

A large body of literature addresses the neural decoding of gestures. Naik et al. [5] combined independent component analysis with Icasso clustering to extract features from surface electromyography (sEMG), then classified gestures by linear discriminant analysis (LDA). Lima et al. investigated relevance vector machines and fractal dimension to identify seven gestures [6]. In addition, the convolutional neural network (CNN), building on its success in computer vision, has been applied to neural decoding [7,8,9,10,11]. Wei et al. [12] combined information detected from electrodes in different ways and fed it to a multi-stream CNN framework. Building on this work, Hu et al. [13] incorporated time-series information with a recurrent neural network, which improved recognition accuracy. Allard et al. [14] and Zhai et al. [15] proposed computing feature matrices from time–frequency domain information and classifying gestures with CNN models.

However, all these methods recognize gestures under a fixed force level; the combination of data preprocessing and classifiers has not been examined under varying force levels, which means strength information is neglected. Considering this factor, force myography was used to recognize sixteen gestures at three force levels from nine subjects [16]. Unlike force sensing resistor (FSR) signals, sEMG signals cannot present force-related information directly. Jiang et al. [17] upgraded their hardware with inertial measurement units (IMU) and sEMG sensors and analyzed both surface gestures and air gestures, but only medium and low force levels were considered; the effect of a high force level was not investigated. Moreover, the variation of EMG signals under different force levels may degrade gesture recognition performance. To eliminate the influence of force, Al-Timemy et al. [18] adopted an energy-based feature set for six gestures at three force levels. Although this method obtained good results, it ignored the usefulness of force information in real life.

Intuitively, gesture recognition is not an isolated problem. When a gesture is performed, the subject applies different force levels according to the needs of the environment. As discussed in [19,20,21], these levels can also be decoded from EMG signals. This naturally motivates us to explore a multi-task learning (MTL) framework to decode gestures and force levels from sEMG signals [22].

MTL is a method that learns multiple related tasks simultaneously [23]. By sharing feature representations between tasks, it generalizes better than learning each task independently [24]. The introduced inductive bias also reduces the risk of over-fitting. However, most prior works either treat all tasks as equally important or tune the task weights by greedy search, which usually cannot find the optimal model parameters [25]. To overcome this problem, pseudo-task augmentation (PTA) is employed to influence the learning dynamics. As a complementary technique, it has been validated to yield performance gains for both single-task learning (STL) and MTL [26].

The main contributions of this paper can be summarized as follows:

1. Different from datasets that collect various gestures at a constant force, a dataset containing eight gestures at three force levels is provided. On this basis, combinations of data preprocessing and classifiers are compared to find appropriate methods for gesture recognition.

2. Different from most works, which address a single task of gesture recognition or force estimation, neural decoding is formulated here as an MTL problem. The feasibility of decoding gestures and force levels synchronously from sEMG signals is explored, which boosts gesture recognition performance and provides additional force information.

3. To move beyond treating all tasks as equally important, a PTA strategy is adopted. Different from the method proposed in [26], a PTA strategy with weight coefficients is introduced, which accounts for the relationship between tasks and demonstrates efficient performance in the experiments.

The remainder of the paper is organized as follows. The datasets and materials are discussed in Sect. 2. The proposed multi-task CNN models associated with PTA are presented in Sect. 3. Experiments and results on the two datasets are reported in Sect. 4. Finally, conclusions are summarized in Sect. 5.

Datasets and materials

In this paper, two datasets are used in the experiments: one from amputees and the other from subjects with healthy limbs.

Amputees dataset

In [18], nine amputees performing six gestures at three force levels participated in the experiment. sEMG signals were sampled at 2000 Hz. The six gestures are thumb flexion, index flexion, fine pinch, tripod grip, hook grip and spherical grip. Each gesture was performed for 5–9 trials, and each trial lasted 2.5–20 s depending on the amputee. The force levels are high, medium and low. Following the protocol, the first eight EMG electrode channels are used in the experiments. To address sample imbalance, this paper first sorts the trials of each gesture by file size and then selects the five largest for the experiments. The first, third and fourth are used as the training set; the second and fifth are used as the validation and testing sets.
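A minimal sketch of this trial-selection scheme is given below (Python). The directory layout, the file pattern, and the assumption that "first, third and fourth" refer to size order are all illustrative, not taken from the source.

```python
import os
from glob import glob

def split_trials(gesture_dir, pattern="*.mat"):
    """Select the five largest trial files of one gesture and split them
    into training / validation / testing sets as described above."""
    trials = sorted(glob(os.path.join(gesture_dir, pattern)),
                    key=os.path.getsize, reverse=True)[:5]
    train = [trials[0], trials[2], trials[3]]   # 1st, 3rd, 4th largest
    val = [trials[1]]                           # 2nd largest
    test = [trials[4]]                          # 5th largest
    return train, val, test
```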

Healthy subjects dataset

For the experiments on subjects with healthy limbs, a wearable device was developed to collect the sEMG signals (see Fig. 1). The device consists of 16 acquisition modules, each containing a pair of electrodes with a vertical spacing of 10 mm. All 16 sEMG signals were amplified with a gain of 960 and band-pass filtered between 20 and 500 Hz. The signals were sampled at 1000 Hz with an analog-to-digital converter, yielding an 8-bit digital signal per module. For comparison, the signals from the odd-numbered acquisition modules are used in the experiments.

Fig. 1 sEMG signal collection device

As for the collection protocol, seven subjects aged 25 to 30 volunteered, including two females and five males. During collection, a force sensor was first used to measure the maximum voluntary contraction (MVC) of each subject. Then, the subjects were asked to perform eight gestures at three force levels, corresponding to 20%, 40% and 60% of MVC, respectively. These gestures include palm press, thumb press, three-finger grasp, grasp, pinch, fist press and key pinch (see Fig. 2), all selected from commonly used gestures that involve strength, such as grasping or pressing a surface. For each gesture, five trials were collected and each trial lasted five seconds. To mitigate muscle fatigue, there was a rest of several seconds between consecutive trials. The sEMG signals detected from the selected eight channels are shown in Fig. 3.

Fig. 2 Eight gestures with variable force levels

Fig. 3 sEMG signals from eight channels

MTL and PTA strategies

Data preprocessing and CNN models

Given the raw sEMG signals, we divide them into small segments with a sliding window. Considering the large amount of data required by the CNN model, an overlapped windowing scheme is used. Based on past experiments, the window length should be shorter than 300 ms so that the delay is imperceptible to subjects in real-life applications [27, 28]. In this work, it is set to 200 ms for both datasets, with an overlap of 140 ms.
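A minimal sketch of this windowing step (Python/NumPy; the array layout is an assumption):

```python
import numpy as np

def segment(emg, fs, win_ms=200, overlap_ms=140):
    """Split a (channels, samples) sEMG recording into overlapped windows.

    A 200 ms window with 140 ms overlap gives a 60 ms stride,
    i.e. 120 samples at 2000 Hz or 60 samples at 1000 Hz."""
    win = int(fs * win_ms / 1000)
    step = win - int(fs * overlap_ms / 1000)
    n = (emg.shape[1] - win) // step + 1
    return np.stack([emg[:, i * step:i * step + win] for i in range(n)])
```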

Before being input to the CNN, the segmented data of each electrode are transformed into frequency-domain information by the Fast Fourier Transform. Since the majority of sEMG energy lies between 0 and 500 Hz, the first 100 spectral bins are used as input for the amputee dataset [14, 29], and the first frequency band is removed to reduce baseline drift and motion artifacts. For the healthy-subject dataset, the first five frequency bands are removed owing to the hardware filters.
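This preprocessing could look as follows (a sketch; the exact bin-selection convention is an assumption consistent with the numbers above):

```python
import numpy as np

def spectrum_features(seg, drop_bins=1, keep_bins=100):
    """Per-channel magnitude spectrum of a (channels, samples) segment.

    For the 2000 Hz amputee data a 200 ms window has 400 samples, so the
    rfft bins are 5 Hz wide and the first 100 bins span 0-500 Hz.
    drop_bins=1 removes the lowest band (baseline drift, motion
    artifact); the healthy-subject data would use drop_bins=5."""
    spec = np.abs(np.fft.rfft(seg, axis=1))
    return spec[:, drop_bins:keep_bins]
```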

To make full use of the information in the electrode channels, a multi-stream CNN model is designed, as shown in Fig. 4. The spectrum of each electrode serves as the input of one stream. Each stream contains three blocks, each consisting of a batch normalization (BN) layer, a convolutional layer and a max-pooling layer. The convolutional layer has 32 kernels of size 1×3, and the max-pooling layer has size 1×2 with a stride of 2. By stacking blocks, the CNN extracts features through a hierarchy of spectral abstractions. All streams converge into fully connected (FC) layers for classification. There are three FC layers; the first two have 512 and 256 nodes, respectively, with a dropout probability of 0.5, while the last FC layer consists of two parts connected independently to the second FC layer, matching the numbers of gestures and force levels.

Fig. 4 Multi-stream CNN structure
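As a concrete reference, a minimal PyTorch sketch of this architecture is given below. The layer sizes follow the text; the ReLU activations, padding, and the flattened feature size are assumptions, since the text does not specify them.

```python
import torch
import torch.nn as nn

class MultiStreamCNN(nn.Module):
    """Sketch of the multi-stream CNN of Fig. 4 (defaults match the
    amputee dataset: 8 channels, 6 gestures, 3 force levels)."""

    def __init__(self, n_streams=8, n_bins=100, n_gestures=6, n_forces=3):
        super().__init__()
        def block(in_ch):
            return nn.Sequential(
                nn.BatchNorm2d(in_ch),
                nn.Conv2d(in_ch, 32, kernel_size=(1, 3), padding=(0, 1)),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=(1, 2), stride=(1, 2)))
        self.streams = nn.ModuleList([
            nn.Sequential(block(1), block(32), block(32))
            for _ in range(n_streams)])
        feat = 32 * (n_bins // 8) * n_streams  # three 1x2 poolings: 100 -> 12
        self.fc = nn.Sequential(
            nn.Linear(feat, 512), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(512, 256), nn.ReLU(), nn.Dropout(0.5))
        self.head_g = nn.Linear(256, n_gestures)   # gesture scores y^pg
        self.head_f = nn.Linear(256, n_forces)     # force-level scores y^pf

    def forward(self, x):                          # x: (B, n_streams, 1, n_bins)
        feats = [s(x[:, i:i + 1]) for i, s in enumerate(self.streams)]
        z = torch.cat([f.flatten(1) for f in feats], dim=1)
        z = self.fc(z)
        return self.head_g(z), self.head_f(z)
```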

MTL framework

As described above, the two tasks are handled simultaneously by the last FC layer of the model, which means they share the same features, as most MTL frameworks do. The details of the MTL framework are introduced below.

Consider a training set with \(T\) samples \(D = \{ s_{i}, {\mathbf{y}}_{i} \}_{i = 1}^{T}\), where \(s_{i}\) is the \(i\)th signal segment and \({\mathbf{y}}_{i}\) denotes the corresponding labels, made up of a gesture label \({\mathbf{y}}_{i}^{\text{g}}\) and a force-level label \({\mathbf{y}}_{i}^{\text{f}}\). For clarity, the index \(i\) is omitted below. The shared feature vector \({\mathbf{x}} \in {\mathbb{R}}^{C \times 1}\) produced by the last max-pooling layer can be formulated as:

$$ {\mathbf{x}} = f(s;\; {\mathbf{k}}_{c}, {\mathbf{b}}_{c}, {\boldsymbol{\gamma}}, {\boldsymbol{\beta}}) $$
(1)

where \(f\) denotes the non-linear mapping from the input signal to the features, \({\mathbf{k}}_{c}\) and \({\mathbf{b}}_{c}\) are the kernels and bias vectors of the convolutional layers, and \({\boldsymbol{\gamma}}\) and \({\boldsymbol{\beta}}\) are the sets of scales and shifts of the BN layers.

After feature extraction by the last pooling layer, three FC layers are employed for classification. Suppose \({\mathbf{W}}_{i} \in {\mathbb{R}}^{D_{i-1} \times D_{i}}\) and \({\mathbf{b}}_{i} \in {\mathbb{R}}^{D_{i} \times 1}\) are the weight matrix and bias vector of the \(i\)th FC layer with \(D_{i}\) outputs (\(D_{0} = C\)); then the prediction score \({\mathbf{y}}_{i}^{p}\) of the \(i\)th FC layer is:

$$ {\mathbf{y}}_{i}^{p} = {\mathbf{W}}_{i}^{T} {\mathbf{y}}_{i - 1}^{p} + {\mathbf{b}}_{i} $$
(2)

where \(i = 1, 2, 3\) and \({\mathbf{y}}_{0}^{p} = {\mathbf{x}}\). Specifically, the third FC layer covers both gestures and force levels. Let \({\mathbf{W}}_{3\text{g}}\), \({\mathbf{W}}_{3\text{f}}\), \({\mathbf{b}}_{3\text{g}}\), \({\mathbf{b}}_{3\text{f}}\) denote the weight matrices and bias vectors of its two parts; the outputs \({\mathbf{y}}^{\text{pg}}\) and \({\mathbf{y}}^{\text{pf}}\) can be represented as:

$$ {\mathbf{y}}^{{{\text{pg}}}} = {\mathbf{W}}_{{3{\text{g}}}}^{T} {\mathbf{x}} + {\mathbf{b}}_{{3{\text{g}}}} $$
(3)
$$ {\mathbf{y}}^{{{\text{pf}}}} = {\mathbf{W}}_{{3{\text{f}}}}^{T} {\mathbf{x}} + {\mathbf{b}}_{{3{\text{f}}}} $$
(4)

The probabilities of \({\mathbf{x}}\) belonging to each gesture (\({\hat{\mathbf{y}}}^{\text{pg}}\)) and each force level (\({\hat{\mathbf{y}}}^{\text{pf}}\)) are calculated by feeding \({\mathbf{y}}^{\text{pg}}\) and \({\mathbf{y}}^{\text{pf}}\) into softmax functions.

$$ {\text{softmax}}({\mathbf{y}}^{{{\text{pg}}}} )_{m} = p(\hat{y}^{{{\text{pg}}}} = m|{\mathbf{x}}) = \frac{{\exp (y_{m}^{{{\text{pg}}}} )}}{{\sum\nolimits_{i} {\exp (y_{i}^{{{\text{pg}}}} )} }}, $$
(5)
$$ {\text{softmax}}({\mathbf{y}}^{{{\text{pf}}}} )_{n} = p(\hat{y}^{{{\text{pf}}}} = n|{\mathbf{x}}) = \frac{{\exp (y_{n}^{{{\text{pf}}}} )}}{{\sum\nolimits_{j} {\exp (y_{j}^{{{\text{pf}}}} )} }}, $$
(6)

where \(y_{i}^{{{\text{pg}}}}\) and \(y_{j}^{{{\text{pf}}}}\) are the \(i\)th element in \({\mathbf{y}}^{{{\text{pg}}}}\) and the \(j\)th element in \({\mathbf{y}}^{{{\text{pf}}}}\). The softmax function converts the output \({\mathbf{y}}^{{{\text{pg}}}}\) and \({\mathbf{y}}^{{{\text{pf}}}}\) into a probability distribution over respective labels. Finally, the predicted gesture \(\hat{y}^{{{\text{pg}}}}\) and force level \(\hat{y}^{{{\text{pf}}}}\) are obtained via:

$$ \hat{y}^{{{\text{pg}}}} = \mathop {\text{argmax}}\limits_{m} \;{\text{softmax}}({\mathbf{y}}^{{{\text{pg}}}} )_{m} , $$
(7)
$$ \hat{y}^{{{\text{pf}}}} = \mathop {\text{argmax}}\limits_{n} \;{\text{softmax}}({\mathbf{y}}^{{{\text{pf}}}} )_{n} . $$
(8)
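In code, Eqs. (5)–(8) amount to a softmax over each head followed by an argmax. A sketch, assuming the `MultiStreamCNN` model outlined earlier:

```python
import torch.nn.functional as F

logits_g, logits_f = model(x)        # y^pg and y^pf from the two heads
p_g = F.softmax(logits_g, dim=1)     # Eq. (5)
p_f = F.softmax(logits_f, dim=1)     # Eq. (6)
pred_gesture = p_g.argmax(dim=1)     # Eq. (7)
pred_force = p_f.argmax(dim=1)       # Eq. (8)
```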

The cross-entropy losses are employed:

$$ L_{\text{g}} = - \sum\limits_{m = 1}^{M} y_{m}^{\text{g}} \log p(\hat{y}^{\text{pg}} = m \mid {\mathbf{x}}, {\mathbf{W}}_{1}, {\mathbf{b}}_{1}, {\mathbf{W}}_{2}, {\mathbf{b}}_{2}, {\mathbf{W}}_{3\text{g}}, {\mathbf{b}}_{3\text{g}}) $$
(9)
$$ L_{\text{f}} = - \sum\limits_{n = 1}^{N} y_{n}^{\text{f}} \log p(\hat{y}^{\text{pf}} = n \mid {\mathbf{x}}, {\mathbf{W}}_{1}, {\mathbf{b}}_{1}, {\mathbf{W}}_{2}, {\mathbf{b}}_{2}, {\mathbf{W}}_{3\text{f}}, {\mathbf{b}}_{3\text{f}}) $$
(10)

where \(M\) and \(N\) are the numbers of gestures and force levels, respectively, and \(y_{m}^{\text{g}}\), \(y_{n}^{\text{f}}\) are the components of the one-hot labels \({\mathbf{y}}^{\text{g}}\) and \({\mathbf{y}}^{\text{f}}\). Letting \(\Theta\) denote the parameters of the whole model, the loss function, which in contrast to STL consists of two parts, is as follows:

$$ \mathop {\min }\limits_{\Theta } (L_{{\text{g}}} + \alpha L_{{\text{f}}} ) $$
(11)

where \(\alpha\) represents the importance of the auxiliary task. In practice, the force-level prediction is valuable only when the gesture is recognized correctly, so \(\alpha\) ranges from 0 to 1.
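Eqs. (9)–(11) correspond to a weighted sum of two cross-entropy terms. A minimal sketch (note that `F.cross_entropy` applies log-softmax internally, so it takes the raw scores of Eqs. (3)–(4)):

```python
import torch.nn.functional as F

def mtl_loss(logits_g, logits_f, y_g, y_f, alpha=0.4):
    """Joint objective of Eq. (11): gesture loss plus the force-level
    loss weighted by alpha in (0, 1]."""
    loss_g = F.cross_entropy(logits_g, y_g)   # Eq. (9)
    loss_f = F.cross_entropy(logits_f, y_f)   # Eq. (10)
    return loss_g + alpha * loss_f
```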

PTA strategy

The PTA strategy adopts from MTL the idea of training related tasks drawn from the same feature space [30]. If the last layer of the MTL framework is regarded as a decoder for each task, PTA maintains a number of distinct decoders within each task. It has been shown that the PTA strategy has a fundamental effect on the learning dynamics, leading to further improvements in both STL and MTL. For MTL with \(T\) tasks and \(D\) decoders per task, the PTA learning problem can be expressed as follows:

$$ \Theta^{*} = \mathop {{\text{argmin}}}\limits_{\Theta} \frac{1}{TD}\sum\limits_{t = 1}^{T} \sum\limits_{d = 1}^{D} L(y^{t}, \hat{y}^{td}) $$
(12)

where \(y^{t}\) is the true label of the \(t\)th task, and \(\hat{y}^{{{\text{td}}}}\) is the predicted score of the \(d\)th decoder in the \(t\)th task.

However, this method gives equal importance to all tasks by default, which is usually unreasonable in MTL. For example, only when a gesture is predicted correctly is the prediction of its force level meaningful. Therefore, an approximate range for the weight is first determined through grid search, and the learning dynamics are then further shaped by the PTA strategy. The modified PTA strategy during training is as follows:

$$ \Theta_{{{\text{tr}}}} = \mathop {{\text{argmin}}}\limits_{\Theta } \frac{1}{D}\sum\limits_{d = 1}^{D} {\{ L(y^{{\text{g}}} ,\hat{y}^{{{\text{gd}}}} )} + \alpha \, \cdot \,L(y^{{\text{f}}} ,\hat{y}^{{{\text{fd}}}} )\} $$
(13)

where \(\hat{y}^{{{\text{gd}}}}\) and \(\hat{y}^{{{\text{fd}}}}\) are prediction scores of the \(d\)th decoder for the gestures and force levels, respectively. For validation, the best performing decoder for each task is selected as follows:

$$ \Theta_{{{\text{eval}}}} = \mathop {{\text{argmin}}}\limits_{\Theta } \{ L(y^{{\text{g}}} ,\hat{y}^{{{\text{g}}d_{1} }} ) + \alpha L(y^{{\text{f}}} ,\hat{y}^{{{\text{f}}d_{2} }} )\} $$
(14)

where \(d_{1}, d_{2} \in [1, D]\). Because the two parts of the loss function are independent, the weight parameter \(\alpha\) does not influence the selection, so it is omitted in the experiments. To distinguish it from the original PTA strategy, the algorithm in this paper is referred to as weighted PTA (WPTA). The specific implementation of WPTA is shown in Algorithm 1.

Algorithm 1 The WPTA strategy

In this way, the parameters of each decoder are initialized independently so that the altered learning dynamics take effect. Furthermore, by updating \(\Theta\) in every iteration while freezing all decoders except the first one of each task, an optimal model can still be learned for each task. Finally, for validation and testing, the best-performing decoder of each task is selected, which also improves computational efficiency.
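A minimal PyTorch sketch of the WPTA objectives under the notation above. The per-iteration decoder-freezing schedule of Algorithm 1 is omitted, and the shared feature `z` (taken here as the output of the second FC layer) and the helper names are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WPTAHeads(nn.Module):
    """D independently initialized decoders (pseudo-tasks) per task."""
    def __init__(self, feat=256, n_gestures=6, n_forces=3, D=4):
        super().__init__()
        self.dec_g = nn.ModuleList(nn.Linear(feat, n_gestures) for _ in range(D))
        self.dec_f = nn.ModuleList(nn.Linear(feat, n_forces) for _ in range(D))

def wpta_train_loss(z, heads, y_g, y_f, alpha=0.4):
    """Training objective of Eq. (13): per-task average over decoders,
    with the force task weighted by alpha."""
    loss_g = torch.stack([F.cross_entropy(d(z), y_g) for d in heads.dec_g]).mean()
    loss_f = torch.stack([F.cross_entropy(d(z), y_f) for d in heads.dec_f]).mean()
    return loss_g + alpha * loss_f

def select_best_decoders(z_val, heads, y_g, y_f):
    """Eq. (14): pick the lowest-loss decoder per task on validation data;
    alpha is irrelevant here because the two choices are independent."""
    d1 = min(heads.dec_g, key=lambda d: F.cross_entropy(d(z_val), y_g).item())
    d2 = min(heads.dec_f, key=lambda d: F.cross_entropy(d(z_val), y_f).item())
    return d1, d2
```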

Experiments and analysis

In this section, the proposed methods are evaluated on the amputee dataset and the healthy-subject dataset, respectively. On each dataset, a series of experiments is conducted: (1) gesture recognition with different methods; (2) multi-task recognition of gestures and force levels; (3) MTL with the WPTA strategy. All experiments are run three times and the average performance is reported below.

Recognition results of amputees

Gesture recognition with different methods

Gesture recognition under different force levels is an important topic. Building on previous experience, data preprocessing and classification algorithms are examined first. To verify the rationality of the proposed method, the methods in [12, 17] are used for comparison. The former applies downsampling and a low-pass Butterworth filter for amplitude estimation and then feeds the processed data to a multi-stream CNN, providing a strong contrast to the proposed method. The latter addresses a task similar to this paper's: it first extracts features including mean absolute value, zero crossings, slope sign changes and waveform length from the raw signal, and then adopts LDA to recognize gestures at two force levels. For the amputee dataset, the results of the nine amputees under the three methods are shown in Fig. 5.

Fig. 5 Results of different methods

It shows that the CNN model based on frequency-domain information achieves the best results (about 5.57% higher than the CNN with time-domain information, and 2.14% higher than the traditional method). This indicates that, compared with the time-domain information used in [12], frequency-domain information is more sensitive to variations in force. Moreover, unlike the traditional method, the CNN model can extract implicit information from the data more effectively, which verifies the effectiveness of the proposed method.

Multi-task recognition for gestures and force levels

Based on the multi-stream CNN with spectra as inputs, the MTL framework described in Sect. 3 is employed. Since the gesture recognition task is more important than the force-level recognition task, the weight coefficient \(\alpha\) is searched from 0.0 to 1.0 with an interval of 0.1. The results of both gesture and force-level prediction are shown in Tables 1 and 2.

Table 1 Gesture recognition results with different weight coefficients (%)
Table 2 Force-level recognition results with different weight coefficients (%)
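The grid search over \(\alpha\) is simply an outer training loop. A sketch, where `train_and_evaluate` is a hypothetical helper that trains the MTL model with the given weight and returns both validation accuracies:

```python
results = {}
for k in range(11):                                 # alpha = 0.0, 0.1, ..., 1.0
    alpha = round(0.1 * k, 1)
    acc_g, acc_f = train_and_evaluate(alpha)        # hypothetical helper
    results[alpha] = (acc_g, acc_f)
best_alpha = max(results, key=lambda a: results[a][0])  # by gesture accuracy
```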

Notably, \(\alpha = 0.0\) means that only the gesture task is performed, without force-level prediction, so no force result is reported. Table 1 shows that, compared with single-task gesture recognition, MTL effectively improves gesture accuracy. This suggests that gestures and force levels share a uniform feature space, verifying the feasibility of classifying both tasks within one MTL framework. Specifically, features learned for the force-level recognition task may be helpful for the gesture recognition task, and vice versa, so both tasks perform better under MTL. Besides, this method provides additional force information, making it more practical in real life. On this basis, MTL with grid search further improves gesture recognition accuracy, which again indicates a correlation between the two tasks and shows that variable force levels affect gesture recognition.

Multi-task recognition with the WPTA method

Table 1 shows that the gesture recognition accuracy is best when \(\alpha\) is between 0.3 and 0.6, so the weight coefficients in Eqs. 13 and 14 are set to 0.3, 0.4, 0.5 and 0.6, separately. For comparison with the original PTA method [26], training with Eq. 12 is also performed, denoted as "No" on the \(\alpha\) axis in Fig. 6.

Fig. 6 Results of the WPTA method

The green plane denotes the best result obtained by the MTL method with grid search (\(\alpha\) = 0.3, one decoder). With MTL and WPTA, improvements are achieved in both tasks, demonstrating the superiority of the proposed method. This is attributed to WPTA adaptively changing the learning dynamics, which can be regarded as fine-tuning the importance of the auxiliary task, thereby overcoming the limitation of a fixed weight coefficient shared by all subjects in grid search. Increasing the number of decoders improves the results further, in line with the conclusion of [26]. Theoretically, the original PTA method might achieve the same results with enough decoders, but at the cost of wasted computational resources.

Recognition results of healthy subjects

For the healthy-subject dataset, experiments under the different settings were carried out following the same procedure. For further analysis, experiments on segmented data with a length of 150 ms were also conducted. The best gesture prediction performance and the corresponding force-level prediction scores are shown in Table 3.

Table 3 Results of healthy subjects (%)

For MTL and MTL with WPTA in Table 3, \(\alpha\) is 0.3 for the 150 ms segment length and 0.4 for 200 ms. The proposed algorithm still shows clear advantages over the other methods. Since the spectra of 200 ms segments contain more detailed frequency-domain information, they yield better performance than 150 ms segments.

In particular, a comparison between STL and MTL with WPTA for each subject, based on frequency-domain information, is given in Table 4. For most subjects the proposed method yields favorable results regardless of the time window size. The reason is that with MTL, features learned for the force-level task improve the main task, and the additional loss also acts as a regularizer that prevents over-fitting. Besides, WPTA overcomes the drawback of equal task importance and yields a reasonable distribution of importance weights. Rather than one decoder per task, WPTA offers choices among multiple decoders, further improving performance. In addition, because physiological conditions differ across subjects, the degree of improvement also varies: stronger subjects resist muscle fatigue better during collection, resulting in more stable EMG signals at each force level, whereas for others the EMG signals are affected by muscle fatigue and the proposed method brings little improvement.

Table 4 Comparison of STL and MTL with WPTA (%)

In addition, compared with the amputee results, the healthy subjects perform better even though more gestures are classified under the same electrode-channel condition, and the accuracy differences among methods are smaller. This is likely because, without damage to the forearm, the sEMG signals of healthy subjects are more regular.

Conclusions and future work

Gesture recognition based on sEMG signals has been investigated extensively in recent years, yet neural decoding under variable force levels remains a difficult problem. This paper first explores combinations of data preprocessing and classifiers, showing that frequency-domain information is more sensitive to strength. Considering the importance of force information in real life, an MTL framework is leveraged to decode gestures and force levels simultaneously, and the experimental results validate its efficiency. Since grid search is too coarse and may lead to locally optimal solutions, the PTA strategy is then considered; however, on its own it cannot improve performance because all tasks are assigned the same weight by default. By combining the two methods, the WPTA technique is applied, which boosts the performance of all tasks in MTL. Notably, this form of optimization is suitable not only for sEMG signal decoding but also for other MTL tasks.

In future work, we will continue to study adaptive weighting methods as a replacement for grid search.