Abstract
In the field of high-end manufacturing, it is valuable to study few-shot health condition estimation. Although transfer learning and other methods have effectively improved the ability of few-shot learning, they still cannot compensate for the lack of prior knowledge. In this paper, by combining data enhancement, knowledge reasoning, and transfer learning, a generative knowledge-based transfer learning model is proposed to achieve few-shot health condition estimation. First, given the proven effectiveness of data enhancement in machine learning, a novel batch monotonic generative adversarial network (BMGAN) is designed for few-shot health condition data generation, which alleviates the problem of insufficient data by generating simulated training data. Second, a generative knowledge-based transfer learning model is proposed that exploits the performance advantages of the belief rule base (BRB) method in few-shot learning: it combines expert knowledge and simulated training data to obtain a generalized BRB model and then fine-tunes the generalized model with real data to obtain a dedicated BRB model. Third, by uniformly sampling NASA lithium battery data to simulate few-shot conditions, the generative transfer-belief rule base (GTBRB) method proposed in this paper is verified to be feasible for few-shot health condition estimation and improves the estimation accuracy of the BRB method by approximately 17.3%.
Introduction
In recent years, data-driven methods such as deep learning have made remarkable achievements in many fields [1]. However, it is worth noting that, apart from image and natural language processing, which enjoy diverse Internet resources, relatively few research fields can truly collect a large quantity of data [2]. Especially in high-end manufacturing, such as aerospace, the cost of health condition data collection is very high. Therefore, it is valuable to realize few-shot health condition estimation for such complex systems [3]. Many machine learning methods require massive quantities of labeled data for training to improve their performance [4]. Therefore, in the field of health condition estimation, research methods such as neuro-adaptive networks [5], support vector machines [6], Gaussian process regression [7], and hidden Markov models [8], which are all data-driven methods in a broad sense, are difficult to apply directly to the few-shot learning settings that are widespread in daily conditions [9]. To the best of our knowledge, there are currently no studies specifically aimed at few-shot health condition estimation, although scholars have applied few-shot learning theory in the fields of fault diagnosis [10], image classification [11], target detection [12], etc. Therefore, studying few-shot health condition estimation has both theoretical and practical significance.
Few-shot learning theory is a learning paradigm proposed to address the insufficient information of few-shot (also called small-sample) datasets. It studies how to train an intelligent and effective machine recognition model from a small number of training samples (usually dozens, a single sample, or even zero samples) [13]. It is generally believed that few-shot learning methods can be divided into the following categories. (1) Data enhancement methods improve the performance of few-shot learning by expanding the training samples, but this kind of method does not fundamentally solve the problem of few-shot learning [14]. (2) Metric learning methods model the distance distribution between samples in an embedded space, making samples of the same class close to each other and samples of different classes far away from each other. However, this kind of method can easily lead to overfitting [15]. (3) Initialization methods train the model in a source domain and then fine-tune it on the target domain to achieve fast iteration and good generalization ability; they include transfer learning, in which the source and target domains are similar [16], and meta-learning, which learns to learn. Among them, meta-learning methods have developed rapidly. For example, Wang et al. proposed a metric-based meta-learning model for small-sample fault diagnosis [17]. Ding et al. proposed meta deep learning for implementing small-sample rotating machinery health prognostics [18]. The essence of these methods is to achieve efficient few-shot learning by acquiring meta-knowledge. However, the current understanding of meta-knowledge is not deep enough, so the generalization ability of related meta-learning methods needs to be improved [19]. Meanwhile, in few-shot learning research, another very effective approach is often overlooked: knowledge reasoning. Compared with abstract meta-knowledge, expert knowledge is very specific and vivid.
Since expert knowledge distills past learning experience, it can effectively improve small-sample learning ability [20].
To further understand few-shot learning, we observe that data enhancement methods start from the data level and improve few-shot learning ability by mining data features and expanding the amount of training data [21]. Metric learning methods start from the sample level and improve small-sample learning ability by understanding the overall distribution of the samples [22]. Initialization methods start from the knowledge level: by learning a source domain related to the target domain, they obtain basic knowledge that improves few-shot learning in the target domain [23]. These approaches attack the few-shot learning problem from three different levels, and any single kind of method has certain drawbacks. Therefore, it is interesting to merge the above methods.
Among them, although the data enhancement method cannot fundamentally solve the problem of few-shot learning, it has been proven to be beneficial to deep machine learning and is very effective for few-shot learning [24]. Therefore, it can serve as an auxiliary method for other few-shot learning methods. Early data enhancement methods applied basic transformations such as translation, rotation, and shearing to existing data to obtain a richer variety of generated data, thereby avoiding overfitting.
In recent years, some new data enhancement methods have emerged, such as generative adversarial networks (GANs) [25], disturbance compensation [26], and feature space enhancement methods [27]. Among them, Goodfellow et al. proposed a dual-network structure that optimizes the generative model through an adversarial process and offers good training efficiency and generation quality. After training, a GAN can fully mine data features and achieve good data enhancement.
In past research on device health condition estimation, the methods used share a precondition: the training data and test data satisfy the same distribution [28], which is one of the foundations of the research. However, this precondition almost never holds in daily conditions, just as no two leaves in the world are exactly the same. Even the working environments of the same kind of device cannot be completely similar, which makes the decline in their health conditions different [29]. The assumption facilitates our research, but we should not completely ignore the differences between the training and test data distributions. Especially for few-shot health condition estimation, the few-shot data contain little information, which may result in a poor model training effect [30]. Therefore, the initialization methods mentioned above, especially transfer learning, can fully learn from a sufficient quantity of source domain data that are similar to the target domain and can achieve accurate estimation in the target domain without satisfying the assumption of independent and identical distribution [31].
The difficulty of few-shot learning is the lack of prior knowledge caused by insufficient data. Therefore, effectively improving the prior knowledge available to learning methods is the key to solving few-shot learning problems [32]. Notably, Tang et al. demonstrated that knowledge reasoning methods such as the belief rule base (BRB) have better few-shot learning capabilities than data-driven methods such as neural networks [20].
Therefore, a novel idea is to combine the data enhancement ability of the GAN and the few-shot learning ability of the BRB to achieve more accurate few-shot health condition estimation. This paper consists of the following parts. A literature review on GANs and BRBs is presented in Sect. “Literature review”. In Sect. “Batch monotonic GAN”, we introduce the basic idea of the generative model and GAN and propose a batch monotonic GAN for few-shot data generation. In Sect. “Generative transfer-belief rule base”, we propose a generative transfer-belief rule base (GTBRB) model and describe its implementation process. In Sect. “Case study”, a few-shot dataset is simulated using NASA lithium battery data to validate the GTBRB with and without auxiliary training data separately. The final section presents the conclusions and future directions.
Literature review
Few-shot data generation is one of the latest research areas of GANs. The local-fusion GAN (LoFGAN) fuses local representations for few-shot image generation [33]. Few-shot GAN (FSGAN) uses component analysis techniques to adapt GANs in few-shot settings (fewer than 100 images) [34]. A matching-based GAN (MatchingGAN), comprising a matching generator and a matching discriminator, has been proposed for few-shot image generation [35]. It can be seen that current GAN-based few-shot data generation methods are concentrated in the image field. Relevant studies have shown that device health condition data generally present an overall monotonic characteristic [36], which is quite different from the differential distribution characteristic of image data. Therefore, few-shot data generation methods from the image field are not suitable for data such as few-shot device health condition data. Accordingly, this paper designs a novel batch monotonic GAN (BMGAN) for few-shot data generation of the device health condition.
Yang et al. proposed the belief rule base (BRB) method based on rule reasoning and data-driven thinking, which realizes an effective combination of expert knowledge and data training and has good nonlinear relationship fitting ability [37]. In recent years, related research on BRBs has mainly focused on methods for automatically generating the initial parameters of large-scale BRBs [38], rule reduction and training methods for extended BRBs [39], and hybrid BRBs for safety assessment with data and knowledge under uncertainty [40]. It is regrettable that there are few studies on applying the BRB method under few-shot conditions, especially on combining the BRB method with transfer learning. Therefore, this paper first uses the few-shot real data to generate a larger quantity of simulated training data through a GAN; second, it combines the simulated training data and expert knowledge to train a generalized BRB model; and third, it uses real data to fine-tune the generalized model into a dedicated BRB model, improving the accuracy of few-shot health condition estimation.
The work of this paper is based on the following foundations. (1) GANs have excellent data generation capabilities, but existing research on few-shot data generation focuses on the image field and is difficult to adapt to the overall monotonic characteristics of the device health condition. (2) The BRB method has good few-shot learning capabilities, but there are few publications about effective transfer learning architectures for knowledge inference methods such as the BRB. (3) Few-shot data are the norm in the actual device operation process, but there are few publications about few-shot health condition estimation.
The main contributions of this paper are as follows:

We propose a batch monotonic GAN model, which solves the problem that traditional GANs can only generate simulated data with the same dimensions as the input data.

We provide a transfer learning architecture for a knowledge-based method such as the BRB and introduce its implementation process with or without auxiliary data.

We analyze and prove that the combination of data enhancement and expert knowledge can effectively solve the problem of fewshot health condition estimation.
Batch monotonic GAN
In this section, the basic idea of the generative model and the GAN is introduced. Then, by analyzing the implementation process of the GAN, it is concluded that a traditional GAN cannot generate data of different dimensions. Therefore, a batch monotonic method is proposed as the loss function of the GAN, which can generate simulated data whose dimensions differ from those of the real data.
The basic idea of the generative model and GAN
The difficulty of few-shot learning is the lack of sample quantity and quality. It is difficult to learn the complete distribution of the data from limited data. The most direct method for addressing the lack of data is to generate simulated data by learning the data distribution and prior knowledge, which is the basic idea of generative models.
In the context of statistics, a generative model refers to a mathematical model that randomly generates a simulated data sequence and a joint probability distribution of the simulated data and annotated result sequence under a specific implicit mapping relationship. In the artificial intelligence field, generative models are used to sample data based on the probability distribution and to generate expanded datasets by learning existing data structures. This process can be defined as follows:
where x is the real data, x' is the generated data, sample represents the sampling operation, L represents the learning function of the probability distribution, and G represents the data generation based on the learned probability distribution. The generative model can be divided into two types according to its function. The first obtains the exact distribution function of the dataset by learning the given data. The second generates new data under the premise of a fuzzy data distribution function. The generative adversarial network used in this paper is the second type of model.
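As a concrete illustration of this sample–learn–generate process, the following minimal sketch fits a coarse histogram model to a small real dataset and then samples new points from it. The histogram model and all function names here are our own illustrative choices, not part of the GAN introduced below:

```python
import random

def learn_distribution(data, n_bins=10):
    """L: learn a coarse probability distribution (a histogram) from real data."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins or 1.0   # guard against a constant dataset
    counts = [0] * n_bins
    for x in data:
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    probs = [c / len(data) for c in counts]
    return lo, width, probs

def generate(model, n, rng):
    """G: draw n new points from the learned distribution."""
    lo, width, probs = model
    bins = rng.choices(range(len(probs)), weights=probs, k=n)
    return [lo + (b + rng.random()) * width for b in bins]

rng = random.Random(0)
real = [rng.gauss(5.0, 1.0) for _ in range(30)]  # few-shot "real" data x
model = learn_distribution(real)                  # learn from the sampled data
fake = generate(model, 200, rng)                  # generated data x'
```

The generated set can be much larger than the real set while staying inside the support learned from it, which is exactly the expansion role generative models play here.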
The generative model has been proven to effectively simulate and deal with high-dimensional distribution problems and has spawned much interesting research in reinforcement learning, semi-supervised learning, etc. Practical applications such as image enhancement and artistic creation are very valuable research content. Early elementary numerical transformation data generation methods essentially reorganize the original data in disorder, are unable to effectively capture sample characteristics, and may even produce very unbelievable results. Later, learning-style generation methods such as autoencoders were applied. These methods tend to produce new data that merely mimic the original data distribution, which means that the underlying data features have not truly been learned, and overfitting often occurs. The GAN brings a new idea to generative models through adversarial learning and has also achieved effective applications in data generation and other fields. The generative adversarial process of the GAN is shown in Eq. 2:

$$\min_{G}\max_{D} V(D,G) = \mathbb{E}_{x\sim p_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z\sim p_{z}(z)}\left[\log\left(1-D(G(z))\right)\right]$$
The GAN draws on the idea of a zero-sum game and constructs a discriminator (D) and a generator (G) simultaneously, where \(\theta\) is the corresponding network parameter set. The cost function of the discriminator is denoted as \(J^{(D)}\), and its calculation method is

$$J^{(D)}\left(\theta^{(D)},\theta^{(G)}\right) = -\mathbb{E}_{x\sim p_{data}}\left[\log D(x)\right] - \mathbb{E}_{z\sim p_{z}}\left[\log\left(1-D(G(z))\right)\right]$$
Since the generator and the discriminator are in a zero-sum adversarial game, their cost functions satisfy

$$J^{(G)} = -J^{(D)}$$
The cost functions of the generator and the discriminator can be combined into a unified value function

$$V\left(\theta^{(D)},\theta^{(G)}\right) = -J^{(D)}\left(\theta^{(D)},\theta^{(G)}\right) = \mathbb{E}_{x\sim p_{data}}\left[\log D(x)\right] + \mathbb{E}_{z\sim p_{z}}\left[\log\left(1-D(G(z))\right)\right]$$
In the adversarial training process, the generator hopes \(V(\theta^{(D)},\theta^{(G)})\) to be as small as possible, while the discriminator hopes the opposite, thus establishing a game process.
In recent years, GANs have been widely used in the fields of data generation, image processing, and style transfer. The typical structure of a GAN is shown in Fig. 1.
The computational structure of GAN
This section analyses the mathematical principles of generator training.
First, assume that the generator G is fixed and set \(G(z) = x\). The value function can then be written in integral form:

$$V = \mathbb{E}_{x\sim p_{data}}\left[\log D(x)\right] + \mathbb{E}_{x\sim p_{g}}\left[\log\left(1-D(x)\right)\right] = \int_{x}\left[p_{data}(x)\log D(x) + p_{g}(x)\log\left(1-D(x)\right)\right]\mathrm{d}x$$
The problem is now transformed into finding a D that maximizes V. Using the fact that the derivative at an extreme point is zero, the optimal solution of \(D(x)\) is

$$D^{*}(x) = \frac{p_{data}(x)}{p_{data}(x) + p_{g}(x)}$$
Obviously, the value of \(D^{*} (x)\) is between 0 and 1. When real data are input, the judgement value of the discriminator should be as close to 1 as possible, and when the simulated data are input, the judgement value of the discriminator should be as close to 0 as possible. When the distribution of the simulated data is very close to the real data, the mean judgement value should tend to be \(\frac{1}{2}\).
Use the conclusion about \(D^{*}(x)\) to analyze the calculation method of the ideal generator \(G^{*}\):

$$G^{*} = \arg\min_{G}\max_{D} V(G,D),\qquad \max_{D} V(G,D) = \mathbb{E}_{x\sim p_{data}}\left[\log D^{*}(x)\right] + \mathbb{E}_{x\sim p_{g}}\left[\log\left(1-D^{*}(x)\right)\right]$$
Then, we transform it into the form of the Jensen–Shannon divergence based on the Kullback–Leibler divergence, and the result after sorting is as follows:

$$\max_{D} V(G,D) = -\log 4 + 2\,\mathrm{JSD}\left(P_{data}\,\|\,P_{g}\right)$$
Because \({\text{JSD}}(P_{data}\,\|\,P_{g}) \ge 0\), with equality if and only if \(P_{data} = P_{g}\), that is, when the ideal generator state is reached, the global minimum \(\mathop {\max }\nolimits_{D} V(G,D) = -\log 4\) is obtained.
The pseudocode of the basic GAN is given below, where k is the number of iterations of the discriminator.
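Following Goodfellow et al.'s original minibatch formulation, the basic training procedure can be sketched as:

```
for number of training iterations do
    for k steps do
        sample a minibatch of m noise vectors {z(1), ..., z(m)} from the prior p_z(z)
        sample a minibatch of m real examples {x(1), ..., x(m)} from p_data(x)
        update the discriminator by ascending its stochastic gradient:
            grad_theta(D) (1/m) * sum_i [ log D(x(i)) + log(1 - D(G(z(i)))) ]
    end for
    sample a minibatch of m noise vectors {z(1), ..., z(m)} from p_z(z)
    update the generator by descending its stochastic gradient:
        grad_theta(G) (1/m) * sum_i log(1 - D(G(z(i))))
end for
```

Here m is the minibatch size, and the inner loop performs the k discriminator updates per generator update mentioned above.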
Few-shot data generation based on BMGAN
In health condition estimation research, health condition data are often a set of small-sample time-series data. Due to the different working conditions of each device, even the same kind of device exhibits a diversity of health condition data. If a traditional GAN is directly used to generate health condition data, it is prone to gradient instability and mode collapse. The main reason is the insufficient representativeness of small-sample data, which easily causes overfitting of complex generative networks. Therefore, it is necessary to improve the traditional GAN according to the characteristics of few-shot health condition data generation.
Relevant studies have proven that the characteristics of the device health condition exhibit overall monotonicity at the data distribution level. This paper proposes a few-shot overall monotonicity function, and the calculation method is as follows:
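As a purely illustrative sketch of what an overall monotonicity measure can look like (this is our own plausible form, not necessarily the paper's definition), one can score how strongly a sequence follows a single trend direction:

```python
def overall_monotonicity(seq):
    """Signed score in [-1, 1]: +1 for a strictly increasing sequence,
    -1 for a strictly decreasing one, near 0 for a trendless one."""
    steps = len(seq) - 1
    if steps <= 0:
        return 0.0
    rises = sum(1 for a, b in zip(seq, seq[1:]) if b > a)
    falls = sum(1 for a, b in zip(seq, seq[1:]) if b < a)
    return (rises - falls) / steps
```

A score near \(-1\) for degrading indicators such as battery capacity would reward generated sequences that preserve the overall downwards trend of the real data.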
For few-shot data generation, a key issue that needs to be solved is how to measure the similarity between few-shot real data and large-batch generated data. For a traditional GAN, the basis for calculating the similarity between the generated data and the real data is cross-entropy, and the calculation method is as follows:
From the cross-entropy calculation method, it can be seen that the traditional GAN requires the generated data to have the same dimensions as the real data. In device few-shot health condition estimation research, the generated data need to have a higher dimension and richer distribution characteristics than the real data. Therefore, this paper designs an average cross-entropy to realize the similarity calculation between the few-shot real data x and the large-batch generated data G(z)
where N represents the magnification factor, generally a positive integer, and \(z^{i}\) and \(G(z^{i})\) represent the random vector and generated data of the ith batch. The average cross-entropy resolves the dimensional imbalance between the few-shot real data and the large-batch generated data, so that the large-batch generated data can have a data distribution similar to that of the few-shot real data.
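The averaging idea can be sketched as follows, assuming the discriminator outputs are probabilities and using the standard per-batch cross-entropy (a simplified illustration, not the exact BMGAN loss):

```python
import math

def cross_entropy(d_real, d_fake):
    """Standard discriminator cross-entropy for one batch: outputs on real
    samples should be near 1 and outputs on generated samples near 0."""
    loss_real = -sum(math.log(p) for p in d_real) / len(d_real)
    loss_fake = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return loss_real + loss_fake

def average_cross_entropy(d_real, d_fake_batches):
    """Average the per-batch cross-entropy over the N generated batches, so a
    few-shot real batch can be compared with N large generated batches."""
    return (sum(cross_entropy(d_real, batch) for batch in d_fake_batches)
            / len(d_fake_batches))
```

Each generated batch is scored against the same few-shot real batch, so the generated data may have N times the dimension of the real data without changing the loss scale.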
In this paper, this method is called batch monotonic GAN (BMGAN), and its implementation process is as follows.
In this section, aiming at the distribution characteristic of the device health condition, an overall monotonicity function is designed to calculate the monotonicity distribution of real and generated data. To solve the data dimension imbalance between the few-shot real data and the large-batch generated data, an average cross-entropy is designed to realize the similarity calculation between data of different dimensions. The two methods are combined to realize few-shot data generation based on BMGAN.
Generative transfer-belief rule base
In this section, the basic BRB and its inference method are introduced, then the GTBRB method based on transfer learning is designed, and finally, the implementation process of the GTBRB method with or without auxiliary data is detailed.
The basic BRB model
The BRB method integrates theories such as Dempster–Shafer evidence reasoning, fuzzy sets, and the IF–THEN rule base. With the support of certain expert knowledge, it can effectively deal with incomplete or inaccurate information, making it very suitable for few-shot health condition estimation.
The foundation of the BRB method is the IF–THEN rule base, augmented with rule weights, antecedent attribute weights, and confidence inference methods. The kth rule of the BRB is expressed as

$$R_{k}:\ \text{If } x_{1} \text{ is } A_{1}^{k} \wedge x_{2} \text{ is } A_{2}^{k} \wedge \cdots \wedge x_{T_{k}} \text{ is } A_{T_{k}}^{k},\ \text{then } \left\{\left(D_{1},\beta_{1,k}\right),\left(D_{2},\beta_{2,k}\right),\cdots,\left(D_{N},\beta_{N,k}\right)\right\}$$

with rule weight \(\theta_{k}\) and antecedent attribute weights \(\delta_{1},\delta_{2},\cdots,\delta_{T_{k}}\).
Among them, \(x_{i}\) is the ith input of the BRB system, \(i = 1,2, \cdots ,T_{k}\); \(A_{i}^{k}\) is the reference value of the ith antecedent attribute in the kth rule, \(k = 1,2, \cdots ,L\); \(\beta_{j,k}\) is the confidence of the jth evaluation level in the kth rule, \(j = 1,2, \cdots ,N\); \(\theta_{k}\) is the rule weight corresponding to rule k; and \(\delta_{i}\) is the weight of the ith antecedent attribute.
Due to the variety of data types, the BRB system normalizes the data before processing and combines the membership function to calculate the conversion value of each input. For an input \(x_{i}\) falling between two adjacent reference values, \(A_{i,j} \le x_{i} \le A_{i,j+1}\), the matching degrees are calculated as

$$\alpha_{i,j} = \frac{A_{i,j+1} - x_{i}}{A_{i,j+1} - A_{i,j}},\qquad \alpha_{i,j+1} = \frac{x_{i} - A_{i,j}}{A_{i,j+1} - A_{i,j}},\qquad \alpha_{i,l} = 0 \ \ \left(l \ne j, j+1\right)$$
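This rule-based input transformation can be sketched for a single attribute as follows (a minimal illustration assuming linear membership between adjacent reference values):

```python
def transform_input(x, refs):
    """Convert a crisp input x into matching degrees over the ordered
    reference values `refs` of one antecedent attribute."""
    alphas = [0.0] * len(refs)
    if x <= refs[0]:
        alphas[0] = 1.0          # clamp below the smallest reference value
    elif x >= refs[-1]:
        alphas[-1] = 1.0         # clamp above the largest reference value
    else:
        for j in range(len(refs) - 1):
            if refs[j] <= x <= refs[j + 1]:
                span = refs[j + 1] - refs[j]
                alphas[j] = (refs[j + 1] - x) / span
                alphas[j + 1] = 1.0 - alphas[j]
                break
    return alphas
```

The matching degrees of each input always sum to 1, which keeps the subsequent rule activation well normalized.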
Inference method of the BRB
As a typical expert knowledge system, the reliable operation of BRBs relies on accurate knowledge expression and reasonable rule inference. The inference process of the BRB consists of the following two parts:
(1) Calculation of the activation weight of the rule.
The activation weight \(w_{k}\) changes dynamically according to the input data. The calculation method of the activation weight \(w_{k}\) corresponding to the kth rule is

$$w_{k} = \frac{\theta_{k}\prod\nolimits_{i=1}^{T_{k}}\left(\alpha_{i}^{k}\right)^{\bar{\delta}_{i}}}{\sum\nolimits_{l=1}^{L}\theta_{l}\prod\nolimits_{i=1}^{T_{l}}\left(\alpha_{i}^{l}\right)^{\bar{\delta}_{i}}},\qquad \bar{\delta}_{i} = \frac{\delta_{i}}{\max\nolimits_{i}\left\{\delta_{i}\right\}}$$
where \(w_{k} \in [0,1],\quad k = 1,2, \cdots ,L\).
(2) Data fusion and rule inference methods.
First, calculate the confidence \(\hat{\beta }_{j}\) of the evaluation result \(D_{j}\) according to the activation weight \(w_{k}\) of rule k and the confidence \(\beta_{j,k}\) of the jth evaluation level, using the analytical evidential reasoning algorithm:

$$\hat{\beta}_{j} = \frac{\mu\left[\prod\nolimits_{k=1}^{L}\left(w_{k}\beta_{j,k} + 1 - w_{k}\sum\nolimits_{i=1}^{N}\beta_{i,k}\right) - \prod\nolimits_{k=1}^{L}\left(1 - w_{k}\sum\nolimits_{i=1}^{N}\beta_{i,k}\right)\right]}{1 - \mu\left[\prod\nolimits_{k=1}^{L}\left(1 - w_{k}\right)\right]}$$

$$\mu = \left[\sum\limits_{j=1}^{N}\prod\limits_{k=1}^{L}\left(w_{k}\beta_{j,k} + 1 - w_{k}\sum\limits_{i=1}^{N}\beta_{i,k}\right) - (N-1)\prod\limits_{k=1}^{L}\left(1 - w_{k}\sum\limits_{i=1}^{N}\beta_{i,k}\right)\right]^{-1}$$
After obtaining the confidence level \(\hat{\beta }_{j}\) of each \(D_{j}\), the output result of the BRB is formed:

$$S(x) = \left\{\left(D_{j}, \hat{\beta}_{j}\right),\ j = 1,2,\cdots,N\right\}$$
However, for health condition estimation, a comprehensive estimation result is also required, so a final result must be calculated from the \(D_{j}\). The calculation method is as follows:
Then, the expected utility of the output of the entire BRB is

$$u\left(S(x)\right) = \sum\limits_{j=1}^{N} u\left(D_{j}\right)\hat{\beta}_{j}$$
Therefore, the final result \(\hat{y}\) of the health condition estimation is

$$\hat{y} = u\left(S(x)\right)$$
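Putting the inference steps together, rule activation, analytical evidential-reasoning fusion, and utility-based output can be sketched as follows (an illustrative implementation of the standard formulas, not the authors' code):

```python
from math import prod

def activation_weights(alphas, theta, delta):
    """Rule activation: alphas[k][i] is the matching degree of attribute i in
    rule k, theta[k] the rule weight, delta[i] the attribute weight."""
    dbar = [d / max(delta) for d in delta]   # normalized attribute weights
    raw = [theta[k] * prod(a ** w for a, w in zip(alphas[k], dbar))
           for k in range(len(theta))]
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else [0.0] * len(raw)

def er_fuse(w, beta):
    """Analytical evidential-reasoning fusion of the activated rules.
    beta[k][j] is the belief of evaluation grade j in rule k."""
    L, N = len(beta), len(beta[0])
    bsum = [sum(beta[k]) for k in range(L)]
    term = [[w[k] * beta[k][j] + 1 - w[k] * bsum[k] for j in range(N)]
            for k in range(L)]
    prod_j = [prod(term[k][j] for k in range(L)) for j in range(N)]
    prod_d = prod(1 - w[k] * bsum[k] for k in range(L))
    mu = 1.0 / (sum(prod_j) - (N - 1) * prod_d)
    denom = 1 - mu * prod(1 - w[k] for k in range(L))
    return [mu * (prod_j[j] - prod_d) / denom for j in range(N)]

def expected_utility(beta_hat, utilities):
    """Final crisp estimate: utility-weighted sum of the fused beliefs."""
    return sum(b * u for b, u in zip(beta_hat, utilities))
```

For two symmetric rules with complete belief distributions, the fused beliefs split evenly between the two grades, which is a convenient sanity check on the fusion step.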
Transfer learning of GTBRB
The basic idea of transfer learning is to first train a generalized model on an existing dataset that is the same as or similar to the training data, such as data from the same device in different periods or data from the same kind of device under different working conditions. Second, a transfer relationship is established between the existing dataset and the training dataset, usually by mining the functional relationships in the generalized model that can serve as training data features. Finally, through the determined transfer relationship, the training data are input to fine-tune the generalized model into a dedicated model, thereby realizing transfer learning.
The advantages of transfer learning are as follows. (1) Transfer learning effectively improves the efficiency of few-shot learning. The initial generalized model is obtained through training with a large quantity of existing data, which can effectively reduce the cost of training a few-shot learning model. (2) Transfer learning solves the difficulty of feature extraction from small samples to a certain extent. By training the generalized model, an effective reference feature set can be obtained for the few-shot dataset. (3) Transfer learning effectively improves the generalization ability of few-shot learning. Learning the relatively rich features of existing datasets effectively avoids the overfitting phenomenon that easily occurs in few-shot learning.
Currently, transfer learning is widely used in data-driven methods such as neural networks, but transfer learning for knowledge reasoning methods has rarely been studied. This paper draws on the idea of transfer learning and proposes a generative knowledge-based transfer learning architecture. Specifically, the data generation capability of the GAN is used, taking the generated data as the source domain and the real data as the target domain. Then, combined with the prior knowledge of the BRB, we attempt to solve the problem of few-shot health condition estimation. The generative transfer-belief rule base (GTBRB) method proposed in this paper has two main application scenarios, distinguished by whether auxiliary training data, which differ from the test data, are available.
The first scenario has no auxiliary training data. The training set and the test set belong to the same kind of device, which is called same-domain data in this paper. The following process is used to carry out the few-shot health condition estimation: (1) the few-shot training data are expanded by the GAN to generate a large quantity of simulated training data, (2) the BRB model is trained using the simulated training data combined with expert knowledge to obtain the generalized BRB model, and (3) the few-shot training data are used to fine-tune the generalized BRB model to obtain a dedicated BRB model.
In the second scenario, some auxiliary training data exist, such as relatively sufficient data from the same or similar devices under different working conditions, while the current device has few available data; this is called foreign-domain data in this paper. The following process is used to carry out the few-shot health condition estimation. (1) Use the GAN to expand the few-shot training data of the current device and generate a certain quantity of simulated training data based on the magnitude of the auxiliary training data. (2) Use the simulated training data, auxiliary training data, and expert knowledge to train the BRB model into a generalized BRB model. (3) Use the few-shot training data of the current device to fine-tune the generalized BRB model to obtain a dedicated BRB model. The implementation process is summarized in Fig. 2.
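The two scenarios can be summarized as pipelines. In the sketch below, `generate`, `train_brb`, and `fine_tune` are hypothetical placeholders for BMGAN generation, BRB training, and BRB fine-tuning, respectively:

```python
def gtbrb_without_auxiliary(few_shot_data, expert_rules,
                            generate, train_brb, fine_tune):
    """Scenario 1 (same-domain data): generate -> train a generalized BRB ->
    fine-tune into a dedicated BRB."""
    simulated = generate(few_shot_data)                # step (1): BMGAN expansion
    generalized = train_brb(simulated, expert_rules)   # step (2): generalized BRB
    return fine_tune(generalized, few_shot_data)       # step (3): dedicated BRB

def gtbrb_with_auxiliary(few_shot_data, auxiliary_data, expert_rules,
                         generate, train_brb, fine_tune):
    """Scenario 2 (foreign-domain data): auxiliary data join the training set
    before the generalized BRB is trained."""
    simulated = generate(few_shot_data)
    generalized = train_brb(simulated + auxiliary_data, expert_rules)
    return fine_tune(generalized, few_shot_data)
```

The only structural difference between the two scenarios is the training set of step (2); fine-tuning in step (3) always uses the few-shot real data of the current device.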
Case study
In this section, the experimental background and data sources are introduced, the few-shot data generation ability of BMGAN is verified, and finally, the few-shot health condition estimation ability of the GTBRB with or without auxiliary data is verified.
Background formulation
Traditional health condition estimation methods generally require that the training data and test data meet the premise of independent and identical distribution. However, in engineering practice, due to different working conditions and many other factors, this premise is actually difficult to meet. A typical example is lithium batteries, which are widely used in daily life, such as in mobile phones, computers, electric vehicles, and even aerospace devices; lithium batteries are present in every aspect of our lives. Therefore, studying the health condition estimation of lithium batteries has very important practical significance. However, due to the long ageing cycle of lithium batteries, there are few degradation experimental data with high reliability, which makes this a typical few-shot health condition estimation problem. In the past, when the health condition of lithium batteries with very complicated use environments was studied through experimental data from a small number of cycles or a single working condition, overfitting easily occurred, resulting in insufficient generalization ability of the estimation model. Therefore, using the knowledge transfer learning method based on generated data to estimate the few-shot health condition of lithium batteries has both theoretical and practical value.
The nature of the few-shot problem is that the amount of information contained in the small sample is insufficient, which can be manifested as sparse or unbalanced sampling, resulting in an insufficient quantity of data or an uneven distribution. To prevent sampling randomness from affecting the experimental results, this paper adopts the average sampling method to generate the few-shot dataset.
The accumulated relative error (ARE) indicates the cumulative relative error between the generated data and the true values in this experiment. The mean relative error (MRE) represents the mean absolute value of the ratio of the deviation between the predicted value and the true value to the true value; by normalizing the error, the MRE more intuitively reflects the deviation between the predicted and true values. They are defined as follows:

$$\mathrm{ARE} = \sum\limits_{i=1}^{n}\left|\frac{\hat{y}_{i} - y_{i}}{y_{i}}\right|,\qquad \mathrm{MRE} = \frac{1}{n}\sum\limits_{i=1}^{n}\left|\frac{\hat{y}_{i} - y_{i}}{y_{i}}\right|$$
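Following these prose definitions, ARE and MRE can be computed directly (a straightforward sketch of the stated definitions):

```python
def are(pred, true):
    """Accumulated relative error: cumulative sum of absolute relative errors."""
    return sum(abs((p - t) / t) for p, t in zip(pred, true))

def mre(pred, true):
    """Mean relative error: ARE averaged over the number of points."""
    return are(pred, true) / len(true)
```

ARE grows with the number of compared points, which suits cumulative comparison of generated sequences, while MRE is a per-point average suited to comparing estimators.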
Data enhancement based on BMGAN
This paper uses NASA’s lithium battery ageing dataset. In the same domain experiment, only battery #0007 (B7) is used. In the foreign domain experiment, battery #0005 (B5) and battery #0006 (B6) are selected as the auxiliary training data for B7.
Related research has proven that the discharge voltage difference time interval (TIEDVD) [41] (that is, the time it takes for the battery to drop a fixed voltage during discharge) and mean temperature (MT) [20] during TIEDVD have a certain relationship with the capacity of lithium batteries. The definitions of TIEDVD and MT are shown in Fig. 3.
The mathematical definitions of TIEDVD and MT are as follows:
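A minimal sketch of extracting TIEDVD and MT from a sampled discharge curve might read as follows (the threshold voltages and the nearest-sample crossing rule are illustrative assumptions; interpolated crossings could equally be used):

```python
def tiedvd_and_mt(t, v, temp, v_high, v_low):
    """TIEDVD: time for the discharge voltage to fall from v_high to v_low.
    MT: mean temperature over that same interval.
    Crossings are taken at the first sample at or below each threshold."""
    i1 = next(i for i, x in enumerate(v) if x <= v_high)
    i2 = next(i for i, x in enumerate(v) if x <= v_low)
    tiedvd = t[i2] - t[i1]
    mt = sum(temp[i1:i2 + 1]) / (i2 - i1 + 1)
    return tiedvd, mt
```

For a synthetic linear discharge from 4.0 V to 3.2 V sampled once per unit time, the function returns the time between the 3.8 V and 3.4 V crossings together with the mean temperature over that interval.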
In addition, B7’s TIEDVD, MT, and capacity changes over time are shown in Fig. 4.
In Fig. 4, it can be intuitively recognized that TIEDVD and MT are related to the health condition, i.e., the capacity, of lithium batteries. Therefore, TIEDVD and MT are used as the BMGAN simulation objects in the GTBRB method. Considering the few-shot conditions and the existing research results presented in the previous section, BMGAN's generator and discriminator both use deep neural networks. The first 160 sets of data from the NASA batteries are taken, with a sampling interval of 5, giving 32 samples. The number of hidden layers of the generator and discriminator of BMGAN is set to 2. The generator has 32 input layer nodes, 160 output layer nodes, and two hidden layers with 600 and 1,000 nodes. The discriminator has 160 input layer nodes, 1 output layer node, and two hidden layers with 800 and 300 nodes; all layers are fully connected. The output of the discriminator is 0 or 1, where 0 indicates that the input is judged to be data generated by the generator and 1 indicates that the input is judged to be real data. The network is trained for 1,200 rounds with a learning rate of 0.002.
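The stated layer sizes can be checked with a minimal forward pass. The ReLU/sigmoid activations and the weight initialization below are our assumptions; the paper does not specify them:

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes as specified in the text: generator 32 -> 600 -> 1000 -> 160,
# discriminator 160 -> 800 -> 300 -> 1, all fully connected.
G_SIZES = [32, 600, 1000, 160]
D_SIZES = [160, 800, 300, 1]

def init_layers(sizes):
    return [(rng.standard_normal((m, n)) * 0.01, np.zeros(n))
            for m, n in zip(sizes, sizes[1:])]

def forward(layers, x, out_activation):
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)  # ReLU on hidden layers (assumed)
    return out_activation(x)

G, D = init_layers(G_SIZES), init_layers(D_SIZES)
z = rng.standard_normal((1, 32))                          # one 32-d noise vector
fake = forward(G, z, lambda x: x)                         # 160-point generated sequence
score = forward(D, fake, lambda x: 1 / (1 + np.exp(-x)))  # discriminator score in (0, 1)
```

This confirms the dimension expansion at the heart of BMGAN: a 32-dimensional few-shot input drives the generation of a 160-point sequence, which the discriminator then scores.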
B7's TIEDVD and MT are average-sampled with an initial sampling number of 32; the result is shown in Fig. 5.
When the training rounds are 200, 600, and 1,000, the distribution of simulated data generated by BMGAN during a training process is shown in Fig. 6.
It can be seen from the above experimental results that as the number of training rounds increases, BMGAN gradually learns the distribution of the small samples while retaining a certain difference from the original samples, thus ensuring the diversity of the simulated training data. Since the generated results of BMGAN have a certain degree of randomness, taking the average of multiple experimental results would make the generated results too smooth, which is not conducive to data diversity. Thus, we only show the results of one training process. The BMGAN generation results across different training processes are very close, with only slight fluctuations.
Next, we verify the few-shot data generation ability of BMGAN and select GAN, linear regression (LR), and uniform interpolation (UI) for comparison experiments. Among them, the objective function of the GAN method uses average cross-entropy, 160 sample points are uniformly sampled after linear regression, and the interpolation method evenly inserts four generated points between every two real data points. The accumulative relative errors between the generated data and the real TIEDVD data are shown in Table 1.
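A possible reading of the error metric and the UI baseline described here is sketched below with a hypothetical three-point series; the function names and the exact form of the accumulative relative error are our interpretation, not code from the paper.

```python
import numpy as np

def cumulative_relative_error(generated, real):
    """Accumulated relative error between a generated and a real series."""
    generated, real = np.asarray(generated, float), np.asarray(real, float)
    return float(np.sum(np.abs(generated - real) / np.abs(real)))

def uniform_interpolation(samples, points_between=4):
    """UI baseline: insert `points_between` evenly spaced values between
    every two neighbouring real samples."""
    samples = np.asarray(samples, float)
    n = len(samples)
    xi = np.linspace(0, n - 1, (n - 1) * (points_between + 1) + 1)
    return np.interp(xi, np.arange(n), samples)

real = np.array([1.0, 0.9, 0.8])      # three hypothetical capacity-like points
ui = uniform_interpolation(real)      # 11 points: 3 real + 2 x 4 inserted
print(len(ui), cumulative_relative_error(ui[::5], real))  # 11 0.0
```

The UI output passes exactly through the real points, so its error at the sampled positions is zero; its weakness, as discussed below, is that it is purely linear between them.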
When the number of samples is 10, BMGAN shows a very large performance improvement over the GAN and LR methods and is close to the UI method. In this case, the few-shot real data generally show a downward trend, and λ plays a leading role, so the data generated by BMGAN also show a generally downward trend. When the number of samples is 20 or 30, BMGAN maintains the best generation accuracy. When the number of samples is 50, which is equivalent to collecting one sample from roughly every three real data points, health condition data such as TIEDVD exhibit certain fluctuations, and the sample truly reflects them. In this case, the randomness of BMGAN plays a leading role, so the generated data have an overall distribution similar to that of the few-shot real data while retaining some random fluctuations, and the generation accuracy is therefore higher than that of linear methods such as LR and UI.
Through the above experimental results, it can be confirmed that BMGAN has good data generation ability under few-shot conditions.
GTBRB of the same domain
The previous section analyzed the role of the GAN in the GTBRB model, and this section introduces the role of the BRB in the GTBRB model.
When using the BRB method, it is important to combine expert knowledge to set appropriate initial parameters. Based on related research and the particularity of few-shot data, this paper adopts the following initial parameters.
TIEDVD is set as attribute 1 and divided into short, medium, and long parts, and MT is set as attribute 2 and divided into low and high parts. The capacity is set as the output result and divided into four parts: good, medium, bad, and invalid.
To effectively compare the estimation capability of the GTBRB method, we select Gaussian process regression (GPR), which has better learning ability on small samples, a backpropagation neural network (BPNN) as a typical data-driven method, the initial BRB method, and the generalized BRB method proposed in this paper for comparison. TIEDVD and MT with a sample number of 30 are used to estimate the remaining battery capacity.
Among them, the BPNN has a four-layer network structure, in which the first layer is an input layer containing two neuron nodes representing TIEDVD and MT, the middle two layers are hidden layers with 16 nodes, and the last layer is an output layer. The training target error of the network is 0.002, the learning rate is 0.3, and the number of training rounds is set to 5,000. The parameter settings of the initial BRB are consistent with Table 2. The optimization of each BRB model uses the fmincon function in MATLAB, the maximum number of iterations is set to 1,000, and the termination error is \(10^{-6}\).
Among the parameters in Table 2, all \(\theta_{k}\) and \(\delta_{k}\) are 1, indicating that the weights of each rule and of the premise attributes (TIEDVD and MT) are the same under the initial conditions. Additionally, the value of \(\beta_{j,k}\) denotes the credibility of the corresponding rating under the current rule. After training with the same-domain simulated data, the parameter values of the generalized BRB model are shown in Table 3. The changed parameter values of the dedicated BRB model, obtained by training the generalized model with real data, are also shown in the last row of Table 3.
Table 3 shows that the parameters of the BRB model can be adjusted accordingly by inputting the corresponding data, thereby achieving a balance between expert knowledge and data-driven methods. Taking the change in the \(\delta_{k}\) values as an example, since TIEDVD decreases almost monotonically while MT fluctuates significantly, MT plays a more critical role in multirule fusion. After training, \(\delta_{1}\) is significantly smaller than \(\delta_{2}\), which realizes the optimization and adjustment of the parameters.
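The roles of \(\theta_{k}\), \(\delta_{k}\), and \(\beta_{j,k}\) can be illustrated with a small sketch of the rule-activation step. All numeric values below (referential points, belief degrees) are illustrative, not the trained values of Table 3, and the final normalized weighted sum is a simplified stand-in for the full evidential reasoning combination of RIMER.

```python
import numpy as np

# Rule-base layout from the paper: TIEDVD has 3 referential values
# (short/medium/long), MT has 2 (low/high), giving 3 x 2 = 6 rules with
# belief degrees over 4 output grades (good/medium/bad/invalid).
theta = np.ones(6)                 # rule weights (initially all 1)
delta = np.array([1.0, 1.0])       # premise-attribute weights (initially all 1)
beta = np.full((6, 4), 0.25)       # grade beliefs per rule (illustrative)

def matching_degrees(x, refs):
    """Triangular membership of x over ordered referential values."""
    refs = np.asarray(refs, float)
    m = np.zeros(len(refs))
    if x <= refs[0]:
        m[0] = 1.0
    elif x >= refs[-1]:
        m[-1] = 1.0
    else:
        j = int(np.searchsorted(refs, x)) - 1
        t = (x - refs[j]) / (refs[j + 1] - refs[j])
        m[j], m[j + 1] = 1.0 - t, t
    return m

def activation_weights(alpha1, alpha2):
    """w_k proportional to theta_k * prod_i alpha_{i,k}^{delta_i}."""
    w = np.array([theta[k] * alpha1[k // 2] ** delta[0] * alpha2[k % 2] ** delta[1]
                  for k in range(6)])
    return w / w.sum()

alpha1 = matching_degrees(0.4, [0.2, 0.5, 0.8])   # hypothetical TIEDVD refs
alpha2 = matching_degrees(30.0, [25.0, 35.0])     # hypothetical MT refs
w = activation_weights(alpha1, alpha2)
belief = w @ beta                 # aggregated grade beliefs (simplified)
print(round(w.sum(), 6), round(belief.sum(), 6))  # 1.0 1.0
```

Shrinking \(\delta_{1}\), as the trained model does, flattens the contribution of TIEDVD's matching degrees to \(w_k\), which is exactly the parameter adjustment discussed above.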
When the data size of sampling is 30, the experimental results of each method are shown in Fig. 7.
First, we analyze the differences among the various types of methods. When the data size is 30, because the data are uniformly sampled and the distribution of the sample is close to the overall distribution of the battery, the traditional machine learning models can learn the ageing trend of the battery. However, due to the sparse data size, these models suffer from underfitting, and their estimation accuracy on the test data is poor. Among them, the BPNN method is better than the GPR method. In the case of small samples, the GTBRB based on transfer learning proposed in this paper shows better estimation accuracy: the information of the small sample is first enriched through expert knowledge, the generation model is then used to increase the diversity of the samples and to train a generalized BRB model, and finally the real data are used to fine-tune the generalized BRB model to obtain a dedicated BRB model.
Second, we analyze the differences of the BRB method in different application scenarios. It can be seen in the lower part of Fig. 7 that the estimation accuracy of the initial BRB is poor. Although expert knowledge is used, it may contain a certain deviation, and a few-shot dataset contains insufficient information, so the performance of the initial BRB is not good. The generalized BRB model effectively improves the estimation accuracy of the initial BRB method due to the increase in the training data size, but because of the diversity of the generated data, its estimation results are relatively unstable. The GTBRB method combines data enhancement methods with the idea of transfer learning and uses real data to fine-tune the generalized BRB model, achieving a more accurate health condition estimation.
The experimental results show that when the uniform sampling data size is 30, GTBRB improves the performance of the BRB method, outperforms typical methods such as GPR and BPNN, and achieves a more accurate health condition estimation. To further analyze the improvement of the GTBRB over the BRB estimation method, the BRB method is set up with the same initial parameters as the GTBRB, and its parameters are directly optimized using the real data.
The above experiment was repeated ten times for each sampling data size, and the average results were obtained, as shown in Table 4.
From the above experimental results, it can be seen that when the data size is small, for example, when only 10 real data points are used, the MRE of each method exceeds 10%, and accurate health condition estimation cannot be achieved. However, the MRE of the GTBRB method is 12.5%, which is lower than that of the other methods, showing a certain advantage in few-shot health condition estimation. As the quantity of data increases, the estimation performance of each method improves; in particular, the performance of the BPNN method improves the most, which reflects the sensitivity of neural network methods to the size of the training data. When the quantity of data is 10, 20, or 30, the BRB-based methods have certain advantages over the other health condition estimation methods, which reflects the effectiveness of expert knowledge under small-sample conditions. However, as the data size increases to 50, the estimation accuracy of the initial BRB method falls below that of the GPR and BPNN methods, while the GTBRB method still outperforms GPR and BPNN through the improvements of data enhancement and transfer learning. The above results reflect the effectiveness of the GTBRB method proposed in this paper and its improvement over the initial BRB method. Under the four experimental conditions, the estimation accuracy of the GTBRB is, on average, 17.3% higher in relative terms than that of the BRB method. Therefore, the GTBRB method proposed in this paper effectively improves the few-shot health condition estimation ability of the BRB method.
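The MRE metric and the averaged relative-improvement figure can be reproduced schematically as follows; the MRE values below are placeholders rather than the actual Table 4 results, so the printed figure is illustrative only.

```python
import numpy as np

def mre(estimates, truth):
    """Mean relative error of capacity estimates."""
    estimates, truth = np.asarray(estimates, float), np.asarray(truth, float)
    return float(np.mean(np.abs(estimates - truth) / np.abs(truth)))

# Average relative improvement over the four data sizes (10/20/30/50);
# these MRE values are hypothetical, not the paper's.
mre_brb = np.array([0.140, 0.080, 0.060, 0.050])
mre_gtbrb = np.array([0.125, 0.065, 0.045, 0.042])
improvement = float(np.mean((mre_brb - mre_gtbrb) / mre_brb))
print(f"{improvement:.1%}")  # 17.6%
```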
GTBRB of the foreign domain
To further address the low-accuracy problem of few-shot health condition estimation, this paper considers introducing auxiliary training data. Combining the experiments carried out in the above sections, we select B5 and B6, which are of the same type as B7 but operate under different working conditions. Among them, the distributions of B5 and B7 are similar, while the differences between B6 and B7 are relatively large. The ageing distributions of the three batteries are shown in Fig. 8.
To study the influence of different auxiliary training data on the few-shot learning ability, two sets of auxiliary training data (B5 and B6) combined with few-shot test data (B7) of different data sizes are used to carry out the following experiments. To simplify the research process, this section only studies the case in which the auxiliary training data and the simulated training data have the same size (subject to the size of the auxiliary training data). The MRE promotion value \(d_{\mathrm{MRE}}\) is used as the evaluation index; it is calculated as the MRE value from the experiment of Sect. “Transfer learning of GTBRB” minus the MRE value on the same test data when the auxiliary training data are introduced into the training process of this section. The average result is obtained, as shown in Fig. 9.
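A minimal sketch of the MRE promotion value d_MRE, with hypothetical MRE inputs:

```python
# d_MRE: the MRE without auxiliary data minus the MRE with it, so a
# positive value means the auxiliary training data helped. Both inputs
# below are hypothetical.
def d_mre(mre_without_aux, mre_with_aux):
    return mre_without_aux - mre_with_aux

print(f"{d_mre(0.125, 0.100):+.3f}")  # +0.025 -> positive transfer
```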
The above experimental results show that the results of the B5 group are better than those of the B6 group because the distribution of B5 is closer to that of the test data B7, while the difference between B6 and B7 is relatively large. When the test data size is 10, that is, when the sample size is extremely small, both B5 and B6 greatly improve the estimation accuracy of the original methods. When the quantity of data is 20, the performance of the BPNN method is significantly improved, which is affected by the randomness of the sampled data. When the quantity of data is 30, the positive impact of the auxiliary training dataset becomes insufficient, and B6 even has a negative impact on the GPR method. When the quantity of data is 50, only B5 has a positive impact on the GPR method, and the remaining groups all have a negative impact. This shows that when the training data size is relatively sufficient, the introduction of foreign-domain auxiliary training data is likely to reduce the accuracy of the original few-shot health condition estimation methods.
Finally, the time complexity of the mentioned methods is analyzed. The essence of BPNN feedforward calculation and error backpropagation is matrix multiplication; under the premise that the input and output layers are fixed, the cost depends only on the number of neurons in the hidden layers. Taking a three-layer neural network as an example, its time complexity is \(O(N)\). The time complexity of the GPR method is \(O(N^{2})\) because a triangular linear system of equations must be solved during inference. As a rule inference method, the BRB has a time complexity of \(O(1)\), which means that the GTBRB method proposed in this paper has extremely high computational efficiency.
Conclusions
As few-shot cases are a widespread phenomenon in daily conditions, this paper focuses on few-shot health condition estimation. The GTBRB method proposed in this paper is a novel generative knowledge-based transfer learning architecture that effectively combines data augmentation, knowledge reasoning, and transfer learning. Through experiments, it is found that the GTBRB is feasible for few-shot health condition estimation and improves the estimation accuracy of the BRB method by approximately 17.3%. In addition, GTBRB, as a knowledge reasoning method, has a time complexity of only \(O(1)\), and its computational efficiency is significantly better than that of other data-driven methods. However, there are still some shortcomings in the proposed method. On the one hand, in the GTBRB method, data enhancement and condition estimation are performed sequentially. This one-way information transfer may cause the accumulation of errors: once the generated data deviate greatly from the real data, the subsequent few-shot health condition estimation accuracy will be affected. On the other hand, as a knowledge reasoning method, although the BRB can use real data to adjust the parameters preset by experts, it still cannot effectively balance the influence of expert knowledge and data-driven methods on the results of few-shot health condition estimation. Therefore, further research can focus on the dynamic interaction of data enhancement and condition estimation, feeding the generated data step by step, and moderately guiding data generation with the estimation results. In addition, fusing knowledge reasoning methods with few-shot data-driven methods such as meta-learning is also an interesting idea.
References
Elattar HM, Elminir HK, Riad AM (2016) Prognostics: a literature review. Complex Intell Syst 2(2):125–154
Lim JY, Lim KM, Ooi SY, Lee CP (2021) Efficient-PrototypicalNet with self knowledge distillation for few-shot learning. Neurocomputing 459:327–337
Jiang C, Chen H, Xu Q, Wang X (2022) Few-shot fault diagnosis of rotating machinery with two-branch prototypical networks. J Intell Manuf, 1–15
Lu J, Liu A, Song Y, Zhang G (2020) Data-driven decision support under concept drift in streamed big data. Complex Intell Syst 6(1):157–163
Yang G, Yao J, Ullah N (2021) Neuro-adaptive control of saturated nonlinear systems with disturbance compensation. ISA Trans
Kim HE, Tan AC, Mathew J, Kim EY, Choi BK (2012) Machine prognostics based on health state estimation using SVM. In: Asset condition, information systems and decision models. Springer, London, pp 169–186
Sheng H, Liu X, Bai L, Dong H, Cheng Y (2021) Small sample state of health estimation based on weighted Gaussian process regression. J Energy Storage 41:102816
Kumar A, Chinnam RB, Tseng F (2019) An HMM and polynomial regression based approach for remaining useful life and health state estimation of cutting tools. Comput Ind Eng 128:1008–1014
Sung F, Yang Y, Zhang L, Xiang T, Torr PH, Hospedales TM (2018) Learning to compare: relation network for few-shot learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 1199–1208
Li C, Li S, Zhang A, He Q, Liao Z, Hu J (2021) Meta-learning for few-shot bearing fault diagnosis under complex working conditions. Neurocomputing 439:197–211
Liu B, Yu X, Yu A, Zhang P, Wan G, Wang R (2018) Deep few-shot learning for hyperspectral image classification. IEEE Trans Geosci Remote Sens 57(4):2290–2304
Shi Y, Li J, Li Y, Du Q (2020) Sensor-independent hyperspectral target detection with semisupervised domain adaptive few-shot learning. IEEE Trans Geosci Remote Sens
Rahman S, Khan S, Porikli F (2018) A unified approach for conventional zero-shot, generalized zero-shot, and few-shot learning. IEEE Trans Image Process 27(11):5652–5667
Wertheimer D, Tang L, Hariharan B (2021) Few-shot classification with feature map reconstruction networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 8012–8021
Xue Z, Xie Z, Xing Z, Duan L (2020) Relative position and map networks in few-shot learning for image classification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp 932–933
Yu X, Yang T, Lu J, Shen Y, Lu W, Zhu W, ... Zhou J (2021) Deep transfer learning: a novel glucose prediction framework for new subjects with type 2 diabetes. Complex Intell Syst, 1–13
Wang D, Zhang M, Xu Y, Lu W, Yang J, Zhang T (2021) Metric-based meta-learning model for few-shot fault diagnosis under multiple limited data conditions. Mech Syst Signal Process 155:107510
Ding P, Jia M, Zhao X (2021) Meta deep learning based rotating machinery health prognostics toward few-shot prognostics. Appl Soft Comput 104:107211
Xu Y, Li Y, Wang Y, Zhong D, Zhang G (2021) Improved few-shot learning method for transformer fault diagnosis based on approximation space and belief functions. Expert Syst Appl 167:114105
Tang X, Xiao M, Liang Y, Zhu H, Li J (2019) Online updating belief-rule-base using Bayesian estimation. Knowl-Based Syst 171:93–105
Ye HJ, Sheng XR, Zhan DC (2020) Few-shot learning with adaptively initialized task optimizer: a practical meta-learning approach. Mach Learn 109(3):643–664
Xie Z, Cao W, Ming Z (2021) A further study on biologically inspired feature enhancement in zero-shot learning. Int J Mach Learn Cybern 12(1):257–269
Chen M, Fang Y, Wang X, Luo H, Geng Y, Zhang X, Wang B (2020) Diversity transfer network for few-shot learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol 34, no 07, pp 10559–10566
Mishra A, Verma VK, Reddy MSK, Arulkumar S, Rai P, Mittal A (2018) A generative approach to zero-shot and few-shot action recognition. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp 372–380
Sarkar A (2021) Generative adversarial network guided mutual learning based synchronization of cluster of neural networks. Complex Intell Syst, 1–15
Yang G, Wang H, Chen J (2021) Disturbance compensation based asymptotic tracking control for nonlinear systems with mismatched modeling uncertainties. Int J Robust Nonlinear Control 31(8):2993–3010
Kumar M, Kumar V, Glaude H, de Lichy C, Alok A, Gupta R (2021) ProtoDA: efficient transfer learning for few-shot intent classification. In: 2021 IEEE Spoken Language Technology Workshop (SLT), pp 966–972
Ren Z, Zhu Y, Yan K, Chen K, Kang W, Yue Y, Gao D (2020) A novel model with the ability of few-shot learning and quick updating for intelligent fault diagnosis. Mech Syst Signal Process 138:106608
Qin F, Zheng Z, Qiao Y, Trivedi KS (2018) Studying aging-related bug prediction using cross-project models. IEEE Trans Reliab 68(3):1134–1153
Rostami M, Kolouri S, Eaton E, Kim K (2019) SAR image classification using few-shot cross-domain transfer learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops
Liu W, Chang X, Yan Y, Yang Y, Hauptmann AG (2018) Few-shot text and image classification via analogical transfer learning. ACM Trans Intell Syst Technol 9(6):1–20
Chen T, Lin L, Hui X, Chen R, Wu H (2020) Knowledge-guided multi-label few-shot learning for general image recognition. IEEE Trans Pattern Anal Mach Intell
Gu Z, Li W, Huo J, Wang L, Gao Y (2021) LoFGAN: fusing local representations for few-shot image generation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp 8463–8471
Hong Y, Niu L, Zhang J, Zhang L (2020) MatchingGAN: matching-based few-shot image generation. In: 2020 IEEE International Conference on Multimedia and Expo (ICME), pp 1–6
Susto GA, Schirru A, Pampuri S, Beghi A, De Nicolao G (2018) A hidden-Gamma model-based filtering and prediction approach for monotonic health factors in manufacturing. Control Eng Pract 74:84–94
Hong Y, Niu L, Zhang J, Zhao W, Fu C, Zhang L (2020) F2GAN: fusing-and-filling GAN for few-shot image generation. In: Proceedings of the 28th ACM International Conference on Multimedia, pp 2535–2543
Yang JB, Liu J, Wang J, Sii HS, Wang HW (2006) Belief rule-base inference methodology using the evidential reasoning approach-RIMER. IEEE Trans Syst Man Cybern Part A 36(2):266–285
Zhang B, Zhang Y, Hu G, Zhou Z, Wu L, Lv S (2020) A method of automatically generating initial parameters for large-scale belief rule base. Knowl-Based Syst 199:105904
Zhang A, Gao F, Yang M, Bi W (2020) A new rule reduction and training method for extended belief rule base based on DBSCAN algorithm. Int J Approx Reasoning 119:20–39
Chang L, Dong W, Yang J, Sun X, Xu X, Xu X, Zhang L (2020) Hybrid belief rule base for regional railway safety assessment with data and knowledge under uncertainty. Inf Sci 518:376–395
Liu D, Wang H, Peng Y, Xie W, Liao H (2013) Satellite lithium-ion battery remaining cycle life prediction with novel indirect health indicator extraction. Energies 6(8):3654–3668
Acknowledgements
This research was supported in part by the National Social Science Foundation of China (No. 2019SKJJC025 and 2020SKJJC033), the Space Science and Technology Innovation Foundation (No. SAST2020009), and the Natural Science Foundation of Shaanxi Province (No. 2021JQ368).
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Kang, W., Xiao, J. & Xue, J. Generative knowledge-based transfer learning for few-shot health condition estimation. Complex Intell. Syst. 9, 965–979 (2023). https://doi.org/10.1007/s40747-022-00787-6
Keywords
 Health condition estimation
 Transfer learning
 Generative adversarial networks
 Belief rule base
 Few-shot learning