Abstract
Accurate, automatic vision-based quality identification of Longjing tea is crucial for practitioners. Because the classes are highly similar, traditional image processing combined with machine learning algorithms does not achieve satisfactory classification accuracy. High-performance deep learning methods require large amounts of annotated data, but collecting and labeling such data is time consuming and monotonous. To gain as much useful knowledge as possible from related tasks, an instance-based deep transfer learning method for the quality identification of Longjing tea is proposed. The method consists of two main steps: (i) the MobileNet V2 model is trained on a hybrid training dataset containing all labeled samples from the source and target domains, and the trained model is used as a feature extractor; and (ii) the extracted features are input into the proposed multiclass TrAdaBoost algorithm for training and identification. Longjing tea images from three geographical origins, West Lake, Qiantang, and Yuezhou, are collected, and the tea from each geographical origin contains four grades. The Longjing tea from West Lake, which has more labeled samples, is regarded as the source domain. The Longjing tea from the other two geographical origins, which has only limited labeled samples, is regarded as the target domains. Comparative experimental results show that the best-performing method is the MobileNet V2 feature extractor trained with the hybrid training dataset, combined with multiclass TrAdaBoost using a linear support vector machine (SVM) as the base learner. The overall Longjing tea quality identification accuracy is 93.6% and 91.5% on the two target domain datasets, respectively. The proposed method can achieve accurate quality identification of Longjing tea with limited samples and can inform the design of image-based tea quality identification systems.
Introduction
Longjing tea is a famous green tea loved by consumers all over the world. After thousands of years of development, Longjing tea has become one of the most popular teas. Longjing tea is mainly planted in Zhejiang Province, China, and its production areas can be divided into three main geographical origins: West Lake Zone, Qiantang Zone, and Yuezhou Zone. Different geographical origins produce different subtypes of Longjing tea [1, 2]. Depending on the growth environment, processing technique, and plucking time, Longjing tea from each geographical origin can also be classified into different quality levels [3]. From a marketability perspective, high quality equates to high prices and high profits. Therefore, it is of great significance to identify the quality of different subtypes of Longjing tea.
Traditional manual identification of tea quality is typically labor intensive, time consuming, and subjective. In recent years, image-based computerized systems using image processing and machine learning techniques have been developed to overcome these problems. Some popular classifiers, including K-nearest neighbor (KNN), random forest (RF), artificial neural network (ANN), and SVM, were used for different kinds of tea quality identification and achieved excellent results [4,5,6]. It is known that achieving accurate identification also relies on effective hand-designed features, such as color, texture, and shape [7, 8]. However, different qualities of tea tend to have minor differences in appearance, resulting in low identification accuracy of hand-designed features combined with classical machine learning methods [9].
Deep learning models have achieved great success in many computer vision tasks [10]. The most commonly used tools in deep learning include convolutional neural networks (CNNs), which are complex and efficient. CNNs have a high rate of discrimination and have proven to provide good results in precision agriculture [11, 12]. The CNN model provides an end-to-end solution to extract features and classify them with a high degree of automation. In addition, the self-learned high-level features are powerful enough to deal with many complex and high-similarity problems [13,14,15]. However, training a CNN model requires a large number of labeled images with ground truth. Concerning our situation, collecting and annotating many images from each quality level of Longjing tea and every geographical origin is undoubtedly tedious and expensive.
As a new branch of machine learning, transfer learning can take advantage of the similarities between data, tasks, or models and apply the models and knowledge learned in an existing domain (called the source domain) to a new domain (called the target domain) [16]. Using the similarity between different datasets to achieve transfer is an intuitive idea. Some labeled training samples from other available datasets can be incorporated as complementary training data using a transfer learning strategy. Therefore, the current datasets may no longer require a large amount of data. In particular, Longjing tea subtypes from different geographical origins all belong to Longjing tea and are highly similar. They share common knowledge, so transfer learning has the potential to save much sample collection and processing work. Hence, combining deep learning and transfer learning can not only take advantage of deep neural networks to extract discriminative semantic features but also reduce data hunger through knowledge transfer. At present, some scholars have applied basic transfer learning strategies, such as fine-tuning pre-trained models or feature extraction, to plant disease and pest detection [11, 17, 18], fruit classification [13, 19], and sheep facial expression classification [20]. The results show that transfer learning is an effective strategy for building high-performance classification models.
However, some problems hinder further development. First, current deep transfer learning is limited to learning from the knowledge in the pre-trained model by saving and adjusting parameters. The dataset used by the pre-trained model is a general large-scale visual dataset ImageNet, which has good universality but is not targeted for specific tasks, and the effect of the transfer is limited. Second, due to the data distribution gap, not every sample in the source domain is suitable for transfer learning. In some cases, ‘negative transfer’ may even occur, severely reducing the model performance [21]. Deep convolutional neural networks combined with the softmax classifier cannot filter out suitable transfer learning samples at the instance level.
To solve the problems mentioned above, we propose an instance-based deep transfer learning method for the quality identification of Longjing tea in this paper. First, the MobileNet V2 model is trained using the hybrid training dataset containing all labeled samples from source and target domains. The trained MobileNet V2 model is used as a feature extractor instead of directly using the pre-trained model. Then the multiclass TrAdaBoost algorithm is proposed for instance-based transfer learning, and valuable samples in the source domain are given higher weights to improve transferability. With the help of Longjing tea images from other geographical origins, the proposed method can accurately identify the quality of Longjing tea in the current geographical origin with limited samples.
The contributions of this paper are as follows:
- According to the common demands of image-based tea quality identification, we build three novel Longjing tea quality datasets. Longjing tea images from three different geographical origins, West Lake, Qiantang, and Yuezhou, are collected, and the tea from each geographical origin contains four grades. The Longjing tea from West Lake, which has more labeled samples, is regarded as the source domain, and the Longjing tea from the other two geographical origins, which has only very limited labeled samples, is regarded as the target domains. The tasks of all domains are the same, i.e., to realize the quality identification of tea. The constructed datasets can be used to verify the intra-domain and cross-domain classification performance of the model. As far as we know, there are few cross-domain classification datasets in the agricultural field.
- The feature extraction capabilities of four common lightweight CNN architectures constructed using different training strategies are compared. The results show that the MobileNet V2 model trained with hybrid training datasets containing all labeled samples from source and target domains has the best feature extraction ability. The trained MobileNet V2 model is used as a feature extractor.
- We propose a novel multiclass TrAdaBoost algorithm. It extends the original TrAdaBoost to the multiclass classification problem, maintains low computational complexity, and avoids class imbalance. The multiclass TrAdaBoost is trained with the deep features extracted from the MobileNet V2 model.
- We explore the effect of the proposed instance-based deep transfer learning method on the performance of Longjing tea quality classification based on extensive experiments. The effectiveness of the proposed method is validated in detail.
The remainder of this paper is structured as follows: Sect. “Related works” introduces the related research works of this study. Sect. “Materials and methods” provides a detailed description of the materials used and the methods proposed in this paper. Sect. “Results and discussions” presents the comparison results and discussions. Sect. “Conclusion” summarizes the conclusions.
Related works
In this section, some related research on image-based tea quality identification, transfer learning, and boosting algorithms is introduced. Some limitations of the current study are also summarized.
Image-based tea quality identification
Tea appearance is an important attribute that can directly reflect tea quality [5]. Generally, computer vision systems (CVSs) are designed to measure the appearance of samples, which mimic the human vision process. Compared with other non-destructive testing technologies (such as electronic nose, electronic tongue, near-infrared spectroscopy, etc.), the computer vision system for image collection is easy to establish. The collection speed is fast (a sample only takes a few seconds), and the amount of information in images is rich. Many image-based related studies have been carried out, and the effectiveness of image-based methods has been validated [22]. Gill et al. [23] discriminated between four different grades of made black tea by texture features and multilayer perceptron (MLP) techniques, and 82.33% classification accuracy was achieved. Bakhshipour et al. [24] used two common heuristic feature selection methods, correlation-based feature selection (CFS) and principal component analysis (PCA), to select the most significant features. The results show that the ANN with 7-10-4 topology developed by CFS-selected features provided the best classifier with a classification rate of 96.25%.
In recent years, deep learning has provided powerful tools for research related to the tea industry. Liu et al. [25] compared the quality identification results of Chinese chrysanthemum tea products with multivariate classification models and deep learning methods. The results showed that the classification performance of the self-designed simple deep neural network significantly outperforms other multivariate classification models. Zhang et al. [26] built a 12-layer CNN for the classification of 3 kinds of tea. Data augmentation and stochastic gradient descent with momentum (SGDM) are used in the training phase. The experiments showed that a 12-layer CNN gives a good result. The sensitivities of oolong, green, and black tea are 99.5%, 97.5%, and 98.0%, respectively. Chen et al. [27] developed a CNN model named LeafNet to extract the features of tea plant diseases from images and constructed SVM and MLP classifiers. The results show that LeafNet was superior in the recognition of tea leaf diseases compared to the MLP and SVM algorithms. Kimutai et al. [28] proposed a deep learning model named TeaNet to detect the optimum fermentation of tea. The experimental results showed that TeaNet was superior in the classification tasks compared to the other machine learning techniques, including KNN, SVM, RF, and linear discriminant analysis (LDA). Kimutai et al. explored the use of the internet of things (IoT) and CNN with majority voting techniques in detecting the optimum fermentation of black tea. The deep learner recorded the highest precision and accuracy of 95.89% and 86.46%, respectively, when evaluated on real-time images.
The high-level features automatically extracted by deep CNNs are influential and representative enough to deal with the more challenging situation, such as the high similarity between different tea qualities [10, 29]. However, CNN-based deep learning models have a large number of parameters. For example, the classical residual neural network structure ResNet 50 [30] has more than 25 million parameters. The large number of parameters requires massive data for training to prevent over-fitting. The task of tea quality identification requires researchers to collect images themselves to construct the dataset. Collecting and labeling a large number of images is undoubtedly very expensive and difficult. Hence, the current obstacle to deep learning in tea quality identification is mainly the contradiction between the massive amount of data required for deep learning and manual data collection for tea quality identification. One possible solution is using transfer learning to learn general knowledge and reduce the amount of learning [31]. By combining the transfer learning strategy with the deep learning model, the transferability of deep learning is brought into play.
Transfer learning overview
Pan and Yang [16] give a classical definition of transfer learning: Given a source domain \(D_{S}\) and learning task \(T_{S}\) and a target domain \(D_{T}\) and learning task \(T_{T}\), transfer learning aims to help improve the learning of the target predictive function \(f_{T} (.)\) in \(D_{T}\) using the knowledge in \(D_{S}\) and \(T_{S}\), where \(D_{S} \ne D_{T}\), or \(T_{S} \ne T_{T}\). Pan and Yang also divide transfer learning approaches into four categories according to principles: instance-based transfer, feature-based transfer, parameter/model-based transfer, and relation-based transfer. The instance-based transfer learning approach relies on reweighting some labeled data from the source domain for use in the target domain, which is very intuitive, concise, and highly interpretable in theory. Much research work focuses on estimating the distribution ratio of the source domain and the target domain and using it as the weight of the samples [32,33,34,35,36].
Deep learning dramatically expands the scope of transfer learning and provides more possibilities. Experiments have proven that the hierarchical structures of CNNs have scalability and domain transferability [37]. Different fine-tuning methods, including extracting the output features of a particular layer, using pre-trained model parameters as initialization, and freezing or modifying the trainable parameters of particular layers, have achieved good results in many application scenarios [38, 39]. Zhu et al. [40] used the deep features from 12 CNN models to train the SVM classifier for carrot appearance recognition. The deep features of the three fully connected layers of the network models (AlexNet, VGG16, VGG19) were also extracted and compared. The results showed that the accuracy of deep features with SVM was superior to the fine-tuned models. Arora et al. [41] utilized a pre-trained CNN model to achieve acrylamide identification in potato chips. The learning rate, optimization techniques, and loss function were also compared and discussed. Simulation results demonstrated that MobileNet V2 outperformed the AlexNet, ResNet-34, ResNet-101, VGG-16, and VGG-19 models. Guo et al. [42] proposed the transfer weighted extreme learning machine (TWELM) classifier to solve the class imbalance problem. Experimental results on real-world data sets show that TWELM outperforms existing algorithms on classification accuracy and computation cost. A knowledge matching strategy similar to this method also contributes greatly to solving dynamic multi-objective optimization problems [43]. The deep transfer learning method based on fine-tuning the CNN model inherits the advantages of deep learning, which can obtain high-level and powerful features and has good generalization and robustness. In recent years, the Vision Transformer (ViT), as a new backbone, has made great achievements in various visual tasks. By considering the global information of the image, ViT is more competitive in some visual classification tasks [44, 45]. Scholars have shown that ViT can be used in traffic sign classification [46], plant disease detection [47], and face recognition [48]. ViT-based transfer learning is attracting more and more attention [49,50,51].
For our task, the main characteristic is the high similarity in two aspects: the high similarity between different classes within the domain and the high similarity between the domains. Powerful features obtained by fine-tuning CNN models can distinguish different classes with high similarity. At the same time, the high similarity between domains means a similar data distribution, which is perfect for the instance-based transfer learning approach. Hence, fusing the instance-based transfer learning method with the deep learning model is an intuitive potential solution. However, the deep learning model contains the characteristic information of a large amount of data. How to associate deep learning with instance-based transfer learning methods is a major challenge. Current research rarely involves relevant aspects.
Boosting for classification and transfer learning
Boosting is a general concept of improving the learning algorithm’s performance by combining a group of ‘weak learners’ to generate a ‘strong learner’ [52]. The AdaBoost algorithm is the first and classic boosting method [53]. In each iteration, the relative weights of incorrectly classified samples are increased, and the weights of correctly classified samples are decreased. Essentially, AdaBoost is an instance-based learning algorithm. To extend AdaBoost to multiclass classification problems, scholars have made different improvements. AdaBoost.M1 [54] adjusted the weight update function to adapt to multiclass classification: in each iteration, the weights of the incorrectly classified samples remain unchanged, while the weights of the correctly classified samples decrease. AdaBoost.OC [55] was proposed to solve multiclass classification problems by combining AdaBoost and error-correcting output codes [56]. To improve the computational efficiency, Hastie et al. [57] extended the AdaBoost algorithm with stagewise additive modeling using a multiclass exponential loss function (SAMME). SAMME has shown outstanding performance with low computational complexity and has become the mainstream multiclass variant of AdaBoost.
Boosting-based transfer learning algorithms are instance-based transfer learning approaches that utilize labeled data from the source domain to improve the classification performance in the target domain by a reweighting strategy. Dai et al. [58] proposed the popular boosting-based transfer learning algorithm TrAdaBoost and adapted it with SVM as the base learner for two-class text classification. The main principle of TrAdaBoost is the utilization of available source data sharing some similarities with the target data and characterizing data distribution differences by reweighting. Similar samples are screened out, and negative transfer is effectively avoided. To extend TrAdaBoost to multiclass classification problems, Li et al. [59] extended the conventional TrAdaBoost for sandstone microscopic image classification by applying the one-vs.-all method. However, extending the binary classification algorithm to multiclass classifications directly through the one-vs.-all or one-vs.-one [60] method will bring about data imbalance or high computational complexity. It is not the optimal solution for the multiclass classification problem. A more efficient multiclass TrAdaBoost algorithm is urgently needed. On the other hand, the base learner also greatly influences the performance of the overall TrAdaBoost algorithm. The effect of base learner is worthy of in-depth study and exploration, and there are currently few studies comparing the effects of the base learners on their classification tasks.
Materials and methods
In this section, the details of the dataset construction and proposed methods are illustrated in appropriate subheadings.
Data collection and pre-processing
Three subtypes of Longjing tea images from the geographical origins of the West Lake Zone (Wllj), Qiantang Zone (Qtlj), and Yuezhou Zone (Yzlj) were collected and pre-processed, which formed three tea quality datasets. Every subtype of Longjing tea is divided into four grades. According to GB/T 18650-2008, the Longjing tea geographical origins are in the center of Zhejiang Province, primarily confined by 28.87–30.55°N and 118.38–121.22° E. To verify the proposed method, the three datasets have different sample sizes. The Wllj dataset is regarded as a source domain (\(D_{S}\)) with more labeled images with ground truth. The Qtlj dataset and Yzlj dataset are regarded as target domains (\(D_{{T_{1} }}\),\(D_{{T_{2} }}\)) with few labeled images with ground truth. The details of the three datasets are shown in Table 1.
The tea image samples were collected in a dark room isolated from outside light. During each collection, 5 g of tea leaves were placed on the white platform (length: 200 mm, width: 200 mm). A CCD camera (OSEECAM H1600ST, Shenzhen Weishenshidai Technology Co., Ltd., Shenzhen, China) with a 5-mm lens was used to capture images from 120 mm above the sample. A ring LED light (produced by Shenzhen Weishenshidai Technology Co., Ltd., Shenzhen, China) is placed at the same level as the lens and coaxial with the lens. It provides uniform and stable white-light illumination for all image samples. The raw images were saved in JPG format with RGB color mode. The raw image resolution is 1920 × 1080. To eliminate the influence of background and possible distortion, a 600 × 600 region of interest (ROI) was extracted by positioning the central pixel of the raw image as the center for each sample (shown in Fig. 1). The ROI image samples are used as input to the proposed model. Examples of different Longjing tea datasets are shown in Fig. 2. It can be seen that there are slight differences in the appearance of different qualities of Longjing tea. The good-quality Longjing tea has a flat and smooth appearance, the single leaf is uniform and complete, and there is little slag. As the quality decreases, the color of Longjing tea becomes uneven, the shape of single leaves is irregular, sharp edges appear, and more slag appears.
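The ROI extraction described above amounts to a centre crop. A minimal sketch of the crop-box arithmetic follows; the function name `center_roi_box` is hypothetical and only the 1920 × 1080 raw size and 600 × 600 ROI size come from the text.

```python
def center_roi_box(width, height, roi=600):
    """Return (left, top, right, bottom) of a roi x roi square centred on the image."""
    if roi > width or roi > height:
        raise ValueError("ROI is larger than the image")
    left = (width - roi) // 2
    top = (height - roi) // 2
    return left, top, left + roi, top + roi


# Raw images are 1920 x 1080; the ROI is 600 x 600 around the central pixel.
box = center_roi_box(1920, 1080)
print(box)  # (660, 240, 1260, 840)
```

The resulting tuple follows the common (left, top, right, bottom) convention used by many image libraries, so it could be passed to a crop routine directly.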
For every class in each target domain, only 10 labeled image samples with ground truth can participate in model training and validation each time. The remaining 90 samples are used for model generalization performance evaluation. The training samples in the target domain are randomly selected. Considering that manual collection and labeling of samples is time consuming and labor intensive, we keep as few labeled image samples as possible in the target domains. In this way, the significance and value of our research are more prominent.
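The per-class split described above can be sketched as follows. This is an illustration only: the file names are hypothetical, the figure of 100 images per class is inferred from the 10 labeled plus 90 remaining samples mentioned in the text, and the fixed seed is an assumption made for reproducibility.

```python
import random


def split_target_domain(samples_by_class, n_labeled=10, seed=0):
    """Randomly pick n_labeled samples per class for training/validation;
    the remaining samples of each class are held out for testing."""
    rng = random.Random(seed)
    train, test = {}, {}
    for label, samples in samples_by_class.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        train[label] = shuffled[:n_labeled]
        test[label] = shuffled[n_labeled:]
    return train, test


# Hypothetical target-domain dataset: 4 grades, 100 images each.
data = {g: [f"grade{g}_{i:03d}.jpg" for i in range(100)] for g in range(1, 5)}
train, test = split_target_domain(data)
print({g: (len(train[g]), len(test[g])) for g in data})
```

Repeating the call with different seeds would correspond to drawing a new random training subset for each independent run.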
CNN model building and feature extraction
An end-to-end deep learning classification model is often the first choice for large-scale classification problems. However, building an end-to-end CNN requires a lot of data; otherwise, the classification model will fail due to over-fitting. The target domain datasets used in this study have limited annotated data, so they are not suitable for directly building a deep learning classification model. The construction of the classification model needs the help of the complementary source domain and the transfer learning strategy.
Extracting features from a CNN model to adapt to specific visual tasks is a common transfer learning strategy. In this approach, input images are given to a CNN model directly. The deep features of the CNN model are extracted from a particular layer, and a feature vector is obtained. After the deep feature extraction, classical machine learning algorithms are applied to develop the classification model. This approach has been successfully applied to some classification problems [61, 62]. Typically, CNN models for feature extraction are trained with the ImageNet dataset, and activation values of the fully connected layers of those pre-trained CNNs are obtained. However, ImageNet, as the source domain, is very different from the tea quality identification dataset, which is not conducive to improving performance [37, 63]. Therefore, it is necessary to retrain the model based on the source and target domains in this study.
We use all labeled samples in the source and target domains and combine them by label to obtain a hybrid training dataset. The CNN model trained with the hybrid training dataset contains information from both the source domain and the target domain, and the extracted deep features can better represent the similarity and difference between classes.
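Combining the two domains by label can be sketched as below. The function name `build_hybrid_dataset`, the grade keys, and the sample names are hypothetical; the sketch only illustrates that samples of the same grade from both domains end up in one class of the hybrid training dataset.

```python
def build_hybrid_dataset(source, target_labeled):
    """Merge labeled samples from the source and target domains class by class."""
    hybrid = {}
    for domain in (source, target_labeled):
        for label, samples in domain.items():
            hybrid.setdefault(label, []).extend(samples)
    return hybrid


# Hypothetical sample names; real inputs would be image paths or tensors.
source = {"grade1": ["wllj_a", "wllj_b", "wllj_c"], "grade2": ["wllj_d"]}
target = {"grade1": ["qtlj_a"], "grade2": ["qtlj_b"]}
hybrid = build_hybrid_dataset(source, target)
print(hybrid["grade1"])  # ['wllj_a', 'wllj_b', 'wllj_c', 'qtlj_a']
```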
Multiclass TrAdaBoost
The proposed multiclass TrAdaBoost algorithm extends the original TrAdaBoost proposed by Dai et al. [58] to the multiclass classification problem. Compared to the one-vs.-all method, the proposed algorithm has low computational complexity and avoids class imbalance. The key idea of TrAdaBoost is updating the sample weights of the source domain and target domain separately. The sample weights updating mechanism of TrAdaBoost is as follows:
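With the symbols defined in the following paragraph, and following Dai et al. [58], the update can be written as below. This is a reconstruction that assumes binary labels in \(\{0, 1\}\), so that \(|h_{t}(\boldsymbol{x}_{i}) - y(\boldsymbol{x}_{i})|\) equals 1 for a wrong prediction and 0 otherwise.

\[
w_{i}^{t+1} =
\begin{cases}
w_{i}^{t}\, \beta^{\,|h_{t}(\boldsymbol{x}_{i}) - y(\boldsymbol{x}_{i})|}, & 1 \le i \le m \\
w_{i}^{t}\, \beta_{t}^{\,|h_{t}(\boldsymbol{x}_{i}) - y(\boldsymbol{x}_{i})|}, & m + 1 \le i \le m + n
\end{cases}
\tag{1}
\]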
Here, m is the number of samples in the source domain, and n is the number of samples in the target domain. \(w_{i}^{t}\) is the weight of sample i at iteration t, \(\boldsymbol{x}_{i}\) is the feature for sample i extracted from the aforementioned CNN model, \(h_{t}(\boldsymbol{x}_{i})\) is the predicted label, and \(y(\boldsymbol{x}_{i})\) is the true label. The multiplier for source domain samples is defined as \(\beta = 1/\left(1 + \sqrt{2\log m / N}\right)\), where N is the maximum number of iterations. The multiplier for target domain samples is defined as \(\beta_{t} = (1 - \varepsilon_{t})/\varepsilon_{t}\), where \(\varepsilon_{t}\) is the overall error of \(h_{t}\) on all target domain samples at iteration t. In the iterative process, the weights of wrongly predicted source domain samples are decreased, and the weights of wrongly predicted target domain samples are increased. In contrast, the weights of correctly predicted samples are kept unchanged. The original TrAdaBoost has shown strong transfer learning ability, although only a few labeled samples are in the target domain [64,65,66].
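One iteration of this binary weight update can be sketched in a few lines. The sketch assumes labels in {0, 1} and computes \(\varepsilon_{t}\) as the weight-normalized error on the target samples, as in Dai et al. [58]; the function name and the toy inputs are hypothetical.

```python
import math


def tradaboost_update(weights, preds, labels, m, n_iters):
    """One binary TrAdaBoost weight update.

    The first m entries are source-domain samples, the rest are target-domain
    samples. preds and labels hold values in {0, 1}.
    """
    target = range(m, len(weights))
    total = sum(weights[i] for i in target)
    # Assumption: weighted error of h_t on the target-domain samples.
    eps = sum(weights[i] * abs(preds[i] - labels[i]) for i in target) / total
    if not 0.0 < eps < 0.5:
        raise ValueError("target error must lie in (0, 0.5)")
    beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(m) / n_iters))  # source, < 1
    beta_t = (1.0 - eps) / eps                                    # target, > 1
    updated = []
    for i, w in enumerate(weights):
        miss = abs(preds[i] - labels[i])  # 1 if wrong, 0 if correct
        updated.append(w * (beta ** miss if i < m else beta_t ** miss))
    return updated, eps


# Toy example: 4 source + 4 target samples, one mistake in each domain.
w, eps = tradaboost_update(
    weights=[1.0] * 8,
    preds=[1, 0, 0, 0, 1, 0, 0, 0],
    labels=[0, 0, 0, 0, 0, 0, 0, 0],
    m=4,
    n_iters=10,
)
print(eps, w)
```

In the toy run, the misclassified source sample loses weight, the misclassified target sample gains weight, and all correctly classified samples keep their weights.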
Similar to the critical idea of the original TrAdaBoost, we extend TrAdaBoost to multiclass classification by modifying the sample weight updating mechanism in the source domain and target domain separately. The small amount of labeled data involved in training has the same distribution as the test data for the target domain. Hence, the SAMME [57] is adopted, which is the same as the common multiclass AdaBoost:
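In the standard SAMME form, and with \(\alpha_{t}\) as defined in the next paragraph, the target-domain update can be written as below. This is a reconstruction from the SAMME weight update rather than a verbatim copy of the published equation; \(\mathbb{I}[\cdot]\) denotes the indicator function.

\[
w_{i}^{t+1} = w_{i}^{t} \exp\!\left( \alpha_{t}\, \mathbb{I}\!\left[ h_{t}(\boldsymbol{x}_{i}) \ne y(\boldsymbol{x}_{i}) \right] \right), \quad m + 1 \le i \le m + n
\tag{2}
\]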
Here, \(\alpha_{t}\) is the multiclass weight updating parameter based on the exponential loss function, which is defined as \(\alpha_{t} = \log \left( (1 - \varepsilon_{t})/\varepsilon_{t} \right) + \log (K - 1)\), where K is the number of classes. By comparing the expressions for \(\alpha_{t}\) and \(\beta_{t}\) above, it can be found that \(\alpha_{t}\) retains the same error calculation method as \(\beta_{t}\). For the source domain, we also adopted the SAMME as follows:
Al-Stouhi et al. [67] found that the rapid weight drop occurred in the original TrAdaBoost for correctly predicted source domain samples and proposed the correction factor \(C_{t}\) to alleviate the weight-drift effect:
By combining Eq. (2) for the target domain, Eq. (3) for the source domain, and the correction factor in Eq. (4), the modified sample weight updating mechanism for multiclass TrAdaBoost is obtained:
By incorporating the modified sample weights updating mechanism into the original TrAdaBoost, the multiclass TrAdaBoost is obtained as follows:
In Algorithm 1, labeled dataset \(T_{tar}\) and unlabeled dataset S have the same data distribution, and both belong to the target domain. The base classifier Learner can be any simple multiclass machine learning algorithm. The number of samples in \(T_{tar}\) is much less than that in \(T_{src}\) (\(n \ll m\)). In short, the multiclass TrAdaBoost utilizes a small amount of labeled data in the target domain, supplemented by a large amount of relevant source domain data, to achieve instance-based transfer learning. Our contribution is to extend the sample weights updating mechanism with the help of the SAMME algorithm and the correction factor \(C_{t}\). The rest maintains the simplicity and ease of use of the original algorithm.
![figure a](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs40747-023-01024-4/MediaObjects/40747_2023_1024_Figa_HTML.png)
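The SAMME parameter \(\alpha_{t}\) used in the target-domain update can be computed as in the minimal sketch below (the function name is hypothetical). For K = 2 the extra term \(\log(K - 1)\) vanishes, so \(\alpha_{t} = \log \beta_{t}\) and the binary case is recovered; for K classes, \(\alpha_{t}\) stays positive as long as the error is below \(1 - 1/K\), i.e., better than random guessing.

```python
import math


def samme_alpha(eps, n_classes):
    """SAMME weight-updating parameter: log((1 - eps) / eps) + log(K - 1)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    if n_classes < 2:
        raise ValueError("at least two classes are required")
    return math.log((1.0 - eps) / eps) + math.log(n_classes - 1)


# Four tea grades (K = 4): an error of 0.6 is still better than chance (0.75).
alpha_four = samme_alpha(0.6, 4)
# Binary case (K = 2): alpha reduces to log(beta_t).
alpha_two = samme_alpha(0.25, 2)
print(alpha_four, alpha_two)
```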
The framework of the proposed approach
This research aims to achieve accurate quality identification of Longjing tea with limited training samples. With the power of transfer learning, a complementary dataset with more labeled Longjing tea images from another geographical origin is utilized as the source domain to boost the classification performance. To benefit from the available Longjing tea quality datasets from different geographical origins and minimize the negative effect of distribution dissimilarity, we design a transfer learning framework to incorporate the source domain into building the classification model, as described in Fig. 3. Dataset 1 is the source domain dataset, which acts as a complementary dataset. Dataset 2 is the target domain dataset, splitting into training dataset 2a and testing dataset 2b. The two main steps of the proposed instance-based deep transfer learning method are CNN feature extraction and multiclass TrAdaBoost. The features of each instance are extracted by the CNN model trained by dataset 1 and dataset 2a. The extracted feature vector represents a single instance and is imported into the proposed multiclass TrAdaBoost algorithm for classification. This is how the instance-based deep transfer learning method works.
Fig. 3 The framework of the proposed approach. In the training phase (a), the CNN is trained using samples from the source and target domains (Dataset 1 and Dataset 2a). The feature vectors are extracted and used to train the multiclass TrAdaBoost algorithm. In the testing phase (b), the unknown samples in the target domain Dataset 2b are classified using the trained models.
Performance metrics
Tea quality identification is a multiclass classification task. It is necessary to provide an appropriate evaluation that allows observation of an extensive catalog of configurations. The model performance is evaluated by accuracy, precision, recall, and F1 score, defined in Eqs. (6) to (9) [68, 69].
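In their standard form, and using the symbols explained in the next paragraph, the four metrics read as follows (a reconstruction from the usual definitions, numbered in the order in which the metrics are listed above):

\[
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{6}
\]
\[
\text{Precision} = \frac{TP}{TP + FP} \tag{7}
\]
\[
\text{Recall} = \frac{TP}{TP + FN} \tag{8}
\]
\[
F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{9}
\]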
where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively. Precision is the fraction of samples predicted as positive that are truly positive, and recall is the fraction of truly positive samples in the test set that the classifier retrieves. The F1 score is the harmonic mean of precision and recall and therefore accounts for both. Since each dataset has four balanced classes, macro-F1 was used for comparison. The classes of the target domain datasets used in our experiments are well balanced (shown in Table 1), so the multiway accuracy is a good overall measure of accuracy [70].
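As an illustration of how these quantities can be obtained in the multiclass case, the following minimal sketch derives per-class precision, recall, and F1 from a confusion matrix in a one-vs-rest manner and averages F1 over classes (macro-F1). The function names and the toy matrix are hypothetical and are not taken from the experiments.

```python
def per_class_metrics(cm):
    """Return (precision, recall, f1) per class.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j.
    """
    k = len(cm)
    results = []
    for c in range(k):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(k)) - tp  # predicted c, true class differs
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return results


def macro_f1(cm):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_metrics(cm)
    return sum(f1 for _, _, f1 in scores) / len(scores)


def overall_accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


# Toy two-class example (hypothetical counts).
toy_cm = [[2, 0],
          [1, 1]]
toy_macro_f1 = macro_f1(toy_cm)
toy_accuracy = overall_accuracy(toy_cm)
```

Because macro-F1 weights every class equally, it agrees well with overall accuracy only when the classes are balanced, which is the situation described above.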
All experiments were performed with ten independent runs, and the mean and standard deviation of the results were recorded. For visualization, a stacked (summed) confusion matrix is presented as a summary.
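The aggregation described here can be sketched as follows. This is only an illustration: the helper names are hypothetical, the numbers are toy values, and the use of the sample standard deviation (rather than the population one) is an assumption, since the text does not state which was used.

```python
import statistics


def stack_confusion(matrices):
    """Element-wise sum of equally sized confusion matrices."""
    k = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) for j in range(k)]
            for i in range(k)]


def summarize(values):
    """Mean and sample standard deviation of per-run scores (assumed choice)."""
    return statistics.mean(values), statistics.stdev(values)


# Toy per-run results (hypothetical values, not from the paper).
run_matrices = [[[1, 0], [0, 1]],
                [[0, 1], [1, 0]]]
run_accuracies = [0.90, 0.92, 0.94]

stacked = stack_confusion(run_matrices)
mean_acc, std_acc = summarize(run_accuracies)
```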
Results and discussion
Any CNN architecture can be used as a deep feature extractor. Because the final goal is to deploy the whole algorithm on an embedded terminal, the computational cost of the CNN model should be minimized. Exploring the classification and transfer learning performance of deep features extracted by lightweight models therefore has more academic and application value [71, 72]. Hence, several powerful and efficient lightweight CNN models are used as the backbone feature extractor in our research.
All experiments were run under Windows 10 on a personal computer with an 8-core Intel Core i5-8500 CPU at 3.00 GHz, 16 GB of DDR4 SDRAM, and an NVIDIA GeForce GTX 1660 GPU with 6 GB of memory and CUDA 10.1. The CNN models were implemented in PyTorch 1.9.0 [73] with Python 3.6.13. In the model training phase, the training dataset was augmented by rotating the images by an angle randomly selected from the set {0, 90, 180, 270}. The image samples were resized to 224 × 224; 80% of the dataset was used for training, and the rest was used for validation. At this stage the generalization ability of the model is not under evaluation, so no test set is used, and the validation set serves only to monitor the convergence of the model. The stochastic gradient descent (SGD) optimizer and the categorical cross-entropy loss function were used. After some preliminary experiments, a batch size of 32 and a learning rate of 0.0001 were chosen as hyper-parameters. Training was stopped and the parameters were frozen after 30 epochs.
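The rotation augmentation can be illustrated with a small dependency-free sketch. It treats an image as a 2-D grid of pixel values and is not the code used in the experiments. The function names are hypothetical, and the rotation direction (clockwise) is an assumption, as the text only specifies the set of angles.

```python
import random

ANGLES = (0, 90, 180, 270)  # the set of angles stated in the text


def rotate90_cw(image):
    """Rotate a 2-D grid (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def random_rotation(image, rng):
    """Rotate the image by an angle drawn uniformly from ANGLES."""
    angle = rng.choice(ANGLES)
    rotated = [list(row) for row in image]
    for _ in range(angle // 90):
        rotated = rotate90_cw(rotated)
    return rotated, angle


rng = random.Random(0)               # fixed seed for reproducibility
toy_image = [[1, 2],
             [3, 4]]                 # toy 2 x 2 "image"
augmented, chosen_angle = random_rotation(toy_image, rng)
```

Rotations by multiples of 90 degrees keep a square 224 × 224 input on the same pixel grid, so no interpolation or padding is needed.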
For image pre-processing and the implementation of the multiclass TrAdaBoost algorithm, the main supporting libraries are OpenCV-Python 4.5, NumPy 1.19.2, and scikit-learn 0.24.1. For the base learner in multiclass TrAdaBoost, preliminary experiments were carried out with scikit-learn to select the hyper-parameters. The best results are taken as the final results.
Performance of the CNN models with different training datasets
In this experiment, the CNN models were trained either with the source domain only or with samples from both the source and target domains. To verify the effect of the proposed method when very few labeled samples are available in the target domains, only ten samples in the target domains are used for training. The remaining 90 samples are used as the test set to evaluate the generalization ability of the models, that is, the transfer learning ability.
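The data arrangement described above can be sketched as follows. This is a minimal illustration with hypothetical helper names and placeholder sample identifiers. It assumes a pool of 100 target samples and a random split, neither of which is spelled out in the text (in particular, whether the split is per class).

```python
import random


def split_target(samples, n_train=10, seed=0):
    """Randomly hold out n_train target samples for training; the rest are for testing."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def hybrid_training_set(source, target_train):
    """Concatenate all source samples with the few labeled target samples."""
    return [(s, "src") for s in source] + [(t, "tar") for t in target_train]


# Placeholder identifiers standing in for images (hypothetical sizes).
source_pool = ["src_%d" % i for i in range(300)]
target_pool = ["tar_%d" % i for i in range(100)]

target_train, target_test = split_target(target_pool)
hybrid = hybrid_training_set(source_pool, target_train)
```

The test portion is never mixed into the hybrid set, which keeps the reported accuracy a measure of transfer to unseen target samples.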
The classification models and training strategies are shown in Table 2. We used four common and powerful lightweight network architectures: MobileNet V2 [74], MobileNet V3-large [75], MNasNet [76], and ShuffleNet V2 [77]. We did not build CNN models using only the target domain because too few labeled images with ground truth are available. As a comparison, color and texture features based on the color histogram and gray-level co-occurrence matrix (GLCM) were extracted, and an SVM classifier, which is suitable for low-shot classification problems, was trained and validated on them. More details and parameter settings of the comparison method can be found in [78]. The overall accuracy and F1 score of the different methods and training strategies are also shown in Table 2. The best results are in bold.
The results show that hand-engineered features combined with the SVM classifier are not satisfactory. Because too few labeled images with ground truth are available, the accuracy of the SVM classifier trained only on the target domain datasets is only 78.2% and 74.3%, respectively. Directly adding the source domain dataset as supplementary training data harms the generalization of the SVM classifier and gives even worse results. Using a lightweight CNN instead of hand-engineered feature extraction greatly improves the classification accuracy. The classification results of the different CNN models show that when the model is trained with hybrid datasets, the classification accuracy is much higher than that of the model trained only with source domain data. The CNN models have enough capacity to store the classification information from both the source domain and the target domain without harming generalization. Among the four lightweight CNNs, MobileNet V2 achieves the best overall accuracy and F1 scores. The precision and recall values of every single class are shown in Fig. 4 and Fig. 5. For readability, the error bars are not displayed in the single-class precision and recall histograms. The overall accuracy values obtained on the Qtlj and Yzlj datasets are 83.3% and 79.5%, respectively. For the MobileNet V2 model trained only with the source domain dataset, the classification task on the target domain is a kind of zero-shot learning (ZSL) problem. Although the overall accuracy is only 64.9% and 55.5% in the two target domains, these values are much higher than the accuracy of random selection (25%), which shows the transfer learning ability of the deep neural network itself. In summary, the CNN model trained with the hybrid datasets from the source and target domains can account for the difference in data distribution between domains and has a robust feature extraction capability.
Identification results of deep features and multiclass TrAdaBoost
In this experiment, the performance of the proposed multiclass TrAdaBoost was evaluated. Conventional AdaBoost was also evaluated to examine the transfer learning ability of the multiclass TrAdaBoost. Since the MobileNet V2 model achieved the best results in Sect. “Performance of the CNN models with different training datasets”, it was used as the feature extractor. We slightly modified the MobileNet V2 model to adapt it to our vision task: the classifier of the trained model was removed, and the remaining part was used to extract features from the image samples. The extracted feature dimension is 1280. The features were used as the input of the subsequent classifiers. As a comparison, color and texture features (the same as those mentioned in Sect. “Performance of the CNN models with different training datasets”) were also extracted. We fine-tuned the hyper-parameter settings on the basis of [59, 79]. For both classifiers, the base learner was set as a decision tree (DT). The maximum depth of each tree was set to 2, the number of trees was set to 50, and the learning rate was set to 1. The overall accuracy and F1 score of the different features combined with AdaBoost and multiclass TrAdaBoost are shown in Table 3, with the best results in bold. The precision and recall values of every single class of the MobileNet-based methods are shown in Fig. 6 and Fig. 7. For readability, the error bars are not displayed in the single-class precision and recall histograms.
It should be noted that training the MobileNet V2 model and training the boosting-based classifiers are two separate steps. The ‘training data’ in Table 3 refers to the dataset composition used to build the MobileNet V2 model (the same below), whereas the boosting-based classifiers are always trained on the combined dataset, which consists of the source domain dataset and the target domain dataset (same as \(src + tar\)). The samples used for testing are only used to evaluate the algorithm, and they are guaranteed not to participate in any training step.
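To make the instance reweighting idea concrete, the sketch below shows one boosting round of a TrAdaBoost-style weight update with a SAMME learner weight. It is an assumption-laden illustration, not the exact rule of the proposed algorithm. It assumes the source-weight factor of the original TrAdaBoost, \(\beta = 1/(1 + \sqrt{2 \ln n / N})\) for \(n\) source instances and \(N\) rounds, and the SAMME weight \(\alpha = \ln((1-\varepsilon)/\varepsilon) + \ln(K-1)\), with the error \(\varepsilon\) measured on target instances only. All names are hypothetical.

```python
import math


def update_weights(weights, is_source, misclassified, n_rounds, n_classes):
    """One round of an assumed TrAdaBoost-style update with a SAMME learner weight.

    weights       -- current instance weights
    is_source     -- True for source-domain instances, False for target
    misclassified -- True where the current base learner is wrong
    Requires at least one source and one target instance.
    """
    n_source = sum(is_source)
    target_total = sum(w for w, s in zip(weights, is_source) if not s)
    target_wrong = sum(w for w, s, m in zip(weights, is_source, misclassified)
                       if not s and m)
    # Weighted error on the target domain only, clipped away from 0 and 1.
    eps = min(max(target_wrong / target_total, 1e-10), 1 - 1e-10)

    alpha = math.log((1 - eps) / eps) + math.log(n_classes - 1)       # SAMME
    beta_src = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n_source) / n_rounds))

    new_weights = []
    for w, s, m in zip(weights, is_source, misclassified):
        if m:
            # Misclassified source instances are down-weighted (less relevant),
            # misclassified target instances are up-weighted (harder cases).
            w = w * beta_src if s else w * math.exp(alpha)
        new_weights.append(w)
    total = sum(new_weights)
    return [w / total for w in new_weights], alpha


# Toy round: two source and two target instances, four classes, 50 rounds.
new_w, alpha = update_weights(
    weights=[0.25, 0.25, 0.25, 0.25],
    is_source=[True, True, False, False],
    misclassified=[True, False, True, False],
    n_rounds=50,
    n_classes=4,
)
```

In this toy round the misclassified source instance ends up with a smaller weight than the correctly classified one, while the misclassified target instance gains weight, which is the behaviour that distinguishes this family of methods from conventional AdaBoost.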
The comparative results show that regardless of which MobileNet V2 model is used as the feature extractor, the multiclass TrAdaBoost significantly outperforms the conventional AdaBoost algorithm in overall and single-class performance, which verifies the transfer learning ability of the multiclass TrAdaBoost. Compared with the SVM classifier without transfer learning ability in Table 2, color and texture features combined with the proposed multiclass TrAdaBoost algorithm give higher classification accuracy. At the same time, the accuracy obtained using color and texture features is still far lower than that of deep transfer learning, which indicates that the deep learning model can overcome the differences between domains to a certain extent and extract domain-invariant deep features. It is worth noting that, compared with using the MobileNet V2 model alone (shown in Table 2), the overall performance improved after adding the multiclass TrAdaBoost algorithm. The best classification results increased from 83.3% and 79.5% to 88.1% and 83.3%, respectively. It can be concluded that the transfer learning ability of the instance-based deep transfer learning method proposed in this paper comes from both the CNN models trained with the hybrid dataset and the multiclass TrAdaBoost, and that the two components work in synergy.
Identification results of multiclass TrAdaBoost combined with different classifiers
In ensemble learning methods, the classification performance of ensemble classifiers depends not only on the integration strategy but also on the ability of the base learner [80]. As a boosting-based algorithm, the multiclass TrAdaBoost proposed in this paper has similar characteristics. Some scholars have found that the performance of TrAdaBoost with different base learners differs considerably on some classification tasks [59, 81]. Hence, it is necessary to investigate the impact of different classifiers on the algorithm's performance to obtain optimal results. In addition to the DT mentioned above, another three standard and powerful machine learning classifiers, naïve Bayes (NB), logistic regression (LR), and SVM with a linear kernel, are used as base learners for the multiclass TrAdaBoost construction. For the boosting algorithm, we use the same settings as in Sect. “Identification results of deep features and multiclass TrAdaBoost”. Different base learners have different hyper-parameter settings: for DT, the maximum depth is set to 2; for linear SVM and LR, the penalty coefficient C is set to 0.5; and the NB algorithm has no parameters to set. We use grid search to tune the hyper-parameters. The MobileNet V2 model trained with the hybrid datasets is used as the feature extractor. Other related parameter settings are the same as those in Sect. “Identification results of deep features and multiclass TrAdaBoost”. The overall accuracy and F1 score of every base learner are shown in Table 4, with the best results in bold. The precision and recall values of every single class are shown in Fig. 8 and Fig. 9. For readability, the error bars are not displayed in the single-class precision and recall histograms.
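Grid search itself is a simple exhaustive procedure, sketched generically below. The function names are hypothetical, and the toy scoring function merely stands in for a validation score of a base learner. The candidate values of C are illustrative and are not the grid used in the experiments.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every parameter combination and return the best one.

    param_grid -- dict mapping parameter name to a list of candidate values
    score_fn   -- callable taking a dict of parameters, returning a score
                  where higher is better (e.g. validation accuracy)
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy scorer that peaks at C = 0.5 (a stand-in for a real validation score).
def toy_score(params):
    return -abs(params["C"] - 0.5)


best_params, best_score = grid_search({"C": [0.1, 0.5, 1.0, 10.0]}, toy_score)
```

In practice the scoring function would train the base learner inside the boosting procedure and return its validation performance, so the cost grows with the product of the grid sizes.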
The results of multiclass TrAdaBoost vary with the base learner. The transfer learning ability of CNN deep features and multiclass TrAdaBoost has been shown above, and the extracted features should be informative and discriminative. Therefore, the variation in the results may lie in how well the base learner matches the features. The single-class and overall performance of multiclass TrAdaBoost with NB learners is much lower than that of the other methods. NB is a classification model with a simple mechanism that often works well. However, it relies heavily on the independence of the features, which is difficult to satisfy for structured, high-dimensional deep features extracted from complex CNN models [82]. LR is close to NB in its mechanism but does not depend as strongly on feature independence. Hence, multiclass TrAdaBoost with LR achieves better results, close to those of multiclass TrAdaBoost with DT. Multiclass TrAdaBoost with linear SVM learners obtains the best overall and single-class results and outperforms the other methods. SVM performs strongly on high-dimensional, low-shot data [83], which suits our task.
Comparison results summary
Finally, the performance of all the methods mentioned in Sect. “Results and discussion” is compared. As shown in Fig. 10, the F1 score, which represents classification performance reasonably well, is selected to compare the different methods. To match the data in Tables 2–4, the error bars are shown in Fig. 10. The highest overall F1 scores on the two target domains, Qtlj and Yzlj, are 93.6% and 91.5%, respectively. The corresponding method is the MobileNet V2 feature extractor trained with the hybrid datasets combined with the multiclass TrAdaBoost with linear SVM learner. The stacked confusion matrix of the test set for the best-performing method is shown in Fig. 11. The lowest recall values of the best method on the two target domains both appear on Grade 2 and are 89.0% and 85.6%, respectively. The recall values of the other single classes are all above 93%. The misclassification is mainly due to the high degree of similarity between grades. For example, the appearance of Grade 2 in Qtlj and Yzlj is very similar to that of other grades. Even with the naked eye, it is difficult to tell them apart, which poses a significant challenge to the algorithm. In summary, we can conclude that with the help of image samples of Longjing tea from other geographical origins, the proposed instance-based deep transfer learning method can accurately identify the quality of Longjing tea in the current geographical origin with limited samples.
Efficiency and performance are equally important from an application perspective, so the inference time of the different methods should be taken into account. The main processes that affect inference efficiency are feature extraction (hand-engineered features or deep features) and classification (SVM, AdaBoost, and multiclass TrAdaBoost). For a fair comparison, the inference process is run on the CPU, and the single-image inference times of each method are shown in Table 5.
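A single-image inference time of this kind can be measured with a simple wall-clock loop, as in the following sketch. The helper name, the warm-up count, and the dummy predictor are hypothetical, and the timing protocol actually used for Table 5 is not described in the text.

```python
import time


def mean_inference_ms(predict, images, warmup=3):
    """Average wall-clock time per image in milliseconds.

    predict -- callable that processes one image (feature extraction
               plus classification)
    images  -- non-empty list of inputs
    warmup  -- number of untimed calls made first to exclude one-off
               start-up costs
    """
    for image in images[:warmup]:
        predict(image)
    start = time.perf_counter()
    for image in images:
        predict(image)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / len(images)


# Dummy predictor standing in for the real pipeline.
def dummy_predict(image):
    return sum(image)


latency_ms = mean_inference_ms(dummy_predict, [[1, 2, 3]] * 100)
```

Timing feature extraction and classification separately with the same loop would show which stage dominates the total.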
Most of the time is spent on feature extraction: it takes about 59 ms to extract the hand-engineered color and texture features of a single image. Extracting the deep features with the MobileNet V2 backbone network takes only 7 ms more, so there is no obvious increase in time overhead, while the performance is greatly improved (refer to Fig. 10). This again shows the advantage of the lightweight MobileNet V2 architecture, which exploits the strengths of deep neural networks while maintaining good real-time performance. The weight update strategy of the proposed multiclass TrAdaBoost is highly similar to that of the SAMME-based AdaBoost classifier, so their time consumption is essentially the same. In addition, because the base learners differ in complexity, combining them with multiclass TrAdaBoost leads to small differences in time consumption. In the experimental results, this difference is only 0.1–0.2 ms per image, which is negligible. Therefore, the best-performing method (the MobileNet V2 feature extractor trained with the hybrid datasets combined with the multiclass TrAdaBoost with linear SVM learner) maintains good real-time performance.
Conclusion
Automatic identification of Longjing tea quality is of great significance to consumers and proprietors. In this paper, a novel instance-based deep transfer learning method for Longjing tea quality identification was proposed. MobileNet V2 was modified to match our vision task and trained using the hybrid training dataset containing all labeled samples from source and target domains. The trained model was used as a feature extractor. Then deep features from different domains were extracted by the trained MobileNet V2 model and imported into the proposed multiclass TrAdaBoost algorithm to build a classification model. To validate the proposed method, three Longjing tea quality datasets from three different geographical origins were collected, one of which contains many labeled image samples, and the other two have very limited labeled image samples. The main results are as follows:
(1) The MobileNet V2 model is trained using the hybrid training dataset containing all labeled samples from source and target domains. The trained MobileNet V2 model is used as a feature extractor instead of directly using the pre-trained model. Compared with traditional image processing combined with pattern recognition methods and other lightweight CNN models, the MobileNet V2 model trained with the hybrid dataset from the source and target domains has better identification results. The CNN feature extractor can extract high-level features and maintain transferability.
(2) Referencing the reweighting idea and SAMME multiclass classification strategy, an instance-based transfer learning algorithm named multiclass TrAdaBoost is proposed. The proposed algorithm can adapt to multiclass classification tasks, has lower computational complexity, and avoids data imbalance. When combined with the deep features extracted from MobileNet V2 model, the overall accuracy values of Qtlj and Yzlj reached 88.3% and 83.1%, respectively. The results show that deep features combined with multiclass TrAdaBoost can achieve a great transfer learning effect on different target domain datasets.
(3) The effect of the base learner on the performance of the multiclass TrAdaBoost is also explored. The experimental results demonstrated that the deep features extracted from MobileNet V2 model combined with the multiclass TrAdaBoost with SVM learner obtain 93.6% and 91.5% accuracy for two target domain datasets, which outperforms all other methods and achieves the global optimal identification results. In addition, real-time performance is also well maintained.
With the addition of the deep features, the instance-based multiclass TrAdaBoost shows strong performance and constitutes an instance-based deep transfer learning method. In summary, we can conclude that with the help of image samples of Longjing tea from other geographical origins, the proposed instance-based deep transfer learning method can accurately identify the quality of Longjing tea in the current geographical origin with limited samples. This transfer learning method would substantially shorten the data-collecting time and save human resources. It also provides a reliable vision-based tea quality identification method for relevant personnel, even if the appearance of tea between different grades is very similar.
In future work, we plan to expand the transfer learning-based tea quality identification method. To improve the transfer learning ability, appropriate distance measures can be introduced to minimize the domain distance and achieve higher-order transfer learning (e.g., feature-based transfer learning). From the perspective of the application, we will try to embed the algorithm in a mobile end device for further testing.
Availability of data and materials
The datasets will be available at https://github.com/op99pp/Longjing_tea_dataset.
References
Hong Z, He Y (2020) Rapid and nondestructive discrimination of geographical origins of longjing tea using hyperspectral imaging at two spectral ranges coupled with machine learning methods. Appl Sci 10:1–12. https://doi.org/10.3390/app10031173
Wang X, Gu Y, Liu H (2021) A transfer learning method for the protection of geographical indication in China using an electronic nose for the identification of Xihu Longjing tea. IEEE Sens J 21:8065–8077. https://doi.org/10.1109/JSEN.2020.3048534
Wang J, Wei ZB (2015) The classification and prediction of green teas by electrochemical response data extraction and fusion approaches based on the combination of e-nose and e-tongue. RSC Adv 5:106959–106970. https://doi.org/10.1039/c5ra17978e
Li L, Xie S, Ning J et al (2019) Evaluating green tea quality based on multisensor data fusion combining hyperspectral imaging and olfactory visualization systems. J Sci Food Agric 99:1787–1794. https://doi.org/10.1002/jsfa.9371
Xu M, Wang J, Gu S (2019) Rapid identification of tea quality by E-nose and computer vision combining with a synergetic data fusion strategy. J Food Eng 241:10–17. https://doi.org/10.1016/j.jfoodeng.2018.07.020
Bakhshipour A, Zareiforoush H, Bagheri I (2020) Application of decision trees and fuzzy inference system for quality classification and modeling of black and green tea based on visual features. J Food Meas Charact 14:1402–1416. https://doi.org/10.1007/s11694-020-00390-8
Jiang H, Xu W, Chen Q (2020) Determination of tea polyphenols in green tea by homemade color sensitive sensor combined with multivariate analysis. Food Chem 319:126584. https://doi.org/10.1016/j.foodchem.2020.126584
Ghazal S, Qureshi WS, Khan US et al (2021) Analysis of visual features and classifiers for fruit classification problem. Comput Electron Agric 187:106267. https://doi.org/10.1016/j.compag.2021.106267
Hu G, Wu H, Zhang Y, Wan M (2019) A low shot learning method for tea leaf’s disease identification. Comput Electron Agric. https://doi.org/10.1016/j.compag.2019.104852
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444
Albattah W, Nawaz M, Javed A et al (2022) A novel deep learning method for detection and classification of plant diseases. Complex Intell Syst 8:507–524. https://doi.org/10.1007/s40747-021-00536-1
Albattah W, Masood M, Javed A et al (2022) Custom CornerNet: a drone-based improved deep learning technique for large-scale multiclass pest localization and classification. Complex Intell Syst. https://doi.org/10.1007/s40747-022-00847-x
Espejo-Garcia B, Mylonas N, Athanasakos L et al (2020) Towards weeds identification assistance through transfer learning. Comput Electron Agric 171:105306. https://doi.org/10.1016/j.compag.2020.105306
Hu G, Yang X, Zhang Y, Wan M (2019) Identification of tea leaf diseases by using an improved deep convolutional neural network. Sustain Comput Informatics Syst 24:100353. https://doi.org/10.1016/j.suscom.2019.100353
Peng J, Zou B, He X, Zhu C (2022) Hybrid attention network with appraiser-guided loss for counterfeit luxury handbag detection. Complex Intell Syst 8:2371–2381. https://doi.org/10.1007/s40747-021-00633-1
Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22:1345–1359
Dey B, Masum Ul Haque M, Khatun R, Ahmed R (2022) Comparative performance of four CNN-based deep learning variants in detecting Hispa pest, two fungal diseases, and NPK deficiency symptoms of rice (Oryza sativa). Comput Electron Agric 202:107340. https://doi.org/10.1016/j.compag.2022.107340
Nandhini M, Kala KU, Thangadarshini M, Madhusudhana Verma S (2022) Deep Learning model of sequential image classifier for crop disease detection in plantain tree cultivation. Comput Electron Agric 197:106915. https://doi.org/10.1016/j.compag.2022.106915
Wang J, Zhang C, Yan T et al (2022) A cross-domain fruit classification method based on lightweight attention networks and unsupervised domain adaptation. Complex Intell Syst. https://doi.org/10.1007/s40747-022-00955-8
Noor A, Zhao Y, Koubaa A et al (2020) Automated sheep facial expression classification using deep transfer learning. Comput Electron Agric 175:105528. https://doi.org/10.1016/j.compag.2020.105528
Zhang W, Deng L, Zhang L, Wu D (2022) A survey on negative transfer. IEEE/CAA J Autom Sin. https://doi.org/10.1109/JAS.2022.106004
Gill GS, Kumar A, Agarwal R (2011) Monitoring and grading of tea by computer vision - a review. J Food Eng 106:13–19. https://doi.org/10.1016/j.jfoodeng.2011.04.013
Gill GS, Kumar A, Agarwal R (2013) Nondestructive grading of black tea based on physical parameters by texture analysis. Biosyst Eng 116:198–204. https://doi.org/10.1016/j.biosystemseng.2013.08.002
Bakhshipour A, Sanaeifar A, Payman SH, de la Guardia M (2018) Evaluation of data mining strategies for classification of black tea based on image-based features. Food Anal Methods 11:1041–1050. https://doi.org/10.1007/s12161-017-1075-z
Liu C, Lu W, Gao B et al (2020) Rapid identification of chrysanthemum teas by computer vision and deep learning. Food Sci Nutr 8:1968–1977. https://doi.org/10.1002/fsn3.1484
Zhang YD, Muhammad K, Tang C (2018) Twelve-layer deep convolutional neural network with stochastic pooling for tea category classification on GPU platform. Multimed Tools Appl 77:22821–22839. https://doi.org/10.1007/s11042-018-5765-3
Chen J, Liu Q, Gao L (2019) Visual tea leaf disease recognition using a convolutional neural network model. Symmetry (Basel). https://doi.org/10.3390/sym11030343
Kimutai G, Ngenzi A, Ngoga SR et al (2021) An internet of things (IoT)-based optimum tea fermentation detection model using convolutional neural networks (CNNs) and majority voting techniques. J Sensors Sens Syst 10:153–162. https://doi.org/10.5194/jsss-10-153-2021
Donahue J, Jia Y, Vinyals O et al (2014) DeCAF: a deep convolutional activation feature for generic visual recognition. In: Proceedings of the 31st international conference on machine learning. PMLR, pp 647–655
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). pp 770–778
Mehdipour Ghazi M, Yanikoglu B, Aptoula E (2017) Plant identification using deep neural networks via optimization of transfer learning parameters. Neurocomputing 235:228–235. https://doi.org/10.1016/j.neucom.2017.01.018
Jiang J, Zhai C (2007) Instance weighting for domain adaptation in NLP. ACL 2007 Proc 45th Annu Meet Assoc Comput Linguist, Prague, Czech Republic, June 23–30
Khan MNA, Heisterkamp DR (2016) Adapting instance weights for unsupervised domain adaptation using quadratic mutual information and subspace learning. In: Proceedings - international conference on pattern recognition. Institute of electrical and electronics engineers Inc., pp 1560–1565
Liao X, Xue Y, Carin L (2005) Logistic regression with an auxiliary data source. In: ICML 2005 - proceedings of the 22nd international conference on machine learning. Association for computing machinery, New York, New York, USA, pp 505–512
Tan B, Song Y, Zhong E, Yang Q (2015) Transitive transfer learning. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining. Association for computing machinery, New York, NY, USA. pp 1155–1164
Tan B, Zhang Y, Pan S, Yang Q (2017) Distant domain transfer learning. Proc AAAI Conf Artif Intell. https://doi.org/10.1609/aaai.v31i1.10826
Yosinski J, Clune J, Bengio Y, Lipson H (2014) How transferable are features in deep neural networks? In: Proceedings of the 27th international conference on neural information processing systems. pp 3320–3328
Coulibaly S, Kamsu-Foguem B, Kamissoko D, Traore D (2019) Deep neural networks with transfer learning in millet crop images. Comput Ind 108:115–120. https://doi.org/10.1016/j.compind.2019.02.003
Kaya A, Keceli AS, Catal C et al (2019) Analysis of transfer learning for deep neural network based plant classification models. Comput Electron Agric 158:20–29. https://doi.org/10.1016/j.compag.2019.01.041
Zhu H, Yang L, Fei J et al (2021) Recognition of carrot appearance quality based on deep feature and support vector machine. Comput Electron Agric 186:106185. https://doi.org/10.1016/j.compag.2021.106185
Arora M, Mangipudi P, Dutta MK (2021) Deep learning neural networks for acrylamide identification in potato chips using transfer learning approach. J Ambient Intell Humaniz Comput 12:10601–10614. https://doi.org/10.1007/s12652-020-02867-2
Guo Y, Jiao B, Tan Y et al (2022) A transfer weighted extreme learning machine for imbalanced classification. Int J Intell Syst 37:7685–7705
Guo Y, Chen G, Jiang M et al (2022) A Knowledge guided transfer strategy for evolutionary dynamic multiobjective optimization. IEEE Trans Evol Comput. https://doi.org/10.1109/TEVC.2022.3222844
Liu Z, Lin Y, Cao Y, et al (2021) Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/cvf international conference on computer vision. pp 10012–10022
Khan S, Naseer M, Hayat M et al (2022) Transformers in vision: a survey. ACM Comput Surv. https://doi.org/10.1145/3505244
Zheng Y, Jiang W (2022) Evaluation of vision transformers for traffic sign classification. Wirel Commun Mob Comput 2022:1–14. https://doi.org/10.1155/2022/3041117
Thai HT, Le KH, Nguyen NLT (2023) FormerLeaf: an efficient vision transformer for cassava leaf disease detection. Comput Electron Agric 204:107518. https://doi.org/10.1016/j.compag.2022.107518
Moutik O, Sekkat H, Tigani S et al (2023) Convolutional neural networks or vision transformers: who will win the race for action recognitions in visual data? Sensors 23:734. https://doi.org/10.3390/s23020734
Bao H, Dong L, Piao S, Wei F (2021) BEiT: BERT Pre-Training of Image Transformers. arXiv Prepr arXiv210608254
Atito S, Awais M, Kittler J (2021) Sit: Self-supervised vision transformer. arXiv Prepr arXiv210403602
Wang H, Pu G, Chen T (2022) A lip reading method based on 3D convolutional vision transformer. IEEE Access 10:77205–77212
Freund Y, Schapire RE (1999) A short introduction to boosting. J Japanese Soc Artif Intell 14:771–780
Freund Y, Schapire RE (1997) A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci 55:119–139. https://doi.org/10.1006/jcss.1997.1504
Freund Y, Schapire RE (1996) Experiments with a new boosting algorithm. In: Proceeding of international conference on machine learning. pp 148–156
Schapire RE (1997) Using output codes to boost multiclass learning problems. In: Proceeding of international conference on machine learning. pp 313–321
Dietterich TG, Bakiri G (1995) Solving multiclass learning problems via error-correcting output codes. J Artif Intell Res 2:263–286. https://doi.org/10.1613/jair.105
Hastie T, Rosset S, Zhu J, Zou H (2009) Multi-class AdaBoost. Stat. Interface 2:349–360. https://doi.org/10.4310/sii.2009.v2.n3.a8
Dai W, Yang Q, Xue G-R, Yu Y (2007) Boosting for transfer learning. In: Proceedings of the 24th international conference on machine learning, ICML '07. ACM Press, New York, New York, USA, pp 193–200
Li N, Hao H, Gu Q et al (2017) A transfer learning method for automatic identification of sandstone microscopic images. Comput Geosci 103:111–121. https://doi.org/10.1016/j.cageo.2017.03.007
Bishop CM (2006) Pattern recognition and machine learning. Springer
Sethy PK, Barpanda NK, Rath AK, Behera SK (2020) Deep feature based rice leaf disease identification using support vector machine. Comput Electron Agric 175:105527. https://doi.org/10.1016/j.compag.2020.105527
Bevers N, Sikora EJ, Hardy NB (2022) Soybean disease identification using original field images and transfer learning with convolutional neural networks. Comput Electron Agric 203:107449. https://doi.org/10.1016/j.compag.2022.107449
Neyshabur B, Sedghi H, Zhang C (2020) What is being transferred in transfer learning? In: Larochelle H, Ranzato M, Hadsell R et al (eds) Advances in neural information processing systems, vol 33. Curran Associates Inc., pp 512–523
Cai L, Gu J, Ma J, Jin Z (2019) Probabilistic wind power forecasting approach via instance-based transfer learning embedded gradient boosting decision trees. Energies. https://doi.org/10.3390/en12010159
Dai M, Wang S, Zheng D et al (2019) Domain transfer multiple kernel boosting for classification of EEG motor imagery signals. IEEE Access 7:49951–49960. https://doi.org/10.1109/ACCESS.2019.2908851
Marcelino P, de Lurdes AM, Fortunato E, Gomes MC (2020) Transfer learning for pavement performance prediction. Int J Pavement Res Technol 13:154–167. https://doi.org/10.1007/s42947-019-0096-z
Al-Stouhi S, Reddy CK (2015) Adaptive boosting for transfer learning using dynamic updates. Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics). https://doi.org/10.1007/978-3-642-23780-5
Tharwat A (2018) Classification assessment methods. Appl Comput Inform 17:168–192. https://doi.org/10.1016/j.aci.2018.08.003
Ahmad Loti NN, Mohd Noor MR, Chang SW (2020) Integrated analysis of machine learning and deep learning in chili pest and disease identification. J Sci Food Agric. https://doi.org/10.1002/jsfa.10987
Rad AB, Eftestol T, Engan K et al (2017) ECG-Based classification of resuscitation cardiac rhythms for retrospective data analysis. IEEE Trans Biomed Eng 64:2411–2418. https://doi.org/10.1109/TBME.2017.2688380
Li Y, Yang J (2020) Few-shot cotton pest recognition and terminal realization. Comput Electron Agric 169:105240. https://doi.org/10.1016/j.compag.2020.105240
Duong LT, Nguyen PT, Di Sipio C, Di Ruscio D (2020) Automated fruit recognition using EfficientNet and MixNet. Comput Electron Agric 171:105326. https://doi.org/10.1016/j.compag.2020.105326
Paszke A, Gross S, Massa F et al (2019) PyTorch: an imperative style, high-performance deep learning library
Sandler M, Howard A, Zhu M et al (2018) MobileNetV2: inverted residuals and linear bottlenecks. Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit. https://doi.org/10.1109/CVPR.2018.00474
Howard A, Wang W, Chu G et al (2019) Searching for MobileNetV3. In: International conference on computer vision. pp 1314–1324
Tan M, Chen B, Pang R et al (2019) Mnasnet: Platform-aware neural architecture search for mobile. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR). pp 2820–2828
Ma N, Zhang X, Zheng HT, Sun J (2018) ShuffleNet V2: practical guidelines for efficient CNN architecture design. In: Proceedings of the European conference on computer vision (ECCV). pp 116–131
Zhang C, Wang J, Lu X et al (2020) Recognition of types and grades of tea products based on image color and texture features. China Tea Process 2:5–11. https://doi.org/10.15905/j.cnki.33-1157/ts.2020.02.001
He H, Khoshelham K, Fraser C (2020) A multiclass TrAdaBoost transfer learning algorithm for the classification of mobile lidar data. ISPRS J Photogramm Remote Sens 166:118–127. https://doi.org/10.1016/j.isprsjprs.2020.05.010
Xu J, Yao L, Li L (2015) Argumentation based joint learning: a novel ensemble learning approach. PLoS ONE 10:e0127281. https://doi.org/10.1371/journal.pone.0127281
Liu X, Liu Z, Wang G et al (2017) Ensemble transfer learning algorithm. IEEE Access 6:2389–2396. https://doi.org/10.1109/ACCESS.2017.2782884
Li G, Yu Y (2015) Visual saliency based on multiscale deep features. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). pp 5455–5463
Vapnik VN (1999) An overview of statistical learning theory. IEEE Trans Neural Networks 10:988–999
Funding
This study was supported by the Key R&D Program of Zhejiang Province (grant 2017C02007, Jin Wang) and the Robotics Institute of Zhejiang University (grant K11811, Jin Wang).
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the research work reported in this paper.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhang, C., Wang, J., Yan, T. et al. An instance-based deep transfer learning method for quality identification of Longjing tea from multiple geographical origins. Complex Intell. Syst. 9, 3409–3428 (2023). https://doi.org/10.1007/s40747-023-01024-4
DOI: https://doi.org/10.1007/s40747-023-01024-4