1 Introduction

Agriculture remains a cornerstone of the global economy, providing food and raw materials for billions of people. The identification of diseases in plant leaves is crucial for agricultural research as it directly impacts crop yield, quality, and overall food security. Leaf diseases, in particular, pose significant threats to plant health, leading to substantial losses in agricultural productivity. If these diseases are identified promptly and precisely, early measures can be taken to reduce their impact on crops. In addition, efficient disease control techniques can support the agriculture industry and provide a consistent food supply for a growing population.

The major challenge in agricultural disease management is to develop robust and accurate models capable of identifying diseases across various crops using minimal training data. The imbalance in datasets often leads to models that are overfitted to well-represented classes, resulting in poor generalization to new or less-represented ones. This problem is particularly relevant to plant disease categorization, as datasets may not contain enough instances of certain plant species or illnesses.

In this research, we utilized the Plant Village dataset, which includes a comprehensive collection of images of both healthy and diseased leaves from various plant species, including tomato, potato, and bell pepper as shown in Fig. 1. The goal was to simulate real-world scenarios where models trained on one crop may be applied to identify diseases in another. By focusing on how well models generalize from one type of plant to another, we aim to address the challenge of dataset imbalance and improve the flexibility and effectiveness of plant disease detection.

Fig. 1 Zero-shot learning: source and target domain classes

Traditional methods of disease identification typically involve visual inspection by experts or laboratory testing. While these methods can be effective, they are often time-consuming, laborious, and require significant expertise, which makes them difficult to scale across large agricultural regions and the wide variety of plant species and disease types. In response to these limitations, machine learning algorithms such as K-Nearest Neighbors (KNN) (Vaishnnave et al. 2019; Gurunathan et al. 2023; Hossain et al. 2019), k-means clustering (Rohilla and Rai 2022), and Support Vector Machines (SVMs) (Es-saady et al. 2016; Zhao et al. 2010; Chanda et al. 2021) have emerged as viable alternatives for disease classification. Random Forest classifiers (Tandekar and Dongre 2023; Saha et al. 2021) are also highly effective for improved classification. These algorithms leverage computational techniques to analyze data extracted from plant samples, such as images or spectral data, to automatically detect and classify diseases.

Convolutional Neural Networks (CNNs), a type of deep learning method, have significantly advanced plant disease identification by analyzing leaf images for detection and classification (Harshavardhan et al. 2023; Shivaprasad and Wadhawan 2023). By training on large labeled datasets, CNNs can accurately identify various leaf diseases across different plant species and environmental conditions. The method makes use of CNNs’ ability to identify complex patterns and features, enabling highly accurate disease classification.

CNNs can automate the disease identification process, resulting in quick and accurate diagnoses and reducing the need for expert intervention (Thotad et al. 2023; Singh et al. 2023; Lakshmanarao et al. 2021). Early detection enables prompt intervention, which helps minimize crop loss and stop the spread of diseases. As a result, CNN-based real-time monitoring systems can scan crops continuously and notify farmers, leading to more sustainable and efficient agricultural practices.

Transfer learning has become a key method for plant disease identification, using models pre-trained on large datasets to improve performance with limited data. This approach involves taking knowledge from models trained on large-scale datasets, such as ImageNet, and adapting them to the specific task of plant disease detection (Kollem et al. 2024; Chellapandi et al. 2021; Pavan et al. 2023; Gopi and Kondaveeti 2023). It allows accurate disease identification with a much smaller number of labeled images, drastically reducing the time and resources required to build effective models and bypassing the difficult, time-consuming process of collecting and annotating large datasets.

In recent years advancements in machine learning and computer vision have introduced innovative solutions to this challenge. Zero-Shot Learning (ZSL), a transfer learning technique that enables models to identify previously unseen classes without direct training data, is one potential approach. ZSL learns from seen classes and predicts unseen classes by leveraging semantic information, such as descriptions or attributes, to infer and generalize knowledge from known classes to new, unseen classes (Zhang and Zhang 2024; Lv et al. 2020; Zhen et al. 2023; Wang et al. 2018; Fu et al. 2018; Frome et al. 2013). This capability makes ZSL particularly well-suited for leaf disease classification, where new disease variants or previously unrecorded diseases frequently emerge.

Additionally, Few-Shot Learning (FSL) is another technique that has gained attention. FSL aims to train models using a very small number of examples per class, addressing situations where data is scarce or expensive to obtain. FSL (Wang and Wang 2021; Lin et al. 2022) leverages prior knowledge from related tasks or domains to generalize well from minimal examples, often using methods like meta-learning, where models learn to adapt quickly to new tasks, or transfer learning, where knowledge from a pre-trained model is fine-tuned on the few available examples. Both ZSL (Balavani et al. 2023; Singh and Sanodiya 2023; Fang et al. 2024) and FSL (Dedhia et al. 2022; Kumar et al. 2023; Nuthalapati and Tunga 2021) offer valuable solutions in scenarios where traditional machine learning methods struggle due to the lack of sufficient labeled data.

Embeddings refer to the process of mapping data from different domains into a meaningful feature space. In transfer learning, embeddings represent semantic attributes as dense vectors, facilitating the model’s ability to generalize across classes by mapping similar concepts closer together in the vector space. This approach helps models infer unseen classes by leveraging semantic similarities. Existing methods often use embeddings to encode semantic attributes, but they may not fully capture the complex relationships between attributes and visual features. Our approach improves on this by concatenating semantic attributes with feature vectors, leading to richer and more descriptive representations. This method enhances generalization and accuracy where traditional embedding methods fall short.

By incorporating Zero-Shot Learning (ZSL) into our framework, we address the issue of dataset imbalance and improve the model’s ability to generalize across various plant species and disease types. For instance, when the model is trained on healthy tomato leaves, it learns broad characteristics of leaf health, such as texture, color, and shape, that extend beyond a specific plant species. This enables the model to predict the health of potato leaves even without direct exposure to them during training. Similarly, the model’s knowledge of late blight on tomatoes can be generalized to detect similar symptoms in other plants, like potatoes. ZSL leverages semantic attributes to transfer knowledge from seen to unseen classes, effectively mitigating the impact of dataset imbalance and enhancing disease management across different plant species.

This research paper explores the application of Zero-Shot Learning in the context of leaf disease classification. By combining semantic attributes, which are pieces of information, with features for classification, we investigate how ZSL can be used to enhance the accuracy and efficiency of disease detection in plants, even when dealing with limited or no labeled data for certain diseases. ZSL presents a viable substitute for conventional supervised learning techniques, with the potential to revolutionize the way in which farmers and agricultural experts manage plant health.

In this work, we aim to address the following key objectives:

1. Develop a Zero-Shot Learning (ZSL) framework for plant leaf disease identification that leverages both semantic attributes and visual features.

2. Enhance the generalization capabilities of the ZSL model to accurately classify unseen plant leaf diseases by integrating and optimizing the use of semantic and visual features, utilizing different pre-trained models to extract robust feature representations.

3. Investigate the impact of various pre-trained models on the accuracy and generalization ability of the ZSL approach, identifying the most effective models for plant disease classification tasks.

2 Related work

Here, we examine related works on plant disease classification, specifically focusing on methods that align with our proposed framework. We explore various methodologies for feature extraction, emphasizing those designed to address the limitations of disease classification.

2.1 Traditional machine learning approaches

For instance, Garg et al. (2022) utilized a K-Nearest Neighbors (KNN) classifier for the automatic detection of plant diseases, achieving notable accuracy. The KNN classifier, a straightforward yet powerful machine learning algorithm, operates by comparing new, unseen data points to the most similar points in the training dataset, identified as the ’k’ nearest neighbors. This method is effective for plant disease identification because of its ease of use and its capacity to handle multi-class classification problems.

Support Vector Machines (SVMs) have also proven to perform exceptionally well in automatic disease detection in plant leaves, demonstrating robustness and high accuracy across various datasets and disease types. Ramanathan et al. (2023) employed a novel approach using Butterfly Optimization (BO) for feature extraction in conjunction with SVMs. Kumar et al. (2020) proposed a framework that combines k-means segmentation with multi-class SVM-based classification in four stages, achieving an accuracy of 95.7%.

2.1.1 Deep learning approaches

Deep learning has emerged as a pivotal technology in the field of plant disease detection and diagnosis. Its ability to automatically extract and learn features from large datasets makes it particularly well-suited for analyzing complex patterns in plant health. Sarkar et al. (2023) suggested a strategy that uses DenseNets with SVMs to combine the advantages of deep learning and traditional machine learning methods. This hybrid model is a promising approach for plant disease detection and classification since it increases both accuracy and generalization.

In Belmir et al. (2023), Belmir et al. proposed a deep CNN model for plant disease classification using the PlantVillage dataset. This model achieved a training accuracy of 98.01% and a test accuracy of 94.33%, showing its efficiency in identifying and categorizing leaf diseases early on, which is essential for maintaining agricultural production. Joseph et al. (2023) proposed fusing a deep CNN with Local Binary Pattern (LBP) techniques for enhanced image analysis and feature extraction, and also added a predictive analysis component that flags possible infections. Militante et al. (2019) applied deep learning models and achieved an accuracy of 96.5%.

Showrav et al. (2022) proposed a two-stage classification scheme for plant disease detection, addressing the issue of species-specific symptoms. First, it detects the plant species; second, it uses efficient CNN architectures such as EfficientNetB3 and NASNetLarge with transfer learning to identify diseases specific to those species. Tested on the Plant Village and IPM datasets, this strategy performs more effectively than traditional single-stage techniques. Bakshi and Goel (2023) performed a computational analysis of multi-class plant disease diagnosis using a logistic regression classifier on three models, ResNet-50, DenseNet-161, and Inception-V3, of which ResNet-50 performed best with an accuracy of 96.9%.

Guowei Dai et al. proposed a novel deep learning model (PPLC) (Dai et al. 2023) that integrates dilated convolutions, multi-head attention, and Global Average Pooling (GAP) layers. Additionally, the CBAM attention mechanism is incorporated into the middle layer to enhance the model’s information representation. This model achieved an accuracy of 99.702% and an F1 score of 98.442%. In Pal and Kumar (2023), the authors propose the AgriDet framework, combining the Inception-Visual Geometry Group Network (INC-VGGN) and Kohonen-based deep learning for plant disease detection. The framework addresses issues like occlusion and overfitting, utilizing advanced image pre-processing and a pre-trained INC-VGGN model, achieving superior accuracy and sensitivity.

Dai et al. (2024) proposed the DFN-PSAN network, which combines the YOLOv5 backbone with pyramidal squeezed attention (PSA) for effective plant disease classification in natural field environments. DFN-PSAN achieves over 95.27% accuracy and F1-score, while the PSA mechanism reduces model parameters by 26%. Additionally, t-SNE with SHAP interpretable methods enhances the transparency of the model’s attention features.

2.1.2 Transfer learning and domain adaptation approaches

Transfer learning has proven instrumental in leveraging pre-existing knowledge from one domain to enhance learning and performance in another. Degadwala et al. (2023) proposed a pioneering approach that uses transfer learning with state-of-the-art CNNs for the classification of hop plant diseases, highlighting how pre-trained models can improve the efficiency and accuracy of agricultural disease detection. Tumpa and Halder (2023) compared six pretrained models, including VGG16, ResNet50, DenseNet121, MobileNetV2, Xception, and InceptionV3; in this research, the Xception model outperformed the others with an accuracy of 98.41% and a loss of 0.079.

Tunio et al. (2021) proposed a hybrid deep CNN transfer learning approach using rice plant images, developing a deep learning model on the Rice Leaf Dataset from a secondary source and achieving an accuracy of 90.8%. Rani and Gowrishankar (2023) used the Agri-ImageNet dataset to evaluate 38 deep transfer learning models; among these, EfficientNetV2B2 and EfficientNetV2B3 achieved the highest accuracies, with 90.3% and 92.4% respectively. Anand et al. (2023) applied transfer learning to four distinct models, AlexNet, VGG16, MobileNetV2, and InceptionV3, with InceptionV3 outperforming the others with an accuracy of 0.92 and a precision of 0.84.

Rahim et al. (2023) leveraged CNNs with transfer learning, using pretrained models such as VGG16, Inception-V3, VGG19, and ResNet-50, alongside traditional methods including KNN, SVM, AdaBoost, Decision Tree, and Random Forest. In this work, VGG-19 achieved an accuracy of 98% and Random Forest 96.6%, exceeding the other models.

Table 1 Comparison work on different methods with various models

Table 1 provides a comparison of various methods and models used in plant disease classification, highlighting their datasets, models, testing on unseen classes, and the integration of semantic attributes. A comparative analysis is conducted using models like VGG16, ResNet50, EfficientNet, AlexNet, Inception v3, YOLO v3, YOLO v4, and MobileNetv3. Our proposed method stands out by uniquely integrating semantic attributes, which enhances the model’s ability to generalize and perform accurately on unseen classes.

2.1.3 Zero-shot transfer learning approaches

Romera-Paredes and Torr (2015) introduced a streamlined zero-shot learning method using a single-line implementation. This approach involves training with a signature matrix to derive a V matrix, enabling inference during testing using semantic attributes. In the paper (Han et al. 2024), Han et al. introduced a dual relation mining network featuring a dual attention block designed for visual semantic relationship extraction. Additionally, they employed Semantic Interaction Transformer (SIT) for enhancing attribute representations across images, thereby improving generalization capabilities.

However, semantic attribute models do not capture the intricate manner in which humans perceive and recognize elements within images. To address this limitation, researchers have proposed using human gaze data as auxiliary information for zero-shot image classification. Gaze estimation predicts where humans direct their attention in an image, offering valuable insights for attribute localization. Liu et al. (2021) introduced a gaze estimation framework consisting of three modules: Attention Module, Attribute Localization, and Attention Transition. In this framework, task-dependent attention is learned using the goal-oriented GEM, while global image features are concurrently optimized through the regression of local attribute features. Karessli et al. (2017) introduced three gaze embeddings: Gaze Histograms (GH), Gaze Features with Grid (GFG), and Gaze Features with Sequence (GFS). That paper also presents a key equation pertaining to the Structured Joint Embedding (SJE) model for zero-shot learning.

Han et al. (2021) proposed a contrastive embedding (CE) approach for their hybrid GZSL framework, which leverages both class-wise and instance-wise supervision. Traditional ZSL approaches often encounter domain bias issues, where generated features for unseen classes may not accurately reflect their true distributions. To address this, Liu et al. (2022) proposed a VAE-based framework called Joint Attentive Region Embedding with Enhanced Semantics (AREES). This framework is designed to improve zero-shot recognition by simultaneously optimizing feature extraction and feature generation, enhancing the alignment between visual and semantic features.

Using graph embeddings, Naeem et al. (2021) suggested compositional zero-shot learning with the goal of identifying previously unknown combinations of states and objects based on visual primitives observed during training. The method makes use of the relationships that exist between states, objects, and their compositions inside a graph structure to help transfer information from seen compositions to unseen ones. Liu et al. (2021) introduced an iterative co-training framework comprising two distinct base ZSL models and an exchanging module. Additionally, it features a semantic-guided OOD detector designed to identify the most likely unseen-class samples before class-level classification, addressing the bias problem in GZSL.

3 Methodology

This methodology leverages the strengths of both state-of-the-art CNN models and expert-defined semantic attributes to enhance the accuracy and interpretability of plant disease classification. By combining high-level features with domain-specific knowledge, our approach aims to provide a robust and reliable solution for early disease detection in plants.

Table 2 State of the Art CNN Models

3.1 Feature extraction using pretrained models

In this work, we used different pretrained models for feature extraction. Each pretrained model, chosen for its high performance on related tasks, is fine-tuned to suit the specific needs of plant disease classification (Fig. 2). The steps involved are:

Fig. 2 Our proposed architecture

1. Model selection: In the field of plant disease classification, where obtaining a large amount of labeled data can be challenging, transfer learning with pre-trained Convolutional Neural Networks (CNNs) is vital. Utilizing models pre-trained on extensive datasets like ImageNet allows for more effective classification, even with limited data. To cover a range of computing and deployment requirements, we have chosen a wide range of models. For resource-constrained environments, lightweight models like MobileNetV2 and ShuffleNetV2 are ideal because of their efficiency and low computational demands. In contrast, when computational resources are abundant, more complex models like VGG19, ResNet50, and Vision Transformer (ViT-B16) excel by capturing intricate patterns in the data, leading to enhanced classification accuracy. Additionally, models like DenseNet121, EfficientNet, and GoogleNet offer a balanced approach for scenarios with moderate resource constraints, combining efficiency with strong performance. By utilizing a diverse range of pre-trained models, from lightweight to complex, we ensure a flexible and comprehensive approach to plant disease classification. This allows us to compare their performance and identify the best model for various resource scenarios, ensuring high accuracy and effectiveness. A detailed description of every selected model is provided in Table 2.

2. Image preprocessing: The plant images were preprocessed to match the input requirements of the selected pre-trained models. This involved resizing the images to the specific dimensions required by each model, normalizing pixel values, and applying any necessary augmentations to enhance the dataset’s diversity and robustness.

3. Feature extraction: The pre-processed images were fed into the selected pre-trained models, and features were extracted from the final pooling or fully connected layers, depending on the architecture. These features, which capture high-level representations of the images, served as the initial input for our classification model. Specifically, the last fully connected layers, which are used for classification in the pre-trained models, were removed to obtain these features.
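The following is a minimal PyTorch sketch of the preprocessing and feature extraction steps, assuming a ResNet18 backbone with standard ImageNet normalization; the function and variable names are illustrative rather than the exact implementation used in our experiments.

```python
import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

# Standard ImageNet preprocessing (resize, tensor conversion, normalization)
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Load a pretrained backbone and drop its final fully connected layer
backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
backbone.fc = nn.Identity()          # keep the 512-d pooled features
backbone.eval()

def extract_features(image_path: str) -> torch.Tensor:
    """Return the 512-dimensional feature vector for a single leaf image."""
    img = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return backbone(img).squeeze(0)   # shape: (512,)
```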

3.2 Incorporation of semantic attributes

1. Semantic attribute definition: Semantic attributes relevant to plant diseases were defined in collaboration with domain experts. These attributes included characteristics such as leaf color, shape, texture, presence of spots, and other visually distinguishable symptoms of diseases.

2. Selection of semantic attributes: The semantic attributes for plant disease classification were selected based on their biological significance and their direct correlation with visible symptoms of plant diseases. Each attribute provides insight into the health of the plant and the nature of the disease. For instance, attributes like elliptical shape and lanceolate shape help identify specific types of infections by showing particular patterns on the plant leaves. Green color and its deviations (e.g., yellowing, brown color) indicate plant health and potential infections. Discoloration and deformation degree signal early stress or disease progression. Concentric rings and rust texture are signs of specific fungal or bacterial infections, while smaller and larger spots indicate the stage of infection. Analyzing these attributes helps in accurate disease diagnosis and intervention. Table 3 details the significance of each selected attribute.

3. Annotation process: The Plant Village dataset was annotated with these semantic attributes. Each plant image was manually labeled with the corresponding attributes by trained annotators, ensuring high-quality and accurate annotations. This includes detailed annotations based on a set of binary semantic attributes: elliptical shape, lanceolate shape, ovate shape, green color, brown color, discoloration, deformation degree, size reduction, concentric rings, rust texture, smaller spots, larger spots, and yellowing. For instance, images of a Tomato Late Blight leaf and a Potato Early Blight leaf might be annotated as shown in Fig. 3.

4. Attribute vector creation: For each image, a vector of semantic attributes was created. This vector represented the presence or absence (or degree) of each defined attribute, resulting in a comprehensive attribute representation for each image.

Table 3 Detailed Significance of selected Semantic Attributes
Fig. 3 Each attribute is represented as either 0 (absent) or 1 (present), providing a detailed characterization of plant leaf features and conditions in the dataset

3.3 Feature concatenation

The feature vectors extracted from the pre-trained models were merged with the semantic attribute vectors. This combined vector incorporated both the deep learning features and the human-defined semantic attributes, offering a more comprehensive representation of each image. The concatenated feature vectors served as input to the Combined Classifier, a neural network model specifically designed for this task which contains two fully connected layers.
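A minimal sketch of this combined classifier is given below, assuming 512-dimensional ResNet18 features and the 13 binary attributes listed in Sect. 3.2; the hidden-layer size and the example tensors are placeholders, not the exact configuration used in our experiments.

```python
import torch
import torch.nn as nn

# Attribute order follows the 13 binary attributes defined in Sect. 3.2
ATTRIBUTE_NAMES = [
    "elliptical_shape", "lanceolate_shape", "ovate_shape", "green_color",
    "brown_color", "discoloration", "deformation_degree", "size_reduction",
    "concentric_rings", "rust_texture", "smaller_spots", "larger_spots", "yellowing",
]

class CombinedClassifier(nn.Module):
    """Two fully connected layers on top of [CNN features ; semantic attributes]."""
    def __init__(self, feature_dim=512, attr_dim=13, hidden_dim=256, num_classes=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim + attr_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, features: torch.Tensor, attributes: torch.Tensor) -> torch.Tensor:
        combined = torch.cat([features, attributes], dim=1)   # concatenate along the feature axis
        return self.net(combined)

# Usage: a batch of 8 feature vectors paired with their binary attribute vectors
features = torch.randn(8, 512)
attributes = torch.randint(0, 2, (8, 13)).float()
logits = CombinedClassifier()(features, attributes)           # shape: (8, 3)
```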

3.4 Formulation of loss function

Here, we are using cross-entropy loss to quantify the dissimilarity between the predicted class probabilities and the actual class labels. Given N as the number of samples and C as the number of classes, the loss is computed as:

$$\begin{aligned} \mathcal {L}(\textbf{y}, \mathbf {\hat{y}}) = -\frac{1}{N} \sum _{i=1}^{N} \sum _{j=1}^{C} y_{ij} \log (\hat{y}_{ij}) \end{aligned}$$
(1)

where \(y_{ij}\) is the actual class label for sample i and class j (1 if the sample belongs to class j, 0 otherwise), and \(\hat{y}_{ij}\) is the predicted probability that sample i belongs to class j.

The cross-entropy loss function calculates the negative log likelihood of the true labels given the predicted probabilities. A lower cross-entropy loss indicates that the predicted probabilities are closer to the true labels, meaning the model is performing better.

3.5 Training the model

In this section, we detail the implementation of our framework using Stochastic Gradient Descent (SGD) for training. The objective is to optimize the neural network’s weights by minimizing the loss function, which measures the disparity between predicted and actual class labels. This process involves updating the parameters \(\Theta\) iteratively via backpropagation.

$$\begin{aligned} \Theta ^{t+1} = \Theta ^{t} - \eta \frac{\partial \mathcal {L}}{\partial \Theta ^{t}} \end{aligned}$$
(2)

where \(\eta\) represents the learning rate.
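A minimal training-loop sketch implementing this update with the cross-entropy loss of Eq. (1) is shown below; the data loader, device, and hyperparameters are placeholders, and the optimizer choice is illustrative (the experiments in Sect. 4 also report the use of the Adam optimizer).

```python
import torch
import torch.nn as nn

def train(model, loader, epochs=100, lr=0.001, device="cuda"):
    """Sketch of training: each step applies the update of Eq. (2) to the parameters."""
    model.to(device)
    criterion = nn.CrossEntropyLoss()                      # cross-entropy loss of Eq. (1)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    for epoch in range(epochs):
        for features, attributes, labels in loader:        # loader yields (features, attributes, labels)
            features = features.to(device)
            attributes = attributes.to(device)
            labels = labels.to(device)
            optimizer.zero_grad()
            logits = model(features, attributes)
            loss = criterion(logits, labels)
            loss.backward()                                # backpropagate dL/dTheta
            optimizer.step()                               # Theta <- Theta - lr * dL/dTheta
    return model
```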

4 Experiments

4.1 Setup

Our experiment utilized the PyTorch framework, employing pretrained models from the PyTorch library for effective feature extraction and streamlined training. The experiments were conducted using a system equipped with NVIDIA GPUs and CUDA acceleration, ensuring sufficient computational power and memory to effectively manage the complexities of our models and the large datasets involved in our research.

4.1.1 Dataset

For this experiment, we utilized a subset of the Plant Village dataset, which contains around 20,600 images categorized into 15 classes. These classes encompass a variety of plant leaves, including those from potato, tomato, and pepper bell, covering both diseased and healthy specimens, as summarized in Table 4.

Table 4 Summary of the selected subset from the Plant village dataset

4.1.2 Baseline method

The baseline method for this work involves extracting features from a pretrained model and classifying them without concatenating semantic attributes. In this method, we used two data splits, one with three classes and one with four. In the first split, we trained on tomato leaves and tested on potato leaves with the classes Healthy, Late Blight, and Early Blight. In the second split, we additionally included the Bacterial Spot class of pepper bell for training and evaluated accordingly.

4.1.3 Implementation details

The experiment was implemented using frameworks and libraries such as PyTorch. The pretrained models used in the experiments are VGG16, ResNet50, ViT-B/16, VGG19, ResNet18, AlexNet, GoogLeNet, DenseNet121, EfficientNet-B0, MNASNet1.0, MobileNetV2, and ShuffleNetV2 x0.5.

During the implementation, we concatenated semantic attributes along with features extracted by the pre-trained models to help the model learn more effectively. This integration aimed to leverage both the rich feature representations from the pre-trained models and the additional contextual information provided by the semantic attributes. By doing so, we sought to enhance the model’s performance and accuracy in classifying the unseen potato classes.

We conducted experiments using two different data splits: one with three classes and the other with four classes. For the first data split, which included three classes (healthy, early blight, and late blight), we trained the model on tomato leaves and tested it on potato leaves, and vice versa: training on potato leaves and testing on tomato leaves. We repeated the experiment with a second data split that included four classes by adding another class: bacterial spot.
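Assuming the Plant Village images are organized in ImageFolder-style class directories, a source/target split of this kind can be sketched as follows; the folder names and dataset path are hypothetical and must be adapted to the actual dataset layout.

```python
from torchvision import datasets, transforms

tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

# Hypothetical Plant Village folder names; adjust to the actual directory structure
SOURCE = ["Tomato_healthy", "Tomato_Early_blight", "Tomato_Late_blight"]
TARGET = ["Potato_healthy", "Potato_Early_blight", "Potato_Late_blight"]

def domain_subset(root, class_folders, transform):
    """ImageFolder restricted to one domain, relabeled 0..K-1 in the given order."""
    ds = datasets.ImageFolder(root, transform=transform)
    remap = {ds.class_to_idx[c]: new for new, c in enumerate(class_folders)}
    ds.samples = [(path, remap[t]) for path, t in ds.samples if t in remap]
    return ds

# Seen (source) classes for training, unseen (target) classes for testing;
# matching positions in SOURCE/TARGET share a label (healthy, early blight, late blight)
train_ds = domain_subset("plantvillage/", SOURCE, tfm)
test_ds = domain_subset("plantvillage/", TARGET, tfm)
```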

4.1.4 Parameters

The training was conducted using the Adam optimizer with a learning rate of 0.001, a batch size of 32, and over 100 epochs. The primary evaluation metric was classification accuracy on the unseen classes, supplemented by cross-entropy loss.

For semantic information, we defined a 13-dimensional attribute vector for each class, with example attribute vectors including [0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0] for Tomato Healthy and [0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1] for Potato Late Blight. We also reduced the number of semantic attributes to evaluate the model’s performance with different attribute sets.

In addition to evaluating with the full set of attributes, we also performed experiments with reduced attribute sets of 9 and 7 attributes. This approach helps us assess how varying the number of semantic attributes affects the model’s performance. By comparing results with different attribute configurations, we can better understand the trade-offs between model complexity and accuracy, and determine the optimal number of attributes for balancing semantic richness and computational efficiency.
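A hypothetical illustration of such attribute reduction is sketched below, dropping columns of the class-attribute matrix (for example an attribute shared by nearly all classes, such as green color); the dropped indices are illustrative and do not correspond to the exact 9- and 7-attribute sets used in the experiments.

```python
import numpy as np

# 13-dimensional class-attribute matrix (rows: classes, columns: attributes)
class_attributes = np.array([
    [0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],   # Tomato Healthy (example vector from Sect. 4.1.4)
    [0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1],   # Potato Late Blight (example vector from Sect. 4.1.4)
])

# Hypothetical indices of attributes to drop (e.g. green color, shared by nearly all classes)
drop = [0, 3, 7, 8]
reduced = np.delete(class_attributes, drop, axis=1)   # 13 -> 9 attributes
print(reduced.shape)                                   # (2, 9)
```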

4.2 Experimental results

In this section, we present the results of our framework (Zero-Shot Transfer Learning), using various pre-trained models to compare their performance with and without the inclusion of semantic attributes (SA). Our objective is to evaluate the impact of integrating semantic attributes on classification accuracy and other relevant metrics.

For the first experiment, we used three classes: Healthy, Early Blight, and Late Blight. We trained the models on Tomato leaves and tested them on Potato leaves, and then reversed the process by training on Potato leaves and testing on Tomato leaves. The results of both experiments are presented in Table 5.

Table 5 Accuracies of state-of-the-art models in the ’Tomato vs Potato’ and ’Potato vs Tomato’ experiments with 3 classes: healthy, early blight, late blight
Table 6 Accuracies of state-of-the-art models in the ’Tomato vs Potato & Pepper Bell’ and ’Potato & Pepper Bell vs Tomato’ experiments with 4 classes: healthy, early blight, late blight and bacterial spot

For the second experiment, we employed four classes: Healthy, Early Blight, Late Blight, and Bacterial Spot. We initially trained the models on Tomato leaves and tested them on Potato and Pepper Bell leaves. Subsequently, we reversed the experiment by training on Potato and Pepper Bell leaves and testing on Tomato leaves. Both results are summarized in Table 6.

4.3 Experimental analysis

In our work, we conducted experiments across different source and target domains, performing classifications with and without semantic attributes (SA) across various numbers of classes. The results clearly demonstrate that incorporating semantic attributes consistently enhances model performance.

Table 5 illustrates the model performance for three-class classifications (Tomato vs Potato and Potato vs Tomato) when semantic attributes are included. In the table, “SA” refers to Semantic Attributes and “w/o SA” refers to without Semantic Attributes.

For instance, over all epochs, models such as ResNet18 and ResNet50 show significant accuracy increases when utilizing semantic attributes in both the Tomato vs Potato experiment and the reverse scenario. At 100 epochs, ResNet18 achieves an accuracy of 99.95% with SA compared to 62.73% without SA, and GoogleNet reaches 99.81% with SA, up from 68.14% without SA, for Tomato vs Potato classification. These results suggest that semantic attributes significantly enhance the model’s ability to differentiate between the two plant species, leading to higher classification accuracy. This trend is observed across most models, with notable improvements in ResNet50, ViT-B16, and EfficientNet-B0. These findings suggest that deeper and more complex neural networks benefit from the addition of semantic features.

In the more complex four-class scenario (Tomato, Potato, and Pepper Bell), the inclusion of semantic attributes continued to enhance model performance as illustrated in Table 6. For example, the ViT-B16 model showed a marked improvement in accuracy with SA, achieving 69.45% at 100 epochs for the Tomato vs. Potato, Pepper Bell classification, up from 49.01% without SA. In the four-class classification, ResNet50 reached 68.11% accuracy with SA at 100 epochs compared to 45.04% without SA. Importantly, semantic attributes also greatly improved the performance of models like EfficientNet-B0 and MobileNet-V2, demonstrating their value in enhancing model adaptability and generalization.

The significant difference in effectiveness between the three-class and four-class scenarios can be attributed to several factors. Adding Pepper Bell to the classification introduces additional complexity, as the model must now generalize across three plant species rather than two. This increased complexity makes it more challenging for models to accurately classify all classes. Semantic attributes enhance performance by providing extra contextual information, but the improvement can be less pronounced due to the increased number of categories. Deeper models like ResNet18, ResNet50, and ViT-B16 benefit more from semantic attributes due to their ability to utilize additional features effectively. Class imbalance issues in the four-class scenario also impact performance, though semantic attributes help mitigate these effects.

To further understand the impact of semantic attributes on the model’s learned features, we visualized the feature space using t-SNE plots for the source domain, as shown in Fig. 4. The left plot illustrates the feature distribution before training, where there is significant overlap among the three classes, indicating poor initial separation. In contrast, the right plot shows the feature distribution after training with semantic attributes. It is evident that the classes are more distinctly clustered, with well-defined boundaries between them. This demonstrates that the incorporation of semantic attributes significantly enhances the model’s ability to learn discriminative features, leading to better class separation and reduced confusion between categories.

Fig. 4 Feature space visualization in the source domain using t-SNE: before and after training with semantic attributes

The clearer clustering of classes in the t-SNE plot after training indicates that the model effectively utilizes semantic information to distinguish between different plant species, aligning with our observations of improved classification performance in the source domain.
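The t-SNE projections in Fig. 4 can be reproduced with a sketch along the following lines, assuming a matrix of extracted feature vectors and their class labels as NumPy arrays; the perplexity and other settings are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_tsne(features, labels, title):
    """Project high-dimensional feature vectors to 2-D and color points by class."""
    emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)
    for c in np.unique(labels):
        mask = labels == c
        plt.scatter(emb[mask, 0], emb[mask, 1], s=8, label=str(c))
    plt.title(title)
    plt.legend()
    plt.show()

# e.g. plot_tsne(source_features, source_labels, "After training with semantic attributes")
```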

Fig. 5 Confusion matrices illustrating ResNet18 model performance with and without semantic attributes

The confusion matrices for these experiments further reinforce these observations. Confusion matrices reveal that models with semantic attributes exhibit fewer misclassifications, with the number of false positives and false negatives significantly reduced compared to models without semantic attributes. This indicates that semantic attributes provide valuable context that improves the model’s ability to distinguish between plant species, leading to a clearer separation of classes and a reduction in classification errors. Figure 5 illustrates the difference between confusion matrices with and without Semantic Attributes (SA) for the ResNet-18 model when classifying three classes (Table 7).
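The confusion matrices of Fig. 5 can be computed from the model predictions as sketched below; the model and loader names are placeholders for the trained combined classifier and the target-domain test loader.

```python
import torch
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

@torch.no_grad()
def predict(model, loader, device="cuda"):
    """Collect true and predicted labels over the test loader."""
    model.eval()
    trues, preds = [], []
    for features, attributes, labels in loader:
        logits = model(features.to(device), attributes.to(device))
        preds.extend(logits.argmax(dim=1).cpu().tolist())
        trues.extend(labels.tolist())
    return trues, preds

# trues, preds = predict(combined_model, test_loader)
# ConfusionMatrixDisplay(confusion_matrix(trues, preds)).plot()
```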

Table 7 Comparison of proposed approach with existing state-of-the-art methods

In the three-class classification experiments, our proposed method, which combines ResNet18 with Semantic Attributes (SA), achieves an outstanding accuracy of 99.81%, significantly outperforming all other existing methods. This result is notably higher than the next best-performing method, Inceptionv3 with Transfer Learning, which achieved an accuracy of 88.62% as reported by Morellos et al. (2022). Other methods, such as EfficientNet with Few-Shot Learning and MobileNet-V2 with Convolutional Attention (CA), recorded accuracies of 83.22% and 68.56%, respectively.

The intuition behind the proposed approach lies in its ability to enhance disease detection by integrating semantic attributes with image features from advanced models like CNNs and Vision Transformers (ViTs). By combining detailed image data with rich descriptive information about the diseases, this approach provides a more comprehensive understanding, enabling accurate classification of both known and unseen diseases.

In contrast, existing state-of-the-art methods utilize different strategies. Inceptionv3 with Transfer Learning leverages pre-trained models to utilize extensive learned features for robust classification. EfficientNet with Few-Shot Learning adapts to new diseases from limited examples, balancing model efficiency with minimal data. MobileNet-V2 with CA refines features using a lightweight model and contextual attention. While these methods excel in their specific areas, they do not integrate descriptive disease information as effectively as our approach, which results in superior accuracy.

Despite these advances, practical scenarios can still present challenges for zero-shot learning (ZSL). To address generalization issues, incorporating additional techniques such as data augmentation, class weighting, and resampling can be beneficial. These methods can help refine the model’s ability to generalize and classify unseen plant diseases with greater accuracy and reliability.

4.4 Convergence analysis

To obtain a comprehensive understanding of the convergence patterns displayed by various models across different scenarios, we utilized detailed visualizations, as shown in Fig. 6. These visualizations depict the training and testing accuracies over time. Specifically, the graphs present the number of epochs on the X-axis, while the Y-axis shows both the training and testing accuracies. By analyzing these graphs, we can observe how top models perform and converge under various conditions, providing valuable insights into their trends and behaviour.

The analysis of the provided graphs reveals distinct convergence patterns and trends for the ResNet18 and ViT-B16 models, comparing their performance using 13 semantic attributes. For ResNet18, as shown in graph (a), both the training and test accuracies converge quickly, with the training accuracy reaching 100% and the test accuracy stabilizing around 95% within the first 10 epochs. The loss for ResNet18 also decreases rapidly, approaching zero early in the training process and remaining stable thereafter, as depicted in graph (c). This indicates that ResNet18 fits the training data exceptionally well, given the near-perfect training accuracy and minimal loss.

The ViT-B16 model demonstrates a more variable convergence pattern. The training accuracy quickly reaches 100%, but the test accuracy fluctuates between 85% and 90%, with noticeable drops around epochs 60 and 80, as shown in graph (b). The training loss also decreases rapidly but exhibits significant spikes, suggesting instability during training, as seen in graph (d). These fluctuations in test accuracy and loss suggest that ViT-B16 responds dynamically to different training conditions. This highlights the potential for enhancing the model’s performance through careful tuning of parameters such as learning rate schedules and batch sizes.

Fig. 6 Line graphs of accuracy (a, b) and loss (c, d) of ResNet-18 and ViT-B16 over epochs

From Fig. 7, which compares test accuracy for 13 vs 9 vs 7 attributes, it is evident that models generally perform better with a moderate number of semantic attributes. In particular, test accuracy is highest with 9 attributes and lowest with 7 attributes for most models. For instance, models such as VGG16, ResNet50, and GoogLeNet exhibit a noticeable increase in accuracy when using 9 attributes compared to 13 or 7 attributes. This improvement suggests that removing certain common attributes, such as green color, which is present in nearly all healthy leaves, helps the models focus on more distinguishing features. As a result, models achieve better generalization and produce more accurate predictions. This trend is particularly evident in models like VGG16, DenseNet121, and ShuffleNet-v2, where the difference in accuracy between using 9 attributes and 13 attributes is quite pronounced, highlighting the importance of selecting the right attributes to improve model performance.

Fig. 7 Illustration of test accuracies of all models for three classes with different numbers of semantic attributes

The above graph provides a comparative analysis of test accuracy for different models when evaluated using 13, 9, and 7 semantic attributes.

In conclusion, the analysis indicates that selecting an optimal number of semantic attributes, such as 9, significantly enhances the performance of various deep learning models in plant disease classification. Models exhibit higher accuracy with 9 attributes compared to 13 or 7 attributes. This improvement is due to the removal of attributes that are common across all classes, such as the green color, which helps in reducing noise and redundancy. The experiment underscores the significance of selecting a balanced set of attributes to achieve the best performance in plant disease classification tasks, allowing models to extract more relevant features and make better decisions.

5 Conclusion and future works

Our paper introduces a novel approach that leverages semantic attributes by concatenating them with the features extracted by the model for classification. Adding semantic attributes to the feature set gives the classifier access to more contextual information. Through the integration of semantic features, we observed an increase in performance and accuracy. Our approach uses different pretrained models, such as ResNet, GoogLeNet, and ViT-B16, ensuring a comprehensive evaluation across architectures. Our work demonstrates that this strategy effectively enhances model performance across a variety of trials and model designs. Additionally, we provide a detailed analysis of the impact of semantic attributes on classification, highlighting their significance for model performance.

While this method of using semantic attributes provides good results, there are several other approaches for future exploration. One potential direction is integrating semantic attributes with multimodal data, combining visual and textual features to enhance model performance. Incorporating self-attention layers into pretrained models could help capture dependencies at different levels and enhance feature extraction. Semantic attributes could also be applied within few-shot learning settings. Our approach can additionally be supplemented with various data augmentation methods to generate more data and make the model robust in practical scenarios.

Advanced data augmentation methods should also be explored, as techniques like rotation, scaling, and color adjustments can produce more diverse training samples and enhance model robustness. Additionally, integrating Generative Adversarial Networks (GANs) could further improve performance by generating synthetic samples to enrich the dataset. These approaches can complement the success of semantic attributes, leading to more effective solutions in plant disease classification and precision agriculture.