1 Introduction

The annual grapevine leaf harvest provides an additional agricultural product, particularly in Turkey. Classifying grapevine leaves is important for assessing their market value and flavor characteristics [1, 2]. Different grapevine varieties exhibit distinct leaf attributes, such as shape, depth, length, featheriness, and slickness, which vary considerably [1, 3, 4]. For this reason, not every variety's leaves are used for culinary purposes. Consumers tend to avoid leaves that are thick, feathered, or slick. The preferred culinary cultivar has a slender, thin leaf without feathers, with delicate venation, and a tangy taste [1]. Hence, distinguishing consumable from non-consumable grapevine leaf varieties and identifying them from leaf and fruit images are crucial tasks in this domain. However, for individuals without specialized knowledge, discerning the grapevine leaf variety is a considerable challenge [1].

Deep learning algorithms are widely used to build predictive classification models. Convolutional neural networks (CNNs), a class of deep learning algorithms, are commonly applied to image classification and prediction in many fields [5,6,7,8,9,10,11,12]. However, CNNs on their own do not always achieve optimal accuracy. CNNs can perform automated feature extraction without manual crafting, and researchers frequently exploit this property [1, 13, 14].

In this study, we prefer pre-trained architectures over developing novel CNNs because of the challenges involved [15]. The selected pre-trained architectures are DarkNet-53 [16], GoogleNet [17], InceptionResNet_v2 [18], NASNetMobile [19], and ResNet-18 [20], chosen because of their frequent use in the field. These architectures serve as deep feature generators, with their final layers producing feature vectors of varying sizes. However, a subset of these features is noisy, and employing the entire set increases computational complexity. Determining the significance of features and performing feature reduction is therefore a crucial problem. To achieve optimal outcomes in image detection, it is advantageous to combine feature selection techniques with machine learning algorithms. Various feature selection methods are available in the literature, including neighborhood component analysis (NCA) [21], principal component analysis (PCA) [22], Chi-square [23], minimum redundancy maximum relevance (mRMR) [24], and proper orthogonal decomposition (POD) [25]. For example, POD is a statistical method that reduces feature size by determining orthonormal basis functions and time-dependent orthonormal amplitude coefficients; it therefore reduces dimensionality through a linear singular value decomposition. Nevertheless, when applying these feature selection methods, researchers may face computational complexity, restrictive assumptions, or time-consuming processes. In this study, we focus on overcoming these issues, particularly those related to computational complexity. The objective of this paper is to propose novel feature selection methodologies based on sampling theory (SRS, RSS, ERSS, and MERSS) and to analyze their impact on classification performance. To compare our proposed methods with an existing one, we use the grapevine leaves dataset. Experimental results show that our proposed methods are superior to the alternatives. The features reduced by our proposed methods are then classified via an ANN. Our approach therefore constitutes a novel hybrid algorithm combining a CNN, a new feature selection method, and an ANN.

In pursuit of this objective, we employ CNN-based methods to discern and classify grapevine leaf images, thereby assisting in the identification of plant species.

1.1 Novelties and Contribution of this Study

Feature selection is an important and frequently used technique for dimension reduction; it removes irrelevant and redundant information from the dataset to obtain an optimal feature subset. It speeds up data mining algorithms, improves learning accuracy, and enhances model comprehensibility. Research in feature selection remains challenging, and some researchers have doubted its computational feasibility [26]. For these reasons, the topic has become one of the key problems in machine learning and data mining. In this paper, we introduce a new approach to feature selection based on sampling theory and investigate its effectiveness. The second important part of this study is the classification of grapevine leaves. Traditional identification methods for grapevine leaves often rely on the knowledge and experience of experts, and identification is difficult [27].

The main contributions and novelties of this study are as follows:

  • Obtained original grapevine leaf images with five classes (Ak, Ala Idris, Büzgülü, Dimnit, and Nazli) from http://www.muratkoklu.com/datasets/Grapevine_Leaves_Image_Dataset.rar

  • Utilized fivefold cross-validation to obtain reliable outcomes for DarkNet-53, GoogleNet, InceptionResNet_v2, NASNetMobile, and ResNet-18.

  • Compared these pre-trained architectures utilizing the softmax layer for the purpose of classifying grapevine leaves images.

  • Employed these architectures for extracting features from images. Extracted features were obtained specifically from the average pooling layer of the respective architecture.

  • Calculated feature weights via Mahalanobis distance metric and ordered the weights in descending order by an inexpensive method.

  • Ordered the features by their weights and associated feature selection with these weights.

  • Proposed novel feature selection methods based on sampling theory: SRS, RSS, ERSS, MERSS.

  • Identified the number of features and selected important features according to the methodological characteristics of the methods.

  • Classified selected features through artificial neural network (ANN).

  • Investigated performance of classification on the hybrid algorithms from grapevine leaf images.

  • Compared these suggested methods with NCA on the performance of classification.

  • Finally, the highest accuracy is obtained by using DarkNet53-MERSS-ANN hybrid algorithm. Figure 1 shows the pipeline of this presented study.

Fig. 1

Pipeline of this study

1.2 Literature Review

In recent years, scientific investigations have primarily concentrated on the examination of disease identification and species classification through the utilization of leaf images, as documented in current collections of literature [1, 28, 29].

Tiwari, et al. [30] developed a deep learning-based system to detect plant diseases and classify various types. They implemented fivefold cross-validation while training the dataset, which has 27 different classes. As a result, they obtained an average cross-validation accuracy of 99.58% and an average test accuracy of 99.199%.

Ahila Priyadharshini, et al. [31] aimed to identify crop disease from maize leaf images via their proposed convolutional neural network (CNN). In fact, they modified LeNet and trained it on four different classes (three disease classes and one healthy class) from the PlantVillage dataset.

Azim, et al. [32] utilized decision trees, which are one of the machine learning algorithms, to detect three different rice leaf diseases from images. They manually extracted features from images such as color, shape, and texture. Lastly, their study achieved an accuracy of 86.58%.

Sembiring, et al. [33] focused on detecting tomato leaf diseases, grouped into nine different classes, via CNN architectures: very deep convolutional networks (VGG), ShuffleNet, and SqueezeNet. They also included healthy leaves as a separate class to distinguish them from the diseased ones. In total, 10 different classes were utilized and classified with these architectures. Finally, the study obtained the highest accuracy of 97.15%.

Zhang et al. [34] proposed a novel approach to detecting cucumber leaf disease. Firstly, they segmented disease by using K-means clustering, and then they extracted features such as shape and color from lesion information. Lastly, they classified leaf images to detect disease utilizing sparse representation (SR). At the end of the study, they obtained an accuracy of 85.7% with this approach.

Sladojevic, et al. [35] created a model using the CNNs algorithm to distinguish 13 different plant diseases from leaf images via Caffe. Finally, their study achieved an average precision of 96.3%.

Kan, et al. [36] investigated the classification of medicinal plants, which are essential in traditional Chinese medicine, via a support vector machine (SVM). Before the classification stage, image features such as shape and texture were extracted for each of the 12 different leaf types. When the features were classified via SVM, the application achieved an average accuracy of 93.3%.

Koklu, et al. [1] performed grapevine leaf image classification using MobileNetv2, one of the pre-trained convolutional neural network architectures. Their dataset consists of five different classes and 500 grapevine leaf images. They first classified the dataset via MobileNetv2 but did not find it sufficient, so they combined it with an SVM to obtain the best classification results. Prior to this combination, the Chi-Square feature selection method was applied, and the most successful SVM kernel was investigated. At the end of the study, they reported that the best kernel was Cubic, with 250 selected features and an accuracy of 97.60%.

Dudi and Rajesh [37] introduced a novel deep learning hybrid algorithm to identify leaf types. Their algorithm includes enhanced CNN with optimization methods for activation functions and hidden neurons. Their proposed method is the Shark Smell-Based Whale Optimization Algorithm (SS-WOA). Besides, they tested this hybrid algorithm on untrained and collected leaf images and obtained an accuracy of 86%. In addition to these studies, Table 1 displays state-of-the-art studies belonging to leaf image classification.

Table 1 State-of-the-art studies on leaf images in the literature

The rest of this study is organized as follows: the grapevine leaves dataset and the methods used are given in Sect. 2. Next, we present the experimental results, performance metrics, fine-tuning parameters, and cross-validation in Sect. 3. Then, we discuss the advantages and disadvantages of this study in Sect. 4. Finally, we conclude the study and outline future work in Sect. 5.

2 Methods

2.1 Dataset of Grapevine Leaves

Plants play an important role in the world [49]. In nature, there are many plant species, and their detection is difficult and time-consuming [50]. The grapevine leaf is a special plant organ with properties such as shape, thickness, featheriness, and slickness, and detecting its variety by the naked eye is quite hard. Traditional identification methods often rely on the knowledge and experience of experts [27].

Bodor-Pesti et al. [51] summarized the efforts to characterize grapevine leaves metrically, introducing the scientific objectives and reviewing studies showing innovations in phenotyping during the past 120 years. The International Organisation of Vine and Wine (OIV) is one of the most important institutions in the viti-viniculture sector, providing statistical data about the world's viticulture and oenology. It organizes events and shares standardized manuals for the description of grapevine genotypes. In 2009, the OIV published the second edition of the "OIV Descriptor List for Grape Varieties and Vitis Species", which contains more than 150 descriptor traits for the purposes of characterization and identification (for more details, see Bodor-Pesti et al. [51]).

The dataset includes grapevine leaf images with five classes: Ak, Ala Idris, Büzgülü, Dimnit, and Nazli. Figure 2 displays the grapevine leaf classes, which were obtained by traditional identification methods that rely on the knowledge and experience of experts. Upon closer examination of Fig. 2, it becomes apparent that there are no easily discernible differences among the various grapevine leaf classes. Identifying grapevine leaves can therefore be challenging for those without expertise in the field.

Fig. 2

Grapevine leaves classes

In total, there are 500 images, 100 for each class. All images are in RGB (red, green, and blue) format with dimensions of 512 × 512 × 3. This dataset was created by Koklu, et al. [1] and obtained from http://www.muratkoklu.com/datasets/Grapevine_Leaves_Image_Dataset.rar.

In this study, we did not resize the images manually during the preprocessing phase. However, each pre-trained architecture accepts a different input size. Therefore, the images are resized automatically, as part of the data augmentation process, to the input size accepted by each pre-trained architecture. Table 2 displays the input size of each architecture, and a minimal resizing sketch follows the table.

Table 2 Properties of pre-trained architectures
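As an illustration of this resizing step, the snippet below shows one way it could be implemented outside MATLAB; the size table, file extension, and paths are assumptions made for the sketch rather than details taken from the original pipeline.

```python
# Minimal sketch: resize the 512 x 512 x 3 grapevine leaf images to each
# architecture's expected input size. The size table, paths, and the *.png
# extension are illustrative assumptions, not the authors' code.
from pathlib import Path
from PIL import Image

INPUT_SIZES = {
    "DarkNet-53": (256, 256),
    "GoogleNet": (224, 224),
    "InceptionResNet_v2": (299, 299),
    "NASNetMobile": (224, 224),
    "ResNet-18": (224, 224),
}

def resize_for(architecture: str, src_dir: str, dst_dir: str) -> None:
    """Resize every image in src_dir to the input size of the given architecture."""
    target = INPUT_SIZES[architecture]
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    for img_path in Path(src_dir).glob("*.png"):
        Image.open(img_path).convert("RGB").resize(target).save(out / img_path.name)

# Example (hypothetical paths):
# resize_for("ResNet-18", "Grapevine_Leaves/Ak", "resized/resnet18/Ak")
```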

2.2 Deep Feature Extractors

With the explosive growth of data and the rapid development of algorithms such as machine learning and deep learning, artificial intelligence (AI) has achieved breakthroughs in a wide range of applications [52]. Notably, researchers prefer deep learning algorithms for image analysis because of their ability to extract features [1, 6, 48, 53,54,55]. When classifying an image with a classical machine learning algorithm, the image features must be extracted manually through a process known as hand-crafting. This is time-consuming and requires expert knowledge. For images from an arbitrary field, it is difficult to find specialists, so results cannot be obtained rapidly. The feature extraction problem can now be handled by CNNs [6].

In this study, we automatically extract features from grapevine images using the networks DarkNet-53, GoogleNet, InceptionResNet_v2, NASNetMobile, and ResNet-18. Table 2 exhibits the number of parameters, the layers, the input size, and the years in which pre-trained architectures have been developed. Furthermore, the next sub-section will provide a concise presentation of these architectures.

2.2.1 DarkNet-53

Darknet-53, a convolutional neural network (CNN) developed by Redmon and Farhadi [16], is the primary module for extracting features in order to identify objects within the Yolov3 network [56]. The architecture comprises a total of 53 deep convolutional layers, and it is denoted as DarkNet-53 due to the specific count of these layers. Indeed, there exist repetition blocks, resulting in a total number of layers amounting to 106. The specified architecture is designed to accommodate an image input with dimensions of 256 × 256. Table 3 presents comprehensive information regarding the architectures. Furthermore, it has been observed that DarkNet-53 exhibits superior performance in the context of classification and extraction of features within the scope of this investigation.

Table 3 DarkNet-53 details [16]

2.2.2 GoogleNet

The GoogleNet architecture was proposed by Szegedy, et al. [17]. The architecture exhibits a multitude of layers, including two convolutional layers, four max-pooling layers, nine inception layers, a global average pooling layer, a dropout layer, a linear layer, and a softmax layer. GoogleNet is composed of a total of 22 layers, which are deeper in nature. It effectively employs activation layers using the rectified linear unit (ReLU) function. GoogleNet consists of a total of 7 million parameters.

2.2.3 InceptionResNet

The InceptionResNet_v2 model is a fusion of the ResNet and Inception architectures, as proposed by Szegedy, et al. [18, 57]. The architecture employs residual connections efficiently, rather than filter concatenation, to enhance performance and accelerate training [58].

2.2.4 NasNetMobile

The NASNetMobile architecture, developed by Zoph, et al. [19], aims to explore optimal CNN structures using reinforcement learning methods. The Google Brain team, as presented by Addagarla, et al. [59], has made significant advancements in Neural Architecture Search (NAS). While NAS architectures vary in size, NasNetMobile represents a scaled-down version. NASNetMobile has approximately 4.5 million parameters, and its accepted input image size is 224 × 224 pixels.

2.2.5 ResNet-18

The architecture known as ResNet-18, as described by He, et al. [20], consists of a total of 72 layers, with 18 of them being deep layers. Moreover, it was developed in the year 2016. This architecture aims to efficiently provide a multitude of convolutional layers for functioning. The core principle of ResNet involves implementing skip connections, commonly referred to as shortcut connections. During this iterative process, the interconnection compresses the underlying structure, leading to accelerated learning within the network. The structure is recognized as a directed acyclic graph (DAG) network due to its intricate layered configuration [60].

These architectures are employed both for classification and for generating deep features. When used as classifiers, the softmax layer is applied. When used as deep feature generators, the softmax layer is omitted and the deep features are obtained from the last layers, typically the pooling layers (a minimal extraction sketch is given at the end of this subsection). For example, DarkNet-53 yields a feature vector of dimension 1024. However, the number of features must be reduced to attain optimal performance. In this study, our objective is to design innovative feature selection techniques.
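As a rough illustration of this feature-generation step, the sketch below extracts average-pooling features from a pre-trained ResNet-18 in PyTorch; this is only an assumed stand-in for the MATLAB workflow used in the study, not the authors' code.

```python
# Rough PyTorch/torchvision stand-in for the deep feature-generation step
# (not the authors' code): extract the global-average-pooling output of a
# pre-trained ResNet-18, giving one 512-dimensional feature vector per image.
import torch
from torchvision import models, transforms
from PIL import Image

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.eval()
# Everything up to and including avgpool, i.e. the network without its fc layer.
backbone = torch.nn.Sequential(*list(model.children())[:-1])

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def extract_features(image_path: str) -> torch.Tensor:
    """Return the 512-dim average-pooling feature vector for one leaf image."""
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return backbone(x).flatten(1).squeeze(0)      # shape: (512,)
```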

2.3 A Novel Feature Selection Approach Based on Sampling Theory

In this section, we aim to address the problems noted in the Introduction, with special emphasis on computational complexity. Therefore, we present novel feature selection methodologies, SRS, RSS, ERSS, and MERSS, based on sampling theory to improve the classification performance on grapevine leaf images. In the following sub-sections, we introduce the proposed methods and the overall algorithm.

2.3.1 Simple Random Sampling (SRS)

Simple random sampling (SRS) is a very common sampling design used by many researchers because of its practicality. New sampling designs have also been suggested in the literature to improve precision. One of these designs, ranked set sampling (RSS), was first proposed by McIntyre [61] as an alternative to SRS. When both sampling designs are compared for the same sample size, RSS becomes more efficient than SRS as long as an accurate and accessible ranking criterion is available, which is the case here for increasing grapevine leaf classification performance. For a detailed literature review, see Zamanzade and Mahdizadeh [62], Bouza-Herrera and Al-Omari [63], and Koyuncu and Al-Omari [64]. In the big data literature, there are many important studies that use sampling designs to reduce computational complexity, handle imbalanced data, and increase precision [65, 66]. In this study, following these ranked set sampling designs, we propose new procedures for selecting features based on their weights; a minimal SRS-based selection sketch is given below.
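The following sketch reflects our reading of SRS-based feature selection (drawing n feature indices at random, without replacement); the feature count in the example is illustrative only.

```python
# Minimal sketch of SRS-based feature selection under our reading of the design
# (not the authors' code): draw n feature indices at random, without replacement.
import numpy as np

def srs_select(num_features: int, n: int, seed: int = 0) -> np.ndarray:
    """Return n feature indices chosen by simple random sampling."""
    rng = np.random.default_rng(seed)
    return rng.choice(num_features, size=n, replace=False)

# Example (illustrative): pick 36 of the 1024 DarkNet-53 features.
# selected = srs_select(1024, 36)
```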

2.3.2 Ranked Set Sampling (RSS) Procedure for Feature Selection

Following McIntyre [61], we can define a new procedure for feature selection as:

  1. Order the weights in descending order by an inexpensive method.

  2. Select n sets of features, each of size n, from the ordered list.

  3. Measure accurately the first ordered feature from the first set, the second ordered feature from the second set, and so on, until the n-th ordered feature from the last (n-th) set is measured.

Note that this procedure selects Integer(sqrt(weight_size)) features, where weight_size denotes the total number of feature weights. A minimal sketch of the procedure is given below.
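Under these steps, and with n taken as Integer(sqrt(weight_size)) unless specified otherwise, the selection could be sketched as follows (our interpretation, not the authors' code):

```python
# Minimal sketch of the RSS procedure above (our interpretation, not the
# authors' code). Weights are sorted in descending order, the first n*n of them
# form n sets of size n, and the i-th ranked feature is taken from the i-th set.
import numpy as np

def rss_select(weights: np.ndarray, n: int = 0) -> np.ndarray:
    """Return indices of the features chosen by the RSS procedure."""
    order = np.argsort(weights)[::-1]          # feature indices, best weight first
    if n <= 0:
        n = int(np.sqrt(weights.size))         # Integer(sqrt(weight_size)), per the note
    sets = order[: n * n].reshape(n, n)        # n sets of n features each
    return sets[np.arange(n), np.arange(n)]    # i-th ordered feature of the i-th set

# Example (illustrative): selected = rss_select(feature_weights)
```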

2.3.3 Extreme Ranked Set Sampling (ERSS) Procedure for Feature Selection

When the set size n is large, RSS may suffer from ranking errors. Several variations of RSS have been proposed to overcome this problem. The main idea of ERSS is that identifying the maximum rank is much easier than determining all the ranks [67]. We define a new procedure, called the ERSS procedure, for feature selection as follows:

  1. Order the weights in descending order by an inexpensive method.

  2. Select n sets of features, each of size n, from the ordered list.

  3. Measure accurately the maximum ordered feature from the first set, then the maximum ordered feature from the second set, and so on, until the maximum ordered feature from the last (n-th) set is measured.

Note that this procedure selects the same number of features as RSS; unlike the classical ERSS defined by Samawi, et al. [67], our approach only takes maximum values into account instead of minimum values. A minimal sketch is given below.
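A corresponding sketch of the ERSS variant, again our interpretation rather than the authors' code, differs from the RSS sketch only in keeping the maximum-weight feature of each set:

```python
# Minimal sketch of the ERSS variant above (our interpretation, not the
# authors' code): the sets are built as in RSS, but only the maximum-weight
# feature of each set is kept.
import numpy as np

def erss_select(weights: np.ndarray, n: int = 0) -> np.ndarray:
    """Return indices of the maximum-ranked feature from each of the n sets."""
    order = np.argsort(weights)[::-1]
    if n <= 0:
        n = int(np.sqrt(weights.size))
    sets = order[: n * n].reshape(n, n)
    return sets[:, 0]      # weights are descending, so column 0 is each set's maximum
```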

2.3.4 Moving Extreme Ranked Set Sampling (MERSS) Procedure for Feature Selection

Another modification of RSS, namely moving extreme ranked set sampling (MERSS), was introduced by Al-Odat and Al-Saleh [68]. Following Al-Odat and Al-Saleh [68], we suggest the following procedure to select features based on their weights (see the sketch after this list):

  • Order the weights in descending order by an inexpensive method.

  • Select n sets of features of sizes 1, 2, 3, …, n, respectively, from the ordered list.

  • Measure accurately the maximum ordered feature from the first set, the maximum ordered feature from the second set. The process continues in this way until the maximum ordered feature from the last n-th set is measured.

This modification of RSS, in addition to being easier to execute than both the usual RSS and fixed-size extreme RSS, keeps some of the balance inherent in the usual RSS. Hence, the MERSS algorithm exhibits superior efficiency compared to the other methods in the task of feature selection, resulting in significantly enhanced performance in the classification of grapevine leaf images.
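A minimal sketch of the MERSS procedure, as we interpret the steps above, is given below; the set size n is a free parameter here, and the sketch assumes n(n + 1)/2 does not exceed the number of available features.

```python
# Minimal sketch of the MERSS procedure above (our interpretation, not the
# authors' code): consecutive sets of sizes 1, 2, ..., n are formed from the
# descending-ordered weights and the maximum-weight feature of each set is kept.
import numpy as np

def merss_select(weights: np.ndarray, n: int) -> np.ndarray:
    """Return the index of the maximum-weight feature from each moving set."""
    order = np.argsort(weights)[::-1]
    selected, start = [], 0
    for size in range(1, n + 1):
        subset = order[start:start + size]
        selected.append(subset[0])     # first element of each set is its maximum
        start += size
    return np.array(selected)
```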

According to our proposed feature selection methods, we select significant features, which are subsequently subjected to classification using an ANN with high efficiency.

2.4 Classification via Artificial Neural Network (ANN)

Artificial neural networks (ANNs), as introduced by McCulloch and Pitts [69], emerged as a result of studying brain functionality and subsequently found application in computer programs [6, 70, 71]. In addition, it is important to note that any ANN comprises numerous individual units, commonly referred to as neurons or processing elements (PE). These units are interconnected through weights, which facilitate the neural structure of the network. Furthermore, these interconnected units are typically organized in layers to ensure proper coordination and functioning of the ANN [72].

ANNs come from a successful lineage of nonlinear algorithms. When used for machine learning, particularly supervised learning, they have achieved considerable success in recent times. Additionally, artificial neural networks (ANNs) possess a flexible architecture that can be effectively employed for a wide range of real-world datasets [73]. The reader may refer to the book by H. Jiang (2021) for further details.

Based on the aforementioned considerations, an ANN is implemented in this study for the efficient classification of grapevine leaf images. This decision is based on the fundamental principles established by Ozaltin, et al. [6]. The chosen ANN configuration has 100 hidden layers with 5 neurons each, followed by a softmax layer. Furthermore, a ReLU activation layer is employed to enhance the efficiency of the algorithm. The limited-memory Broyden–Fletcher–Goldfarb–Shanno (LBFGS) optimization algorithm, from the quasi-Newton family, is selected as the training solver. Additionally, the maximum number of iterations is set to 1000, while the learning rate, minimum gradient tolerance, and loss tolerance are set to 0.02, 1e-4, and 1e-5, respectively. By carefully choosing these parameters, improved classification performance can be achieved. A rough analogue of this configuration is sketched below.
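For illustration, a rough scikit-learn analogue of this configuration is given below; the study itself was implemented in MATLAB, so the parameter names and behaviour (e.g., a single tolerance value, the learning rate being ignored by LBFGS) only approximate the described setup.

```python
# Rough scikit-learn analogue of the ANN configuration described above (an
# assumed equivalent, not the authors' code).
from sklearn.neural_network import MLPClassifier

ann = MLPClassifier(
    hidden_layer_sizes=(5,) * 100,   # 100 hidden layers with 5 neurons each, as stated
    activation="relu",               # ReLU activation
    solver="lbfgs",                  # limited-memory BFGS, quasi-Newton family
    max_iter=1000,                   # maximum number of iterations
    tol=1e-5,                        # sklearn exposes a single tolerance value
    learning_rate_init=0.02,         # stated learning rate; ignored by the lbfgs solver
    random_state=0,
)
# Usage (hypothetical names): ann.fit(selected_train_features, train_labels)
```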

2.5 Performance Metrics

In this study, accuracy, area under the curve (AUC), F1-measure, geometric mean (G-mean), kappa value, precision, and recall are used to measure the algorithms' performance; accuracy, F1-measure, recall, precision, and G-mean are expressed in Eqs. (1)–(5) as follows:

$$\text{Accuracy} = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{TN} + \text{FP} + \text{FN}}$$
(1)
$$F1\text{-Measure} = \frac{2 \times \text{TP}}{2 \times \text{TP} + \text{FP} + \text{FN}}$$
(2)
$$\text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}}$$
(3)
$$\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}$$
(4)
$$G\text{-Mean} = \sqrt{\text{Sensitivity} \times \text{Specificity}}$$
(5)

where \(\text{TP}\) denotes true positives, \(\text{FP}\) false positives, \(\text{TN}\) true negatives, and \(\text{FN}\) false negatives [8, 71, 74, 75].

The above performance metrics are widely employed to evaluate classifier performance. In this study, in addition to these metrics, the kappa value (\(\kappa\)) is computed to judge whether the algorithms' performances are acceptable. If \(\kappa\) is close to 1, the results can be regarded as perfect; if \(\kappa\) is close to 0, the results are unacceptable [76]. Equations (6) and (7) are used to evaluate \(\kappa\), as shown in Eq. (8).

$$p_{\text{A}} = \text{Accuracy}$$
(6)
$$p_{\text{E}} = \frac{(\text{TP} + \text{TN})(\text{TP} + \text{FP}) + (\text{FP} + \text{TN})(\text{TP} + \text{FN})}{(\text{TP} + \text{TN} + \text{FP} + \text{FN})^{2}}$$
(7)
$$\kappa = \max\left(\frac{p_{\text{A}} - p_{\text{E}}}{1 - p_{\text{E}}};\; \frac{p_{\text{E}} - p_{\text{A}}}{1 - p_{\text{A}}}\right)$$
(8)
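For reference, Eqs. (1)–(8) can be computed directly from the confusion-matrix counts; the sketch below does so for a binary (one-vs-rest) case and is only an illustrative implementation.

```python
# Illustrative implementation of Eqs. (1)-(8) from the confusion-matrix counts
# of a binary (one-vs-rest) problem; in the multi-class setting these values
# would be computed per class and averaged.
from math import sqrt

def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total                                         # Eq. (1)
    f1 = 2 * tp / (2 * tp + fp + fn)                                     # Eq. (2)
    recall = tp / (tp + fn)                                              # Eq. (3), sensitivity
    precision = tp / (tp + fp)                                           # Eq. (4)
    specificity = tn / (tn + fp)
    g_mean = sqrt(recall * specificity)                                  # Eq. (5)
    p_a = accuracy                                                       # Eq. (6)
    p_e = ((tp + tn) * (tp + fp) + (fp + tn) * (tp + fn)) / total ** 2   # Eq. (7)
    kappa = max((p_a - p_e) / (1 - p_e), (p_e - p_a) / (1 - p_a))        # Eq. (8)
    return {"accuracy": accuracy, "f1": f1, "recall": recall, "precision": precision,
            "specificity": specificity, "g_mean": g_mean, "kappa": kappa}
```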

2.6 Cross Validation

Deep learning algorithms cannot be expressed coherently as explicit mathematical models, so it is not clear how inputs are transformed into outputs [77]. Hence, such algorithms are referred to as black boxes. The k-fold cross-validation method is commonly preferred by researchers to obtain reliable outcomes [6, 78,79,80]. Moreover, it effectively mitigates overfitting during data analysis [80]. In this procedure, the dataset is randomly partitioned into k subsets; one subset is designated as the test set, while the remaining subsets are used for training [81]. The procedure is repeated for k folds and evaluated using the proposed framework [82]. In this study, k is set to 5 to ensure reliable classification outcomes. A minimal sketch of this partitioning is shown below.
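The sketch below illustrates the fivefold split using scikit-learn's StratifiedKFold as an assumed stand-in for the MATLAB partitioning (not the authors' code).

```python
# Minimal sketch of fivefold cross-validation with stratified splits; an
# assumed stand-in for the MATLAB partitioning, not the authors' code.
import numpy as np
from sklearn.model_selection import StratifiedKFold

def five_fold_indices(labels: np.ndarray, seed: int = 0):
    """Yield (train_idx, test_idx) index pairs for k = 5 stratified folds."""
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    dummy_x = np.zeros((len(labels), 1))   # features are not needed to build the split
    yield from skf.split(dummy_x, labels)

# Example with the 500-image, 5-class dataset (100 images per class):
# labels = np.repeat(np.arange(5), 100)
# for train_idx, test_idx in five_fold_indices(labels):
#     ...  # train on train_idx, evaluate on test_idx
```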

2.7 Fine Tuning Parameters

In this study, the determination of fine-tuning parameters is crucial for attaining optimal outcomes and ensuring a fair comparison among DarkNet-53, GoogleNet, NasNetMobile, InceptionResNet_v2, and ResNet-18. The fine-tuning parameters are as follows: stochastic gradient descent with momentum (sgdm) is used as the optimizer, the learning rate is set to 0.0001, the maximum number of epochs is 10, the mini-batch size is 8, and a constant learning-rate schedule is utilized. An illustrative sketch of this setup is given below.
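The following PyTorch sketch restates this training configuration; the momentum value of 0.9 is an assumption (the text specifies only "sgdm"), while the learning rate, epoch count, mini-batch size, and constant schedule follow the text.

```python
# Illustrative PyTorch sketch of the fine-tuning setup (the study used MATLAB's
# "sgdm" solver; the 0.9 momentum below is an assumption).
import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset

def fine_tune(model: nn.Module, train_set: Dataset) -> nn.Module:
    loader = DataLoader(train_set, batch_size=8, shuffle=True)   # mini-batch size 8
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    model.train()
    for _ in range(10):                                          # max 10 epochs, constant LR
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model
```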

Following the description of the methods and parameters, the pseudocode covering the entire algorithm is given below.

Algorithm

The procedure of this study for detecting grapevine leaf image types is as follows:

  1. Start with the input color grapevine leaf images (512 × 512 × 3).

  2. Resize the images according to each architecture's accepted input dimension.

  3. Train the images with fivefold cross-validation using DarkNet-53, GoogleNet, InceptionResNet_v2, NasNetMobile, and ResNet-18 with the fine-tuning parameters.

  4. Calculate performance metrics to compare the algorithms.

  5. Save the network.

  6. Split the images into 70% training and 30% testing.

  7. Activate the network.

  8. Use DarkNet-53, GoogleNet, InceptionResNet_v2, NasNetMobile, and ResNet-18 to reduce the high dimensionality of the images and extract features from the average pooling layer.

  9. Obtain "n" feature weights via the Mahalanobis distance metric.

  10. Apply the suggested methods (ERSS, MERSS, RSS, and SRS) and NCA.

  11. Take the selected features.

  12. Classify with ANN to detect the types of grapevine leaf images.

  13. Calculate performance metrics.

  14. Save the results.

  15. Find the best structure according to the performance metrics.

  16. End.

3 Results

3.1 Experimental Results

The experiments were carried out in MATLAB 2021b. The primary objective of this study is to propose new feature selection methods and to identify grapevine leaf types using the developed hybrid algorithms. First, the dataset is acquired from the public website indicated in the dataset section. The models DarkNet-53, GoogleNet, NasNetMobile, InceptionResNet_v2, and ResNet-18 are utilized as automatic feature extractors using a fivefold cross-validation and transfer learning approach, with each model evaluated individually. In the subsequent procedure, we derive feature weights from the final layer of the relevant architecture. The features are then chosen using the recommended techniques: ERSS, MERSS, RSS, and SRS. Meanwhile, we employ an ANN to classify all the selected features. Table 4 demonstrates the performance of these pre-trained architectures in classifying grapevine leaf images using fivefold cross-validation.

Table 4 Performance of pre-trained architectures on grapevine leaf images using fivefold cross-validation

Based on the data presented in Table 4, it can be observed that the utilization of pre-trained architectures has resulted in highly successful performances. The empirical findings suggest that DarkNet-53 achieves the most optimal performance, exhibiting an accuracy rate of 96.20% during this particular stage. The second model is ResNet-18, which exhibits a commendable accuracy of 95%. Following the series of performances, InceptionResNet_v2, NASNetMobile, and GoogleNet exhibit accuracies of 89.8%, 88.2%, and 86% in that order.

Despite the satisfactory nature of the performances, we employ the feature selection technique and machine learning algorithm to ascertain the credibility of the research findings. In the current phase of this study, we have partitioned the dataset into a 70% training subset and a 30% testing subset subsequent to training the dataset using the relevant architecture. Subsequently, we extract features from the last layer (which varies depending on the specific architecture) and subsequently employ our proposed methodologies, namely ERSS, MERSS, RSS, and SRS, to select the desired features. The classification stage has commenced using an ANN, and we have evaluated all hybrid structures. The outcomes of these evaluations are documented in Tables 5, 6, 7, 8, 9.

Table 5 Hybrid algorithm performance with DarkNet53, suggested methods, and ANN
Table 6 Hybrid algorithm performance with GoogleNet, suggested methods, and ANN
Table 7 Hybrid algorithm performance via NasNetMobile, suggested methods, and ANN
Table 8 Hybrid algorithm performance using InceptionResNet, suggested methods, and ANN
Table 9 Hybrid algorithm performance using ResNet-18, suggested methods, and ANN

When extracting features from DarkNet-53, it has been observed that the global average pooling layer, commonly referred to as 'gap', proves to be beneficial. A total of 1024 features are acquired from the layer, and a subset of 36 features is selected using the ERSS, RSS, and SRS techniques. Furthermore, a total of 45 significant features have been determined using the MERSS method. The output is presented in Table 5.

Based on the findings presented in Table 5, the application of our proposed methodologies yields efficient results. It is worth noting that the performances obtained are in the form of test results. We are pleased to report that these results exhibit a high level of confidence in accurately identifying different types of grapevine leaf. The DarkNet53-MERSS-ANN algorithm has achieved the highest test accuracy of 97.33% along with other metrics. Additionally, the kappa value approaches 1, indicating that the algorithm can be considered highly successful. Furthermore, the performance is enhanced with the proposed methodology. In the previous iteration, the DarkNet-53 model demonstrated a classification accuracy of 96.20%. Furthermore, the algorithm employed in this study yields the utmost test accuracy when tasked with classifying grapevine leaf images. Additionally, Fig. 3 depicts the confusion matrix obtained from the DarkNet53-MERSS-ANN algorithm. Figure 4 displays the ANN training accuracy and loss graph after the feature selection process.

Fig. 3

Confusion matrix of DarkNet53-MERSS-ANN algorithm

Fig. 4

ANN with 5 neurons per layer and 100 hidden layers

The next model considered is GoogleNet. When features are extracted from the GoogleNet model, the average pooling layer named 'pool5-7x7_s1' is observed to be useful. A total of 1024 features are extracted from this layer, and the 36 most significant features are then selected using the ERSS, RSS, and SRS techniques. Furthermore, a total of 45 significant features are identified using the MERSS method. The output of the computation is displayed in Table 6.

Table 6 indicates that the GoogleNet-RSS-ANN algorithm, with a test accuracy of 95.33%, is the most accurate of the methods we have proposed. The values for the sensitivity, G-Mean, F-measure, kappa value, and AUC are 95.33%, 97.07%, 0.9533, 0.8542, and 0.9946, respectively. In addition, the kappa value is close to 1, which indicates that the algorithm is quite successful. Moreover, the performance is enhanced by the suggested method. Previously, the accuracy of GoogleNet was 86%.

The next model is NasNetMobile. To collect features from NasNetMobile, the global average pooling layer denoted 'global_average_pooling2d_1' can be employed. A total of 1056 features are extracted from this layer, and a subset of 36 features is selected using the ERSS, RSS, and SRS methods. Additionally, the MERSS method is utilized to identify a total of 45 crucial features. The outcomes are presented in Table 7.

Table 7 shows that, among our suggested methods, the NasNetMobile-MERSS-ANN algorithm performs best, with a test accuracy of 79.33%. Furthermore, its sensitivity, G-Mean, F-measure, kappa value, and AUC are 79.33%, 86.74%, 0.7894, 0.3542, and 0.9375, respectively. The kappa value is far from 1, which indicates that this algorithm is not preferable. NasNetMobile previously had an accuracy of 88.20%.

The next architecture is InceptionResNet_v2. When features are obtained from it, the global average pooling layer known as 'avg-pool' is found to be useful. Essentially, 1536 features are extracted from this layer, and the 53 most significant features are selected using the ERSS, RSS, and SRS methods. The MERSS method identifies 55 essential features. Table 8 displays all of the results.

Table 8 shows that, when using our recommended techniques, two algorithms, InceptionResNet-ERSS-ANN and InceptionResNet-MERSS-ANN, perform best, with a test accuracy of 92.67%. Additionally, they achieve the same values for sensitivity, G-Mean, F-measure, kappa value, and AUC, which are 92.67%, 95.38%, 0.9266, 0.7708, and 0.9953, respectively. InceptionResNet previously achieved an accuracy of 89.90%.

ResNet-18 is the last architecture. When features are taken from it, the 'pool5' average pooling layer is found to be useful. In essence, 512 features are collected from this layer, and the 24 most significant features are chosen using the ERSS, RSS, and SRS methods. The MERSS method selects 32 significant features. Table 9 displays the complete results.

Analyzing our suggested methods, Table 9 shows that ResNet18-MERSS-ANN performs best, with a test accuracy of 85.33%. Additionally, its sensitivity, G-Mean, F-measure, kappa value, and AUC are 85.33%, 90.67%, 0.8531, 0.5417, and 0.9761, respectively. Prior to this, ResNet-18 achieved a 95% accuracy rate.

In recent years, numerous researchers have employed various feature selection methods in their studies [1, 6, 83]. However, identifying a suitable feature selection method is not straightforward, as certain methods rely on underlying assumptions. This study proposes several practical methods and compares them with neighborhood component analysis (NCA), a nonparametric method that operates without any assumptions. The results of the various combinations with NCA are presented in Table 10.

Table 10 Performance of hybrid algorithm based on NCA and ANN combination

Based on the findings presented in Table 10, DarkNet53-NCA-ANN emerges as the most effective NCA-based combination, with a notable accuracy rate of 96.67%. In addition, its sensitivity, G-Mean, F-measure, kappa value, and AUC are 96.67%, 97.90%, 0.9664, 0.8958, and 1.00, respectively. The confusion matrix of the DarkNet53-NCA-ANN model is depicted in Fig. 5.

Fig. 5

Confusion matrix of DarkNet53-NCA-ANN

DarkNet-53 architecture is employed as a deep feature extractor for grapevine leaf images. To enhance the performance of the feature extraction process, the most effective feature selection method, known as MERSS, is utilized. Through the application of MERSS on the extracted features, a notable accuracy rate of 97.33% is achieved. In contrast to NCA, MERSS demonstrates superior performance. Hence, it can be asserted that the feature selection method we have developed exhibits the highest level of performance.

GoogleNet is used as a deep feature extractor for grapevine leaf images. The feature selection method that yields the most successful result is RSS, which, applied to the extracted features, achieves an accuracy of 95.33%. In this setting, NCA performs worse than RSS. Hence, with GoogleNet as the feature extractor, our proposed method is again the best choice.

InceptionResNet_v2 is implemented as a deep feature extractor for grapevine leaf images. The successful feature selection methods are MERSS and ERSS, which operate on the extracted features and achieve an accuracy of 92.67%. NCA performs worse than MERSS and ERSS in this setting, so our methods are again the best choice when InceptionResNet_v2 is used.

NasNetMobile operates as a deep feature extractor for grapevine leaf images. The top feature selection method is MERSS, which is applied to the extracted features and results in an accuracy of 79.33%. Compared with NCA, MERSS exhibits better performance. In conclusion, when utilizing NasNetMobile, our solution proves to be the most effective once again.

ResNet-18 is utilized as a deep feature extractor for grapevine leaf images. The feature selection method employed is MERSS, which is applied to the extracted features and results in an accuracy of 85.33%. Compared with NCA, MERSS has lower performance in this case. Based on the results as a whole, it can be concluded that MERSS performs well when combined with the deep feature extractors and, as mentioned earlier, generally outperforms NCA.

4 Discussion

Some advantages and disadvantages are discussed in this section of the study. The following are the primary benefits of the study:

  (i) Extensive comparisons are made using pre-trained architectures such as DarkNet-53, GoogleNet, NasNetMobile, InceptionResNet_v2, and ResNet-18.

  (ii) To achieve reliable results, each architecture uses fivefold cross-validation, and all results are measured using accuracy, sensitivity, G-Mean, F1-measure, kappa value, and AUC.

  (iii) To improve classification performance on grapevine leaf images, pre-trained architectures are used as automatic deep feature extractors from the final layers (pooling layers are used). This is critical because expert opinions are not required.

  (iv) To reduce dimensionality and select significant features, we propose novel sampling theory-based methods that ensure reliable study results.

  (v) Finally, the ANN, with 100 hidden layers of 5 neurons each, is an excellent classification algorithm for detecting different types of grapevine leaf images. In addition, to evaluate the proposed methods, we compared them to NCA, a widely used feature selection method. Our proposed methods outperform NCA.

The following are the study's drawbacks:

  (i) The grapevine leaf images are limited in number and are investigated only as a balanced dataset.

5 Conclusion

Deep learning implementations continue to improve, and this progress is increasingly visible in agriculture and vegetation-related applications. With this viewpoint, we have successfully extracted features from images of grapevine leaves in order to categorize their species using pre-trained architectures. First, using fivefold cross-validation, DarkNet-53, GoogleNet, InceptionResNet_v2, NasNetMobile, and ResNet-18 are used to classify the grapevine leaf images directly. The accuracy of DarkNet-53 in this part of the study is 96.20%. Although the results are good, we investigate how they can be improved and propose new feature selection techniques based on sampling theory. In the next part of this study, the pre-trained architectures are used as feature extractors, and features are taken automatically from their final average pooling layers. Not all of these features, however, carry crucial information about the images. We propose SRS, RSS, ERSS, and MERSS as four feature selection methods to choose significant features. Additionally, the methodology of each sampling design determines how many features are chosen, which allows effective classification with a minimal set of features. Finally, an ANN is used to classify these features. In brief, the following outcomes are attained:

DarkNet-53 is used as a deep feature extractor, and the MERSS feature selection method with an ANN classifier yields a maximum accuracy of 97.33%. When GoogleNet is used as a deep feature extractor, the RSS feature selection method with an ANN classifier achieves the highest accuracy, 95.33%. An accuracy of 92.67% is attained when InceptionResNet_v2 is used as a deep feature extractor with the ERSS and MERSS feature selection methods and an ANN classifier. The accuracy reaches 79.33% when NasNetMobile is used as a deep feature extractor in combination with the MERSS feature selection method and an ANN classifier. With ResNet-18 as a deep feature extractor, the MERSS feature selection method and ANN classifier give an accuracy of 85.33%. Finally, a comparison with NCA has been made, and our suggested methods, specifically the MERSS method, are superior to it under comparable circumstances. As a result, performance is effectively improved by the proposed hybrid algorithms, and the results are reliable. Using DarkNet53-MERSS-ANN, the study's best performance is achieved, with an accuracy of 97.33% when identifying different grapevine leaf types from images. Finally, we can affirm that the structure we created performs excellently.

5.1 What Happens in the Next Study?

This study demonstrates how pre-trained architectures can identify different plant species from images. Image classification of plants therefore provides automatic species identification for experts, farmers, and researchers. Moreover, thanks to the suggested algorithm, people no longer need to spend a great deal of time identifying plant species. We can say that the created structure is capable of performing plant detection and can be advanced further. In future work, we will apply the suggested feature selection techniques to a variety of datasets from various fields.