A Novel Feature Selection Approach Based on Sampling Theory for Grapevine Images Using Convolutional Neural Networks

Feature selection, i.e., reducing the number of input variables used to develop a classification model, is an important process for reducing computational and modelling complexity, and it affects the performance of image processing. In this paper, we propose new statistical approaches for feature selection based on sample selection. We apply our new approaches to grapevine leaves data, in which properties of shape, thickness, featheriness, and slickness are investigated in images. To analyze such data via image processing, thousands of features are created, and the selection of features plays an important role in predicting the outcome properly. In our numerical study, Convolutional Neural Networks (CNNs) are used as feature extractors, and features are then obtained from the last average pooling layer to detect the type of grapevine leaves from images. These features are reduced using our four suggested statistical methods: simple random sampling (SRS), ranked set sampling (RSS), extreme ranked set sampling (ERSS), and moving extreme ranked set sampling (MERSS). The selected features are then classified with an Artificial Neural Network (ANN), and we obtain a best accuracy of 97.33% with our proposed approaches. Based on our empirical analysis, the proposed approach exhibits efficacy in the classification of grapevine leaf types. Furthermore, it possesses the potential for integration into various computational devices.


Introduction
The annual grapevine leaves harvest yields an additional product for the field of agriculture, especially in Turkey. The classification of grapevine leaves holds significance with regard to their valuation and flavor characteristics (Koklu et al., 2022, Saglam and Saglam, 2018). Various kinds of grapevine leaves exhibit distinct leaf attributes, encompassing criteria such as shape, depth, length, featheriness, and slickness, which display considerable variation (Cangi and Yağci, 2017, Göktürk et al., 1997, Koklu et al., 2022). For this reason, not every variant of grapevine leaves is utilized for culinary purposes. Consumers exhibit a lack of preference towards leaves that possess a substantial thickness, have feathers, and are heavily sliced. The optimal choice for culinary purposes is a grapevine leaf cultivar that possesses a slender morphology, is devoid of feathers, exhibits delicate venation, is sliced most thinly, and imparts a tangy sensation upon the taste receptors (Koklu et al., 2022). Hence, the categorization of consumable versus non-consumable grapevine leaf species, and their identification based on leaf and fruit images, are crucial prerequisites in this domain. However, for individuals lacking specialized knowledge, discerning the type of grapevine leaves poses a considerable challenge (Koklu et al., 2022).

Deep learning algorithms have been utilized to develop predictions in classification models. Convolutional Neural Networks (CNNs) are a variant of deep learning algorithms, and they are widely used for image classification or prediction in many fields (Ozaltin et al., 2023a-2023b, Ozaltin and Yeniay, 2023, Singh et al., 2022, Karadal et al., 2021). However, it is important to note that CNNs by themselves may not always yield optimal levels of accuracy. CNNs possess the capability to perform automated feature extraction without the need for manual crafting, and researchers thus benefit from this property (Adeel et al., 2022, Koklu et al., 2022, Nguyen et al., 2021).
In this study, we prefer to use pre-trained architectures as opposed to developing novel CNNs due to the challenges encountered in doing so (Hazirbas et al., 2016). The selected pre-trained architectures include DarkNet-53 (Redmon and Farhadi, 2018), GoogleNet (Szegedy et al., 2015), InceptionResNet_v2 (Szegedy et al., 2017), NASNetMobile (Zoph et al., 2018), and ResNet-18 (He et al., 2016). These choices are made due to their frequent utilization in the field. These architectures serve as deep feature generators, wherein their final layers produce features of varying sizes. However, a subset of these features exhibits noise, and employing the entire set results in increased computational complexity.
The task of determining the significance of features and performing feature reduction is a highly crucial problem. In order to achieve optimal outcomes in image detection, it is advantageous to leverage feature selection techniques and machine learning algorithms. Various feature selection methods can be employed for this purpose in the literature, including Neighborhood Component Analysis (NCA) (Goldberger et al., 2004), Principal Component Analysis (PCA) (Liu and Durlofsky, 2021), Chi-Square (Hussein and Özyurt, 2021), minimum redundancy maximum relevance (mRMR) (Toğaçar et al., 2020), and others. Nevertheless, when applying these feature selection methods, researchers may face challenges related to computational complexity, assumptions, or time-consuming processes. In this study, we focus on overcoming these issues, particularly the challenges associated with computational complexity. The objective of this paper is to propose novel methodologies for feature selection utilizing sampling theory (SRS, RSS, ERSS, and MERSS) and to analyze their impact on the performance of classification models. In order to conduct a comparative analysis between our proposed and existing methods, we utilize the grapevine leaves dataset. Experimental results show that our proposed method is superior to the others. The features reduced using our proposed method are then classified via ANN. Therefore, our proposed approach develops a novel hybrid algorithm based on a CNN, a new feature selection method, and an ANN.
In pursuit of our objective, we employ CNN-based methods to discern and classify grapevine leaf images, thus helping with the identification of plant species.

Novelties and Contribution of this Study
The main contributions and novelties of this study are as follows:
Obtained original grapevine leaf images with five classes (Ak, Ala Idris, Büzgülü, Dimnit, and Nazli) from the website http://www.muratkoklu.com/datasets/Grapevine_Leaves_Image_Dataset.rar
Utilized 5-fold cross-validation to obtain reliable outcomes for DarkNet-53, GoogleNet, InceptionResNet_v2, NASNetMobile, and ResNet-18.
Compared these pre-trained architectures utilizing the SoftMax layer for the purpose of classifying grapevine leaf images.
Employed these architectures for extracting features from images. Extracted features were obtained specifically from the average pooling layer of the respective architecture.
Proposed novel feature selection methods based on sampling theory: SRS, RSS, ERSS, and MERSS.
Identified the number of features and selected important features using these methods.
Classified the selected features through an Artificial Neural Network (ANN).
Investigated the classification performance of the hybrid algorithms on grapevine leaf images.
Compared these suggested methods with NCA in terms of classification performance.
Finally, the highest accuracy is obtained by using the DarkNet53-MERSS-ANN hybrid algorithm. Figure 1 shows the pipeline of this study.

Literature Review
In recent years, scientific investigations have primarily concentrated on disease identification and species classification through the utilization of leaf images, as documented in the current literature (Lilhore et al., 2022, Koklu et al., 2022, Atila et al., 2021). Tiwari et al. (2021) developed a deep learning-based system to detect plant diseases and classify various types. They implemented five-fold cross-validation while training the dataset, which has 27 different classes. As a result, they obtained an average cross-validation accuracy of 99.58% and an average test accuracy of 99.199%. Ahila Priyadharshini et al. (2019) aimed to identify crop disease from maize leaf images via their proposed convolutional neural network (CNN). In fact, they modified LeNet and trained four different classes (three diseases and one healthy class) from the PlantVillage dataset. Azim et al. (2021) utilized decision trees, one of the machine learning algorithms, to detect three different rice leaf diseases from images. They manually extracted features from images such as color, shape, and texture. Their study achieved an accuracy of 86.58%. Sembiring et al. (2021) focused on detecting tomato leaf diseases, classified into nine different classes, via the CNN architectures Very Deep Convolutional Neural Networks (VGG), ShuffleNet, and SqueezeNet. They also used healthy leaves to distinguish themselves from the competition. In total, 10 different classes were utilized and classified with these architectures. Finally, the study obtained the highest accuracy of 97.15%. Zhang et al. (2017) proposed a novel approach to detecting cucumber leaf disease. Firstly, they segmented diseased regions by using K-means clustering, and then they extracted features such as shape and color from lesion information. Lastly, they classified leaf images to detect disease utilizing sparse representation (SR). At the end of the study, they obtained an accuracy of 85.7% with this approach. Sladojevic et al. (2016) created a model using the CNN algorithm to distinguish 13 different plant diseases from leaf images via Caffe. Their study achieved an average precision of 96.3%. Kan et al. (2017) investigated medicinal plants, which are essential in traditional Chinese medicine, via a support vector machine (SVM). Before the classification stage, image features such as shape and texture were extracted for each of the 12 different leaf types. When the features were classified via SVM, the application achieved an average accuracy of 93.3%. Koklu et al. (2022) performed grapevine leaf image classification using MobileNetv2, which is one of the pre-trained convolutional neural network architectures. Their dataset consists of five different classes and 500 grapevine leaf images. While they classify the dataset via MobileNetv2, they do not find it sufficient on its own and then combine it with SVM to obtain the best classification results. Prior to this merging, the Chi-Square feature selection method is applied, and then the most successful kernel is investigated. At the end of the study, they state that the best kernel is Cubic, with 250 features selected and an accuracy of 97.60%.
Dudi and Rajesh (2022) introduced a novel deep learning hybrid algorithm to identify leaf types. Their algorithm includes an enhanced CNN with optimization methods for activation functions and hidden neurons. Their proposed method is the Shark Smell-Based Whale Optimization Algorithm (SS-WOA). Besides, they tested this hybrid algorithm on untrained, collected leaf images and obtained an accuracy of 86%. In addition to these studies, Table 1 displays state-of-the-art studies on leaf image classification.

Methods

Dataset of Grapevine Leaves
Plants play an important role in the world (Wang et al., 2008). In nature, there are many species of plants, and their detection is difficult and time-consuming. The grapevine leaf is a special plant part that has different properties such as shape, thickness, featheriness, and slickness, and detecting it is quite hard.
The dataset includes grapevine leaf images with five classes: Ak, Ala Idris, Büzgülü, Dimnit, and Nazli. In total, there are 500 images, consisting of 100 images for each class. Besides, all images have RGB (red, green, and blue) format and 512x512x3 dimensions. This dataset was created by Koklu et al. (2022) and obtained from the website http://www.muratkoklu.com/datasets/Grapevine_Leaves_Image_Dataset.rar.
In this study, we did not manually resize the images during the preprocessing phase. However, each pre-trained architecture may accept a different input size; therefore, we perform the data augmentation process automatically to resize each image to the input size accepted by the corresponding pre-trained architecture. Figure 2 displays the classes of grapevine leaves.

Deep Feature Extractors
With the implosive advance in data and the fast development of algorithms such as machine learning and deep learning, artificial intelligence (AI) has achieved novelties in a wide range of applications (Shi et al., 2020). Notably, researchers prefer deep learning algorithms to analyze images due to their ability to extract features (Koklu et al., 2022, Saberi Anari, 2022, Ozaltin et al., 2022, Ozaltin and Yeniay, 2022, Özaltın and Yeniay, 2021). When desiring to classify any image using a machine learning algorithm, the features of the image are traditionally extracted manually through a process known as hand-crafting. Nonetheless, this is time-consuming and requires expert consideration. To analyze images from any field, it is difficult to locate specialists; consequently, results cannot be obtained rapidly. The feature extraction problem can now be handled by CNNs (Ozaltin et al., 2022).
In this study, we automatically extract features from grapevine images using the networks DarkNet-53, GoogleNet, InceptionResNet_v2, NASNetMobile, and ResNet-18. Table 2 exhibits the number of parameters, the layers, the input size, and the year in which each pre-trained architecture was developed. Furthermore, the next sub-sections provide a concise presentation of these architectures.

DarkNet-53
DarkNet-53, a convolutional neural network (CNN) developed by Redmon and Farhadi (2018), is the primary module for extracting features in order to identify objects within the YOLOv3 network (Pathak and Raju, 2021). The architecture comprises a total of 53 deep convolutional layers, and it is denoted DarkNet-53 due to this specific layer count. Indeed, there exist repetition blocks, resulting in a total number of layers amounting to 106. The architecture is designed to accommodate an image input with dimensions of 256x256. Table 3 presents comprehensive information regarding the architectures. Furthermore, it has been observed that DarkNet-53 exhibits superior performance in the context of classification and feature extraction within the scope of this investigation.

ResNet-18
The architecture known as ResNet-18, as described by He et al. (2016), consists of a total of 72 layers, 18 of them being deep layers. Moreover, it was developed in the year 2016. This architecture aims to efficiently provide a multitude of convolutional layers. The core principle of ResNet involves implementing skip connections, commonly referred to as shortcut connections. During this iterative process, the interconnection compresses the underlying structure, leading to accelerated learning within the network. The structure is recognized as a Directed Acyclic Graph (DAG) network due to its intricate layered configuration (Chandola et al., 2021).
These architectures have been employed for both classification and the generation of deep features. When they are utilized as classifiers, the Softmax layer is applied. However, when they are utilized as deep feature generators, the Softmax layer is absent, and deep features are typically acquired from the last layers. For example, DarkNet-53 yields a feature vector of dimension 1024. However, it is necessary to decrease the number of features in order to attain optimal performance. In the context of this study, our objective is to design innovative feature selection techniques.
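As a minimal illustration of how a pooling layer turns activations into such a feature vector, the sketch below applies global average pooling to a hypothetical activation tensor; the random numbers stand in for real network activations, and this is not the authors' MATLAB pipeline:

```python
import random

def global_average_pooling(activations):
    """Collapse an H x W x C activation tensor (nested lists) into a
    C-dimensional feature vector by averaging over the spatial grid."""
    h = len(activations)
    w = len(activations[0])
    c = len(activations[0][0])
    features = [0.0] * c
    for row in activations:
        for cell in row:
            for k in range(c):
                features[k] += cell[k]
    return [v / (h * w) for v in features]

# Hypothetical 8x8 spatial grid with 1024 channels, mimicking the 'gap'
# layer of DarkNet-53; real activations would come from the trained network.
random.seed(0)
acts = [[[random.random() for _ in range(1024)] for _ in range(8)] for _ in range(8)]
feature_vector = global_average_pooling(acts)
print(len(feature_vector))  # 1024, matching the DarkNet-53 feature dimension
```

Each image thus yields one 1024-dimensional row of the feature matrix, which is what the selection methods below operate on.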

A Novel Feature Selection Approach based on Sampling Theory
In this study, we aim to find solutions to these problems, with a special emphasis on those involving computational complexity. Therefore, we present novel feature selection methodologies (SRS, RSS, ERSS, and MERSS) based on sampling theory for improving the classification performance of grapevine leaves images. In SRS, the features are simply selected at random without ranking. In the next sub-sections, we introduce the remaining proposed methods and the overall algorithm process.

Ranked Set Sampling (RSS) Procedure for Feature Selection
We define the "RSS Procedure" for feature selection as:
1. Order the weights in descending order by an inexpensive method.
2. Select n sets of features, each of size n.
3. Measure accurately the first ordered feature from the first set, then the second ordered feature from the second set, and so on; the process continues until the n-th ordered feature from the last (n-th) set is measured.
Note that we can select Integer(sqrt(weight_size)) features with this procedure.
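The steps above can be sketched as follows. This is an illustrative Python re-implementation rather than the authors' code; the feature "weights" here are arbitrary importance scores, and the inexpensive ranking is simulated by sorting:

```python
import math

def rss_select(weights):
    """Ranked set sampling for feature selection (illustrative sketch).
    weights: list of (feature_index, weight) importance scores."""
    n = math.isqrt(len(weights))  # Integer(sqrt(weight_size)) features
    # Step 1: order the weights in descending order.
    ordered = sorted(weights, key=lambda fw: fw[1], reverse=True)
    selected = []
    for i in range(n):
        s = ordered[i * n:(i + 1) * n]   # Step 2: the i-th set of size n
        selected.append(s[i])            # Step 3: i-th ranked feature of the i-th set
    return [idx for idx, _ in selected]

scores = [(i, 1.0 / (i + 1)) for i in range(1024)]  # hypothetical weights
print(len(rss_select(scores)))  # 32 = Integer(sqrt(1024))
```

For a 1024-dimensional feature vector this size rule yields 32 selected features; the exact set size used in the paper may differ slightly.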

Extreme Ranked Set Sampling (ERSS) Procedure for Feature Selection
When the set size n is large, RSS may suffer from ranking errors. Several variations of RSS have been proposed by researchers to overcome this problem. The main idea of ERSS is that identifying the maximum rank is much easier than determining all of the ranks (Samawi et al., 2020). We can define a new procedure, the "ERSS Procedure" for feature selection, as:
1. Order the weights in descending order by an inexpensive method.
2. Select n sets of features, each of size n.
3. Measure accurately the maximum ordered feature from the first set, then the maximum ordered feature from the second set, and so on; the process continues until the maximum ordered feature from the last (n-th) set is measured.
Note that we can select the same feature size as with RSS. Unlike the classical ERSS defined by Samawi et al. (2020), our approach takes into account only the maximum values instead of the minimum ones.

Moving Extreme Ranked Set Sampling (MERSS) Procedure for Feature Selection
We define the "MERSS Procedure" for feature selection as:
1. Order the weights in descending order by an inexpensive method.
2. Select sets of increasing size, so that the i-th set contains i features.
3. Measure accurately the maximum ordered feature from the first set, then the maximum ordered feature from the second set, and so on; the process continues until the maximum ordered feature from the last n-th set is measured.
This modification of RSS, in addition to being easier to execute than both the usual RSS and fixed-size extreme RSS, keeps some of the balance inherent in the usual RSS. Hence, the MERSS algorithm exhibits superior efficiency compared to the other methods in the task of feature selection, resulting in significantly enhanced classification performance on grapevine leaves images.
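An illustrative Python sketch of the two extreme-rank variants follows, again using hypothetical importance scores and sorting in place of an inexpensive ranking method (not the authors' code):

```python
import math

def erss_select(weights):
    """Extreme ranked set sampling: take the maximum-ranked feature
    from each of n sets of size n (illustrative sketch)."""
    n = math.isqrt(len(weights))
    ordered = sorted(weights, key=lambda fw: fw[1], reverse=True)
    # With a descending order, the maximum of the i-th set is its first element.
    return [ordered[i * n][0] for i in range(n)]

def merss_select(weights):
    """Moving extreme ranked set sampling: the i-th set has i features,
    and the maximum of each set is measured (illustrative sketch)."""
    ordered = sorted(weights, key=lambda fw: fw[1], reverse=True)
    selected, start, size = [], 0, 1
    while start + size <= len(ordered):
        selected.append(ordered[start][0])  # maximum of the current set
        start += size
        size += 1
    return selected

scores = [(i, 1.0 / (i + 1)) for i in range(1024)]  # hypothetical weights
# With this size rule: 32 ERSS features and 44 MERSS features from 1024.
# The paper reports 36 and 45, so its exact set-size rule may differ slightly.
print(len(erss_select(scores)), len(merss_select(scores)))
```

Note that MERSS selects more features than RSS/ERSS from the same vector because its sets grow in size, which is consistent with the larger MERSS feature counts reported in the experiments.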
Using our proposed feature selection methods, we select significant features, which are subsequently classified with high efficiency using an ANN.

Classification via Artificial Neural Network (ANN)
Artificial neural networks (ANN), introduced by McCulloch and Pitts (1943), emerged from the study of brain functionality and subsequently found application in computer programs (Ozaltin et al., 2023a). Any ANN comprises numerous individual units, commonly referred to as neurons or processing elements (PE). These units are interconnected through weights, which form the neural structure of the network. Furthermore, these interconnected units are typically organized in layers to ensure proper coordination and functioning of the ANN (Agatonovic-Kustrin and Beresford, 2000).
ANN originates from a prosperous lineage of non-linear algorithms. When utilized in machine learning, particularly in supervised learning, it has exhibited considerable success in recent times. Additionally, artificial neural networks possess a flexible architecture that can be effectively employed to accommodate a wide range of real-world datasets (Jiang, 2021). One may refer to the book authored by H. Jiang (2021) for specific information.
Based on the aforementioned considerations, an ANN is implemented in this study for the efficient classification of grapevine leaves images, following the fundamental principles established by Ozaltin et al. (2023a). The chosen configuration has a hidden layer of one hundred neurons followed by a softmax layer, and a ReLU activation layer is employed to enhance the efficiency of the algorithm. The limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) quasi-Newton algorithm is selected as the training solver. Additionally, the maximum number of iterations is set to 1000, while the learning rate, minimum gradient tolerance, and loss tolerance are assigned values of 0.02, 1e-4, and 1e-5, respectively. By carefully choosing these parameters, we can achieve improved classification performance.
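The layer structure described above can be illustrated with a minimal forward pass: one hidden layer of 100 ReLU units followed by a softmax output over the five leaf classes. The weights below are random placeholders; in the actual study they would be learned with the L-BFGS solver, which this sketch does not implement:

```python
import math
import random

def relu(v):
    return [max(0.0, x) for x in v]

def softmax(v):
    m = max(v)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in v]
    s = sum(exps)
    return [e / s for e in exps]

def forward(x, w_hidden, w_out):
    """One hidden layer (100 ReLU units) followed by softmax over 5 classes."""
    hidden = relu([sum(wi * xi for wi, xi in zip(row, x)) for row in w_hidden])
    return softmax([sum(wi * hi for wi, hi in zip(row, hidden)) for row in w_out])

random.seed(1)
n_features, n_hidden, n_classes = 45, 100, 5  # e.g., 45 MERSS-selected features
w_hidden = [[random.gauss(0, 0.1) for _ in range(n_features)] for _ in range(n_hidden)]
w_out = [[random.gauss(0, 0.1) for _ in range(n_hidden)] for _ in range(n_classes)]
probs = forward([random.random() for _ in range(n_features)], w_hidden, w_out)
print(len(probs), round(sum(probs), 6))  # 5 class probabilities summing to 1
```

The softmax output assigns one probability per grapevine leaf class; the predicted class is simply the index of the largest probability.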

Performance Metrics
In this study, accuracy, area under the curve (AUC), F1-measure, geometric mean (G-mean), kappa value, and sensitivity are used to measure the algorithms' performance. Denoting true positives, true negatives, false positives, and false negatives by TP, TN, FP, and FN, the metrics are expressed in Equations (1)-(5) as follows:

Accuracy = (TP + TN) / (TP + TN + FP + FN) (1)
Sensitivity = TP / (TP + FN) (2)
Specificity = TN / (TN + FP) (3)
G-mean = sqrt(Sensitivity x Specificity) (4)
F1-measure = 2TP / (2TP + FP + FN) (5)

These performance metrics are widely employed to evaluate classifiers. In addition to these metrics, the kappa value (κ) is computed to judge whether the algorithms' performances are acceptable: if κ is close to 1, the results are nearly perfect; if κ is close to 0, the results are unacceptable (Wang et al., 2019). The observed agreement in Equation (6) and the expected agreement in Equation (7) are employed to evaluate κ as shown in Equation (8):

p_o = (TP + TN) / (TP + TN + FP + FN) (6)
p_e = [(TP + FN)(TP + FP) + (TN + FP)(TN + FN)] / (TP + TN + FP + FN)^2 (7)
κ = (p_o - p_e) / (1 - p_e) (8)
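As a concrete check of the kappa computation in Equations (6)-(8), the sketch below evaluates κ for made-up binary confusion counts (the counts are illustrative, not results from this study):

```python
def cohen_kappa(tp, tn, fp, fn):
    """Cohen's kappa from binary confusion counts (Equations 6-8)."""
    total = tp + tn + fp + fn
    p_o = (tp + tn) / total                                             # observed agreement
    p_e = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / total ** 2  # expected agreement
    return (p_o - p_e) / (1 - p_e)

print(round(cohen_kappa(45, 40, 5, 10), 4))  # 0.7 for these made-up counts
```

Here p_o = 0.85 and p_e = 0.5, giving κ = 0.35 / 0.5 = 0.7, which would be read as substantial but not near-perfect agreement.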

Cross Validation
Deep learning algorithms cannot be expressed coherently with explicit mathematical models, resulting in a lack of clarity regarding how inputs are transformed into outputs; k-fold cross-validation is therefore widely used to obtain reliable performance estimates (Saber et al., 2021). In this procedure, the dataset is randomly partitioned into a set of subsets. One of these subsets is designated as the test set, while the remaining subsets are utilized for training the structure (Koklu and Ozkan, 2020). The algorithm is iterated for k folds and evaluated using the framework proposed by Arlot and Celisse (2010). In this study, the variable k is assigned a value of 5 to ensure reliable outcomes in the classification process.
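The partitioning described above can be sketched as follows; this handles only the index bookkeeping (model training is omitted), and the 500-image count matches the grapevine dataset:

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Randomly partition sample indices into k folds; yield (train, test)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

# 500 grapevine leaf images, k = 5 as in this study
for fold, (train, test) in enumerate(k_fold_indices(500)):
    print(fold, len(train), len(test))  # each fold: 400 train, 100 test
```

Every image appears in exactly one test fold, so the five test accuracies together cover the whole dataset.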

Fine Tuning Parameters
In this study, the determination of fine-tuning parameters is crucial for attaining optimal outcomes and ensuring fair comparisons among DarkNet-53, GoogleNet, NASNetMobile, InceptionResNet_v2, and ResNet-18. The identified fine-tuning parameters are as follows: stochastic gradient descent with momentum (sgdm) is used as the optimizer, the learning rate is set to 0.0001, the maximum number of epochs is 10, the mini-batch size is 8, and a constant learning rate schedule is utilized.
Following the description of the methods and parameters, the pseudocode of the entire algorithm is given below.

Algorithm
The procedure of this study to detect grapevine leaf image types is as follows: 15. Find the best structure according to the performance metrics.

Experimental Results
The investigation has been conducted using MATLAB 2021b. The primary objective of this study is to propose a new method for selecting features and to identify grapevine leaf types using the developed hybrid algorithms. First, the dataset is acquired from the public website indicated in the dataset section. The models DarkNet-53, GoogleNet, NASNetMobile, InceptionResNet_v2, and ResNet-18 are utilized as automatic feature extractors using a 5-fold cross-validation and transfer learning approach, each model being evaluated individually. In the subsequent procedure, we derive feature weights from the final layer of the relevant architecture. The features are then chosen using the recommended techniques: ERSS, MERSS, RSS, and SRS. In the meantime, we employ an ANN to classify all the selected features. Table 4 demonstrates the performance of these pre-trained architectures in classifying grapevine leaf images through 5-fold cross-validation. Despite the satisfactory nature of these performances, we employ the feature selection techniques and a machine learning algorithm to ascertain the credibility of the research findings.

In the current phase of this study, we have partitioned the dataset into a 70% training subset and a 30% testing subset subsequent to training the dataset using the relevant architecture. We then extract features from the last layer (which varies depending on the specific architecture) and employ our proposed methodologies, ERSS, MERSS, RSS, and SRS, to select the desired features. The classification stage is carried out using an ANN, and we have evaluated all hybrid structures. The outcomes of these evaluations are documented in Tables 5-9.
When extracting features from DarkNet-53, the global average pooling layer, commonly referred to as 'gap', proves to be beneficial. A total of 1024 features are acquired from this layer, and a subset of 36 features is selected using the ERSS, RSS, and SRS techniques. Furthermore, a total of 45 significant features are determined using the MERSS method. The output is presented in Table 5. Based on the findings presented in Table 5, the application of our proposed methodologies yields efficient results. It is worth noting that the reported performances are test results. These results exhibit a high level of confidence in accurately identifying different types of grapevine leaf. The DarkNet53-MERSS-ANN algorithm has achieved the highest test accuracy of 97.33% along with the other metrics. Additionally, the kappa value approaches 1, indicating that the algorithm can be considered highly successful. Furthermore, the performance is enhanced with the proposed methodology.
In the previous stage, the DarkNet-53 model demonstrated a classification accuracy of 96.20%. Furthermore, the algorithm employed in this study yields the highest test accuracy when tasked with classifying grapevine leaf images. Additionally, Fig. 3 depicts the confusion matrix obtained from the DarkNet53-MERSS-ANN algorithm.
The next model in consideration is GoogleNet. When features are extracted from the GoogleNet model, the average pooling layer named 'pool5-7x7_s1' is observed to be useful. A total of 1024 features are extracted from this layer, and a selection process is subsequently performed using the ERSS, RSS, and SRS techniques to identify the most significant 36 features. Furthermore, a total of 45 significant features are identified using the MERSS method. The output of the computation is displayed in Table 6. In addition, the kappa value is close to 1, which indicates that the algorithm is quite successful. Moreover, the performance is enhanced by the suggested method. Previously, the accuracy of GoogleNet was 86%.
The next model is NASNetMobile. To collect features from NASNetMobile, it is possible to employ the global average pooling layer denoted 'global_average_pooling2d_1'. A total of 1056 features are extracted from this layer, and a subset of 36 features is selected using the ERSS, RSS, and SRS methods. Additionally, the MERSS method is utilized to identify a total of 45 crucial features. The outcomes are presented in Table 7. ResNet-18 is the last model. When features are taken from it, the 'pool5' average pooling layer is discovered to be useful. In essence, 512 features are collected from this layer, and the most significant 24 features are chosen using the ERSS, RSS, and SRS methods; 32 significant features are also found using the MERSS method. Table 9 displays the complete results. Based on the findings presented in Table 10, it can be observed that DarkNet53-NCA-ANN emerges as the most effective NCA-based combination, exhibiting a notable accuracy rate of 96.67%. In addition, the sensitivity, G-Mean, F-measure, kappa value, and AUC of the system were determined to be 96.67%, 97.90%, 0.9664, 0.8958, and 1.00, respectively. The confusion matrix of the DarkNet53-NCA-ANN model is depicted in Fig. 4.
The DarkNet-53 architecture is employed as a deep feature extractor for grapevine leaf images. To enhance the feature extraction process, the most effective feature selection method, MERSS, is utilized. Through the application of MERSS on the extracted features, a notable accuracy rate of 97.33% is achieved. In contrast to NCA, MERSS demonstrates superior performance. Hence, the feature selection method we have developed exhibits the highest level of performance.

GoogleNet is used as a deep feature extractor for grapevine leaf images. The feature selection method that yields the most successful results is RSS, which, applied to the extracted features, achieves an accuracy of 95.33%. NCA exhibits inferior performance in comparison to RSS; hence, with GoogleNet as the extractor, our method again proves to be the optimal choice.

InceptionResNet_v2 is implemented as a deep feature extractor for grapevine leaf images. The successful feature selection methods are MERSS and ERSS, which operate on the features and achieve an accuracy of 92.67%. NCA performs worse than MERSS and ERSS, so with InceptionResNet_v2 our method is again the best one to use.

NASNetMobile operates as a deep feature extractor for grapevine leaf images. The top feature selection method is MERSS, which, applied to the extracted features, results in an accuracy of 79.33%. Compared with NCA, MERSS exhibits better performance. In conclusion, when utilizing NASNetMobile, our solution proves to be the most effective once again.

ResNet-18 is utilized as a deep feature extractor for grapevine leaf images. The feature selection method employed is MERSS, which, applied to the extracted features, results in an accuracy of 85.33%. In this case, MERSS has lower performance than NCA. Based on the overall results, however, it can be concluded that MERSS performs well after implementing deep feature extractors and, as mentioned earlier, is superior to NCA.

Discussion
Some advantages and disadvantages of the study are discussed in this section. The primary benefits of the study are as follows: (i) Extensive comparisons are made using pre-trained architectures such as DarkNet-53, GoogleNet, NASNetMobile, InceptionResNet_v2, and ResNet-18.
(ii) To achieve reliable results, each architecture uses 5-fold cross-validation, and all results are measured using accuracy, sensitivity, G-Mean, F1-measure, kappa value, and AUC.
(iii) To improve classification performance on grapevine leaf images, pre-trained architectures are used as automatic deep feature extractors from the final layers (pooling layers are used). This is critical because expert opinions are not required.
(iv) To reduce dimensions and select significant features, we propose novel sampling theory-based methods that ensure reliable study results.
(v) Finally, ANN is an excellent classification tool for detecting different types of grapevine leaf images. In addition, to evaluate the proposed methods, we compared them to NCA, a widely used feature selection method. Our proposed method outperforms NCA.
The study's drawbacks are as follows: (i) Grapevine leaf images are limited in number and only investigated as a balanced dataset.

Conclusion
Deep learning implementations are improving continuously, and this progress is particularly visible in applications involving vegetation. With this viewpoint, we have successfully extracted features from images of grapevine leaves in order to categorize the species of these leaves using pre-trained architectures. First, using 5-fold cross-validation, DarkNet-53, GoogleNet, InceptionResNet_v2, NASNetMobile, and ResNet-18 are used to directly classify images of grapevine leaves. The accuracy of DarkNet-53 in this part of the study is 96.20%. Although the results are excellent, we investigate how they can be improved and offer new feature selection techniques based on sampling theory. Pre-trained architectures are then used as feature extractors, with their final average pooling layers automatically extracting features from images. Not all of these features, though, carry crucial information about the images. We recommend four feature selection methods, SRS, RSS, ERSS, and MERSS, to choose significant features. Additionally, using this methodology, we determine how many features should be chosen, which allows effective classification with a minimal set of features. An ANN is then used to classify these features. In brief, the following outcomes are attained: when DarkNet-53 is used as a deep feature extractor, the MERSS feature selection method and ANN classifier yield a maximum accuracy of 97.33%. When GoogleNet is used as a deep feature extractor, the RSS feature selection method and ANN classifier achieve the highest accuracy, 95.33%. An accuracy of 92.67% is attained when InceptionResNet_v2 is used as a deep feature extractor with the ERSS and MERSS feature selection methods and the ANN classifier. The accuracy is 79.33% when NASNetMobile is used as a deep feature extractor in combination with the MERSS feature selection method and ANN classifier. When ResNet-18 is implemented as a deep feature extractor with the MERSS feature selection method and ANN classifier, an accuracy of 85.33% is obtained. Finally, a comparison with NCA has been made, and we can state that our suggested methods, specifically the MERSS method, are superior to it under comparable circumstances. As a result, performance is effectively improved by the proposed hybrid algorithms, and the results are reliable. Using DarkNet53-MERSS-ANN, the study's best performance is achieved, with an accuracy of 97.33% when identifying different grapevine leaf types from images. In conclusion, the structure we created performs superbly.
What comes next?
This study demonstrates how pre-trained architectures can identify different plant species from images.
Image classification of plants thus provides automatic species identification for experts, farmers, and researchers. Moreover, thanks to the suggested algorithm, people need not spend much additional time identifying plant species. We can say that the created structure is capable of performing plant detection and of being advanced further. In future work, we will apply the suggested feature selection techniques to a variety of datasets from various fields.

Funding
No funding was received for this study.

Figure 1. Pipeline of this study.
Figure 2. Grapevine leaves classes.
Tiwari et al. (2021) developed a deep learning-based system to detect plant diseases and classify various types. They implemented five-fold cross-validation while training the dataset, which has 27 different classes, and obtained an average cross-validation accuracy of 99.58% and an average test accuracy of 99.199%. Ahila Priyadharshini et al. (2019) aimed to identify crop disease from maize leaf images via their proposed convolutional neural network (CNN). In fact, they modified LeNet and trained it on four different classes, three disease classes and one healthy class, from the PlantVillage dataset. Azim et al. (2021) utilized decision trees, one of the machine learning algorithms, to detect three different rice leaf diseases from images. They manually extracted features from the images such as color, shape, and texture, and their study achieved an accuracy of 86.58%. Sembiring et al. (2021) focused on detecting tomato leaf diseases, classified into nine different classes, via CNN architectures: Very Deep Convolutional Neural Networks (VGG), ShuffleNet, and SqueezeNet.

Figures

Table 1
State-of-the-art studies on leaf images in the literature.

Table 2 shows the size of each architecture, and Figure 2 displays the classes of grapevine leaves.

2.2 Deep Feature Extractors
With the explosive advance in data and the fast development of algorithms such as machine learning and deep learning, artificial intelligence (AI) has achieved novelties in a wide range of applications (Shi et al., 2020). Notably, researchers prefer deep learning algorithms to analyze images due to their ability to extract features (Koklu et al., 2022, Saberi Anari, 2022, Ozaltin et al., 2022, Ozaltin and Yeniay, 2022, Özaltın and Yeniay, 2021).

Table 2
Properties of pre-trained architectures.
*: NASNetMobile does not possess a linear sequence of modules.

Table 3
The GoogLeNet architecture was proposed by Szegedy et al. (2015). The architecture exhibits a multitude of layers, including two convolutional layers, four max-pooling layers, nine inception layers, a global average pooling layer, a dropout layer, a linear layer, and a softmax layer (Addagarla et al., 2020). GoogLeNet is composed of a total of 22 layers, which are deeper in nature. It effectively employs activation layers using the Rectified Linear Unit (ReLU) function and consists of a total of 7 million parameters. The NASNetMobile architecture, developed by Zoph et al. (2018), aims to explore optimal CNN structures using reinforcement learning methods. The team from Google Brain, as presented by Addagarla et al. (2020), has made significant advancements in the field of Neural Architecture Search (NAS). While NAS architectures exhibit variations in their sizes, NASNetMobile represents a scaled-down iteration. Its parameter count is approximately 4.5 million, and the accepted input image size is 224x224 pixels.

Simple random sampling (SRS) is a very common sampling design used by many researchers because of its practicality. New sampling designs have also been suggested in the literature to improve precision. One of these designs, ranked set sampling (RSS), was first proposed by McIntyre (1952) as an alternative to SRS. When we compare both sampling designs for the same sample size, RSS becomes more efficient than SRS as long as a more accurate and accessible ranking criterion is available for increasing grapevine leaf classification performance. For a detailed literature review, see Zamanzade and Mahdizadeh (2020), Bouza-Herrera and Al-Omari (2018), and Koyuncu and Al-Omari (2021). In the big data literature, there are many important studies that use sampling designs to reduce computational complexity, handle imbalanced data, and increase precision (Djouzi et al., 2022, Rendon et al., 2020). In this study, following these ranked set sampling designs, we have proposed a new procedure for selecting feature weights.
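As a sketch of how an RSS-style selection over features might look, assuming each feature carries a ranking score (for instance its variance across images; the paper's exact weighting scheme is not reproduced here):

```python
import numpy as np

def rss_select(scores, m, rng=None):
    """Ranked set sampling over feature indices.

    Draw m random sets of m candidate features each; within set i (1-based),
    keep the feature whose score has rank i. RSS can be more efficient than
    SRS when the ranking criterion correlates with feature importance.
    """
    rng = np.random.default_rng(rng)
    n = len(scores)
    chosen = []
    for i in range(m):
        candidates = rng.choice(n, size=m, replace=False)
        ranked = candidates[np.argsort(scores[candidates])]
        chosen.append(int(ranked[i]))   # i-th order statistic of the i-th set
    return chosen
```

A typical call would be `rss_select(features.var(axis=0), 53)`, where `features` is the image-by-feature matrix from the average pooling layer and 53 is an assumed subset size.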
2.3.2 Ranked Set Sampling (RSS) Procedure for Feature Selection
Following McIntyre (1952), we can define a new procedure for feature selection.

2.3.4 Moving Extreme Ranked Set Sampling (MERSS) Procedure for Feature Selection
Another modification of RSS, namely Moving Extreme Ranked Set Sampling (MERSS), was introduced by Al-Odat and Al-Saleh (2001). Following Al-Odat and Al-Saleh (2001), we have suggested the following procedure to select feature weights.

Cross-validation is widely used to obtain reliable outcomes (Subasi, 2012, Lopez-Del Rio et al., 2019, Saber et al., 2021, Ozaltin et al., 2022). Moreover, this technique effectively mitigates the problem of overfitting during data analysis (Gao et al., 2018), even when the algorithm being validated is a black box. The k-fold cross-validation method is commonly preferred by researchers.
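A minimal sketch of the MERSS idea applied to feature indices: set sizes grow from 1 to m and only the extreme of each set is kept, so full within-set ranking is never needed. Using maxima as the extreme and a generic score array are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def merss_select(scores, m, rng=None):
    """Moving extreme ranked set sampling over feature indices.

    For i = 1..m, draw a random set of i candidate features and keep the
    one with the maximum score, in the spirit of Al-Odat and Al-Saleh (2001):
    only the extreme of each progressively larger set must be identified.
    """
    rng = np.random.default_rng(rng)
    n = len(scores)
    chosen = []
    for i in range(1, m + 1):
        candidates = rng.choice(n, size=i, replace=False)
        chosen.append(int(candidates[np.argmax(scores[candidates])]))
    return chosen
```

Because ranking a set only to find its extreme is easy, MERSS stays practical even when a full ranking criterion is costly.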

Table 4
Performance of pre-trained architectures on grapevine leaf images using 5-fold cross validation.

Table 5
Hybrid algorithm performance with DarkNet53, suggested methods, and ANN.

Table 6
Hybrid algorithm performance with GoogleNet, suggested methods, and ANN.

Table 6 indicates that the GoogleNet-RSS-ANN algorithm, with a test accuracy of 95.33%, is the most accurate of the methods we have proposed. The sensitivity, G-Mean, F-measure, kappa value, and AUC are 95.33%, 97.07%, 0.9533, 0.8542, and 0.9946, respectively.
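For reference, the sensitivity (per-class recall) and G-Mean reported above can be computed from a confusion matrix as follows; this is a generic sketch, not the paper's evaluation code.

```python
import numpy as np

def class_recalls(cm):
    """Per-class recall (sensitivity) from a confusion matrix, rows = true labels."""
    return np.diag(cm) / cm.sum(axis=1)

def g_mean(cm):
    """G-Mean: geometric mean of the per-class recalls."""
    r = class_recalls(cm)
    return float(r.prod() ** (1.0 / r.size))
```

For example, a two-class confusion matrix `[[9, 1], [2, 8]]` has recalls 0.9 and 0.8, giving a G-Mean of sqrt(0.72), roughly 0.849.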

Table 7
Hybrid algorithm performance via NasNetMobile, suggested methods, and ANN.

The next architecture is InceptionResNet_v2. When features are obtained from it, the global average pooling layer known as 'avg-pool' proves useful. Essentially, 1536 features are extracted from this layer, from which 53 significant features are selected using the ERSS, RSS, and SRS methods, while the MERSS method identifies 55 essential features. Table 8 displays all of the results.

Table 8
Hybrid algorithm performance using InceptionResNet, suggested methods, and ANN.

Table 9
Hybrid algorithm performance using ResNet-18, suggested methods, and ANN.

In recent years, numerous researchers have employed various feature selection methods in their studies (Chandrashekar and Sahin, 2014, Ozaltin et al., 2022, Koklu et al., 2022). However, identifying a suitable feature selection method is not straightforward, as certain methods rely on underlying assumptions. This study proposes several practical methods and compares them with Neighborhood Component Analysis (NCA), a non-parametric method that operates without making any assumptions. The results of applying various combinations of NCA are presented in Table 10.

Table 10
Performance of hybrid algorithm based on NCA and ANN combination.
The data set is available from the website: http://www.muratkoklu.com/datasets/Grapevine_Leaves_Image_Dataset.rar

Declarations
Conflict of interests
Oznur Ozaltin and Nursel Koyuncu declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Compliance with ethical standards
Our manuscript is original and has not been previously published, nor is it currently under review at any other journal. We have complied with all ethical guidelines for conducting research and have obtained all necessary approvals for the study.