1 Introduction

Surfaces produced by different machining methods exhibit distinct textures. These textures are made up of repeated patterns whose attributes vary regularly. Surface texture identification is used in classification (categorization) as well as in industrial inspection. Additionally, from an engineering analysis standpoint, automatic identification of the machining method from the surface texture gives the designer access to crucial production data, viz., tool geometry and kinematics, and supports the detection of machining anomalies. Categorization is the procedure of generating a decision process from a set of known cases (labeled data). Based on training data, the learning algorithm generates a function f: X\(\to\)Y (where X and Y are the input and output spaces, respectively) that maps new, unseen data from the input to the output domain [1].

Image processing and machine vision-based studies are gaining a lot of attention from both academia and industry in various fields, viz., tool condition monitoring [2], surface defect identification [3], machine tool fault diagnosis [4], evaluation of tool wear and surface roughness [5] and chatter identification [6]. Machine vision-centered fault prognosis and condition monitoring are data-driven practices widely used in unmanned machining and 3D printing, ensuring periodic maintenance and operational reliability in an industrial management setup. Because of the availability of low-cost instruments for data collection [7], as well as advances in image processing, feature extraction [8, 9] and machine vision [10, 11], computer vision-centered condition monitoring has emerged as the best-suited method for on-machine surveillance. Surface texture identification through image processing and machine learning (ML)-based techniques has helped to monitor the machine tool and the machining process.

In the present age of big data, characterized by enormous data sets and high dimensionality, extracting usable information from vast amounts of data is a major hurdle. High-dimensional data sets contain many irrelevant and redundant features, which makes data processing, knowledge mining, and pattern categorization more challenging. In such cases, the best strategy is to apply dimensionality reduction techniques (DRT) to bring data down from the original high-dimensional space to the low-dimensional intrinsic space by filtering out noise and redundant information [12]. DRT can also promote linear separability by mapping the original high-dimensional feature space onto a reduced, lower-dimensional feature space. Therefore, dimensionality reduction is a helpful, important, and necessary strategy for trimming, distinguishing, and representing datasets by turning high-dimensional datasets into lower-dimensional ones [13]. In ML, the amount of data needed to perform a credible analysis grows exponentially as the dimensionality of the data increases; Bellman coined the phrase "curse of dimensionality" for this issue in the context of dynamic optimization [14]. The issue is best addressed by identifying appropriate techniques to reduce the number of features used in the analysis. In this regard, the two frequently employed methods are:

  1. Feature selection

  2. Feature extraction

Feature selection is a prominent method used in ML to reduce data dimensionality. It identifies significant features in the dataset and removes those that are redundant [15, 16]. The three strong points in favor of the feature selection technique are that it alleviates the curse of dimensionality, quickens the learning process, and reduces the storage space [17]. Conversely, the feature extraction technique typically relies on feature transformation, which projects high-dimensional data into a low-dimensional subspace. Feature extraction helps in uncovering the latent structure of the original data without changing the relative distances between the extracted features, thereby avoiding any significant data loss [12]. Principal component analysis (PCA) is a commonly used DRT based on feature extraction.

Image textures are represented using texture features, which are extracted using texture operators. The gray level co-occurrence matrix (GLCM) is one such texture operator. GLCM is a second-order statistical texture descriptor used to identify the surface texture or microstructure. GLCM is an important tool owing to its ease of calculation and use, and its ability to capture coarseness, roughness and orientation in a single computation [18, 19]. Further details of GLCM and its calculation are given in Sect. 3 of this article.

In this paper, GLCM-based texture features are proposed for analyzing machined surfaces and classifying them using different ML-based models. In the present work, 21 GLCM-based image features defined for texture analysis are extracted. The approach uses an image pre-processing stage comprising grayscale conversion, resizing, contrast enhancement, histogram equalization, and partitioning of each image into sixteen divisions. The partitioning increases the number of images and improves the generalization of the ML models. Further, different distance-based feature selection methods are investigated to identify the most sensitive features, which are used as inputs to the ML-based classifiers. The developed model thus effectively identifies and classifies machined surface images. The contributions of this paper are:

  • Synergizing image processing, GLCM feature extraction, dimensionality reduction and ML-based classification to identify the surface texture of machined surface images.

  • Selecting sensitive features from the images that form the most effective set of inputs to the ML classifiers, thereby reducing the computational complexity.

  • Evaluating the effectiveness of the threshold criterion function (TCF) in the DRT.

Figure 1 shows the scheme of the customized program developed to achieve the objectives of this paper. The image pre-processing steps are implemented in a customized MATLAB program, the dimensionality reduction in Microsoft Excel, and the ML-based classifiers in Python.

Fig. 1
figure 1

The overall methodology adopted in the current work

The rest of the paper is organized as follows: Sect. 2 provides a brief literature review on machined surface texture analysis and DRT. Section 3 introduces GLCM and feature extraction from it. The distance-based feature selection methods and the TCF used to select the most sensitive features are detailed in Sect. 4. The ML models used for classification are discussed in Sect. 5. The image data set and the image pre-processing steps are presented in Sect. 6, and Sect. 7 covers the selection of sensitive features. Section 8 presents the ML-based classification results. Finally, Sect. 9 draws the conclusions of this work.

2 Literature Survey

2.1 Machined Surface Texture Analysis

Texture is a recurring pattern consisting of local statistics that vary slowly. Image texture, in turn, is a function of the spatial variation in pixel intensities (gray values). Tuceryan and Jain [20] highlighted the usefulness of image texture, which has been extensively employed in various domains and has been extended to the field of machining to analyze machined surfaces.

In this regard, the past research work is summarized below. Dave et al. [21] performed categorization of machined surfaces by partitioning images into sixteen non-overlapping, equal sub-images. Wavelet-based features were extracted from the sub-images up to the fourth level of decomposition and fed to a logistic regression classifier. The classifier was able to categorize the machined surfaces with a training accuracy of 100% and a testing accuracy of 92.3%. Patel et al. [22] used GLCM-based features to classify machined surface images. With tenfold cross-validation, the artificial neural network (ANN) and random forest (RAF) models gave better classification results than the other classification models. Ashour et al. [1] used the PCA technique to reduce histogram-based feature vectors of images that were later given as inputs to various classifiers to classify the machined surfaces. The support vector machine (SVM) with a linear kernel outperformed all other SVM kernels as well as the ANN classifier. Manish et al. [23] used Canny edge detection and histogram analysis-based image processing techniques to inspect the surface finish and surface flaws of ground surfaces. Penumuru et al. [24] worked on identifying and classifying materials using ML and machine vision in the context of Industry 4.0. The red-green-blue (RGB) color components were used as input features to the ML models. The SVM model produced better results, and the authors recommended that the developed model could be deployed on robots commissioned in an intelligent manufacturing setup. John and Arunachalam [25] suggested that illumination compensation was necessary to suppress illumination variation and achieve consistency and accuracy in surface texture analysis. A similar study in [26] observed that measured and predicted surface roughness values correlated well under blue light illumination; an ANN model was used to predict the surface roughness values of 3D printed parts using GLCM features. Boudiaf et al. [27] worked on surface defect detection of steel strips using deep learning and SVM. The proposed model was able to identify the defects with an accuracy of 99.70%. A summary of the above literature is provided in Table 1.

Table 1 Summary of the literature on surface texture identification

2.2 Dimensional Reduction Techniques (DRT)

Distance-based feature selection methods use a statistical approach to identify the most important features and thus discriminate between sensitive and non-sensitive features for classification [28, 29]. The past literature reports many distance-based statistical measures that quantify the amount of dissimilarity between two probability distributions.

This section reviews the literature on distance-based DRT. To track the manufacturing process, Shao et al. [30] developed a novel feature selection method that combines forward feature selection with k-fold cross-validation and Fisher's discriminant ratio (FDR). The proposed method successfully picked the most relevant features and statistical process control limits. Chen et al. [31] applied Fisher's discriminant analysis (FDA) to identify the most sensitive features for chatter detection from multi-channel vibration signals; the suggested feature ranking was significant and increased the accuracy of the computational models. Sun et al. [32] suggested a feature selection technique wherein Fisher's ratio is integrated with a novel clustering overlap measure while classifying ultrasonically welded carbon fiber reinforced polymer composites based on weld attributes, joint performance and welding process parameters. The suggested method produced better classification results than those obtained using linear discriminant analysis or Fisher's ratio alone. On the other hand, Shao et al. [33] used Fisher's ratio to rank features extracted from the space and frequency domains of cross-sectional profiles on tool surfaces. The ranked features were classified using a quadratic classifier and were used to monitor tool wear in ultrasonic welding of lithium-ion batteries. In another instance, Bhat et al. [34] used Fisher's ratio to pick the prominent characteristics from a huge set of features extracted using GLCM, and those characteristics were used to classify the tool wear state using a hidden Markov model. The suggested model was worthwhile, with an average accuracy of 95%. Vijay et al. [35] compared DRTs such as PCA, Fisher's criterion (FC) and the separation index (SI) using a radial basis function neural network for bearing condition monitoring. The vibration signals generated from bearings under different conditions were denoised, and time-domain and frequency-domain features were extracted. The FC and SI techniques successfully improved the performance of the developed model by reducing the dimensionality of the feature vector to one-sixth. Zhu et al. [36] classified tropical forest from multi-temporal Sentinel-2 images of Hainan Island; feature selection using a weighted separation index and classification using RAF with an average accuracy of 93% showed the usefulness of the developed technique.

Momenzadeh et al. [37] evaluated different feature selection techniques, including the Bhattacharyya distance, to classify gene expression data. The authors concluded that an effective feature selection method resulted in robust classification from the hidden Markov model. Jiang and Zhu [38] proposed a dynamic Bhattacharyya bound-based feature selection technique for fault classification. The proposed method showed better performance compared to FDA, dynamic FDA and locality-preserving FDA. Shah and Dang [39] worked on automatic modulation classification using a Bhattacharyya distance (BD)-based feature selection method and an ANN classifier. The authors reported that the proposed feature selection method, for three different neural network architectures, showed superior performance with reduced computational complexity when compared to PCA. Sivakkolundu and Kavitha [40] proposed a Bhattacharyya coefficient-based feature selection technique for handling huge data from a warehouse; the BD-based attribute selection method efficiently dealt with big data with a minimal false positive rate and time consumption. Buchaiah and Shakya [41] used BD combined with SVM for fault diagnosis of bearings using 2D fault features extracted from vibration signals. The proposed methodology was able to classify single and combined bearing defects successfully, and the authors recommended using the same methodology in fault monitoring of shafts, gear motors, etc. Table 2 summarizes the above-cited literature.

Table 2 Summary of the literature on distance-based DRT

3 GLCM

The GLCM-based technique for image texture analysis was put forth by Haralick et al. [42] in 1973. Each element of the GLCM represents the number of co-occurrences of the corresponding pixel pair at a particular spacing (s) and direction \(\left( \theta \right)\) in an image matrix, and it can be written as (Eq. 1):

$${\text{GLCM}}\left( {i,j} \right)_{\theta } = \left| {\left\{ {\left( {p_{1} ,p_{2} } \right) \mid I\left( {p_{1} } \right) = i,\;I\left( {p_{2} } \right) = j} \right\}} \right|$$
(1)

where p1 and p2 represent the position of pixels in the image matrix M × N as shown in Fig. 2a and the construction of the GLCM matrix from the image matrix is depicted in Fig. 2b.

Fig. 2
figure 2

Construction of GLCM matrix (a) Image matrix (b) GLCM matrix

The image matrix (Fig. 2a) consists of a set of pixel values, which indicate the intensity of the gray level present in an image. In the image matrix, the intensity values eight and seven co-occur two times along the 0° direction with a spacing of one (highlighted with an ellipse, Fig. 2a). Thus, the value two in the GLCM (highlighted with a circle, Fig. 2b) in the third row and the second column indicates that the pixel value eight occurred two times together with the pixel value seven at a pixel spacing of one and a pixel pair direction of 0°. In the GLCM, i and j denote the gray level values of an image pixel. The rest of the GLCM is constructed in the same manner [43, 44]. Using the GLCM, twenty-one texture features can be extracted; they are given in Table 3. These features have been used by many researchers [42, 45, 46] for different applications, and those works can be referred to for more information. In the expressions of Table 3, P(i, j) is the (i, j)th entry of the normalized GLCM, µx and µy are the means of the rows and columns of the GLCM, \(\sigma_{x}\) and \(\sigma_{y}\) are the corresponding standard deviations, and G is the number of distinct gray levels in the quantized image matrix.

Table 3 GLCM features
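As a concrete illustration of the above construction, the following minimal Python sketch (not the authors' MATLAB implementation) builds a GLCM with scikit-image and extracts a handful of texture features. The built-in graycoprops covers only a subset of the 21 features in Table 3; the remaining ones (entropy is shown as an example) can be computed directly from the normalized matrix.

```python
# Minimal sketch, assuming scikit-image >= 0.19; function and variable names
# are illustrative and not taken from the authors' code.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_img, distance=1, angle=0.0, levels=256):
    """Return a small dictionary of GLCM features for one grayscale image."""
    glcm = graycomatrix(gray_img, distances=[distance], angles=[angle],
                        levels=levels, symmetric=True, normed=True)
    feats = {prop: graycoprops(glcm, prop)[0, 0]
             for prop in ("contrast", "dissimilarity", "homogeneity",
                          "energy", "correlation", "ASM")}
    # Entropy computed manually from the normalized matrix P(i, j)
    p = glcm[:, :, 0, 0]
    feats["entropy"] = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    return feats

# Example with a random 8-bit "image" standing in for a 125 x 125 sub-image
img = np.random.randint(0, 256, size=(125, 125), dtype=np.uint8)
print(glcm_features(img))
```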

4 Feature Selection using Distance-Based Methods

The computation speed and generalization ability of an ML classification model improve when the feature vector space has lower dimensionality. The extracted feature matrix might contain correlated features that form similar or repeated inputs to the ML classifiers used for categorizing the machined surface images considered in the present research. Conversely, a unique feature input contains features that are uncorrelated. Also, some of the features might not form sensitive inputs to the classifier. A feature is said to be sensitive if the average distance between the feature classes is large while the average distance within each class is small. Thus, it is essential to select uncorrelated, sensitive features so that the classifier remains compact, viz., the dimensionality of the classifier input is reduced. Figure 3 shows the difference between 'class separability' and 'classification separability.' The eigenvalues along principal axis 2 are larger than those along principal axis 1. However, when the classes are mapped onto axis 2, the probability density functions overlap to a large extent. A small distance between the mean lines and large standard deviations make the feature measured along axis 2 unsuitable for classification purposes. If, instead, the classes are mapped onto axis 1, the probability density functions are well separated, with a large distance between their means and small standard deviations, making the feature measured along axis 1 the most suitable for classification. Thus, a distance measure is necessary to obtain a quantitative assessment of the separability of two classes. In this regard, three distance-based feature selection methods are explained in the following subsections.

Fig. 3
figure 3

Feature extraction for Classification

4.1 Fisher's Criterion (FC)

The FC relies on calculating the separation distance between the two categories of interest [48]. This separation distance depends on the means and standard deviations of the two categories and is mathematically expressed in Eq. 2, as given by Yen and Lin [49].

$$F_{k}^{a,b} = \frac{{\left| {Mean\left( {t_{k}^{a} } \right) - Mean\left( {t_{k}^{b} } \right)} \right|^{2} }}{{\left[ {std\left( {t_{k}^{a} } \right)} \right]^{2} + \left[ {std\left( {t_{k}^{b} } \right)} \right]^{2} }}$$
(2)

where \(F_{k}^{a,b}\) represents the Fisher's separation distance between the two categories a and b for the kth feature (a and b may each be the turned, ground or shaped surface). Mean () and std () represent the mean and standard deviation of the GLCM feature vector. Equation 2 is applicable to a two-class problem. For the three-class problem (turned, ground and shaped), Fisher's discriminant power (FDP) for the kth feature is expressed as Eq. 3.

$$FDP_{k} = F_{k}^{t,g} + F_{k}^{g,s} + F_{k}^{t,s}$$
(3)

where \(F_{k}^{t,g}\), \(F_{k}^{g,s}\) and \(F_{k}^{t,s}\) are the Fisher's separation distances between the turned-ground, ground-shaped and turned-shaped classes, respectively; t, g and s represent the turned, ground and shaped surfaces.
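A hedged sketch of Eqs. 2 and 3 is given below; the array names features (the 864 × 21 feature matrix) and labels (the class of each row) are illustrative and not taken from the authors' code.

```python
# Sketch of Fisher's separation distance (Eq. 2) and the three-class
# discriminant power FDP_k (Eq. 3) for every GLCM feature.
import numpy as np

def fisher_distance(xa, xb):
    """Eq. 2: squared mean difference over the sum of the class variances."""
    return (np.mean(xa) - np.mean(xb)) ** 2 / (np.std(xa) ** 2 + np.std(xb) ** 2)

def fisher_discriminant_power(features, labels):
    """Eq. 3: FDP_k = F_k^{t,g} + F_k^{g,s} + F_k^{t,s} for each feature k."""
    fdp = np.zeros(features.shape[1])
    for k in range(features.shape[1]):
        col = features[:, k]
        for a, b in (("t", "g"), ("g", "s"), ("t", "s")):
            fdp[k] += fisher_distance(col[labels == a], col[labels == b])
    return fdp
```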

4.2 Separation Index (SI)

The SI is the ratio of the difference between the means of two classes to the sum of their standard deviations [50]. It is a non-dimensional metric that quantifies the separation between two class groupings. It is mathematically expressed in Eq. 4.

$$S_{k}^{a,b} = \frac{{\left| {Mean\left( {t_{k}^{a} } \right) - Mean\left( {t_{k}^{b} } \right)} \right|}}{{std\left( {t_{k}^{a} } \right) + std\left( {t_{k}^{b} } \right)}}$$
(4)

where \(S_{k}^{a,b}\) gives the SI between two classes or groupings a and b for the kth characteristic (a and b may each be the turned, ground or shaped surface). All other terms remain the same as above. Kim et al. [50] highlighted that a better characteristic is one with a larger difference in mean values and smaller standard deviations between the classes of interest. The SI discriminant power (SIDP) for a specific feature k is given by Eq. 5.

$$SIDP_{k} = S_{k}^{t,g} + S_{k}^{g,s} + S_{k}^{t,s}$$
(5)

where \(SIDP_{k}\) is discriminant power for a specific feature k; \(S_{k}^{t,g}\), \(S_{k}^{g,s}\) and \(S_{k}^{t,s}\) are the separation distances between turned-ground, ground-shaped and turned-shaped classes, respectively.
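The corresponding sketch for Eqs. 4 and 5 differs from the Fisher case only in the distance definition (absolute mean difference over the sum of the standard deviations); array names remain illustrative.

```python
# Sketch of the separation index (Eq. 4) and SIDP_k (Eq. 5).
import numpy as np

def separation_index(xa, xb):
    """Eq. 4: |mean difference| divided by the sum of standard deviations."""
    return abs(np.mean(xa) - np.mean(xb)) / (np.std(xa) + np.std(xb))

def si_discriminant_power(features, labels):
    """Eq. 5: SIDP_k summed over the three class pairs, per feature."""
    sidp = np.zeros(features.shape[1])
    for k in range(features.shape[1]):
        col = features[:, k]
        for a, b in (("t", "g"), ("g", "s"), ("t", "s")):
            sidp[k] += separation_index(col[labels == a], col[labels == b])
    return sidp
```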

4.3 Bhattacharyya Distance (BD)

It is a measure of similarity between two continuous or discrete probability distributions. The BD between two classes for the kth feature can be determined from Eq. 6 [16].

$$b_{k}^{a,b} = \frac{1}{4}\ln \left( {\frac{1}{4}\left( {\left( {\frac{{\sigma_{k}^{a} }}{{\sigma_{k}^{b} }}} \right)^{2} + \left( {\frac{{\sigma_{k}^{b} }}{{\sigma_{k}^{a} }}} \right)^{2} + 2} \right)} \right) + \frac{1}{4}\left( {\frac{{\left( {\mu_{k}^{a} - \mu_{k}^{b} } \right)^{2} }}{{\left( {\sigma_{k}^{a} } \right)^{2} + \left( {\sigma_{k}^{b} } \right)^{2} }}} \right)$$
(6)

where \(\mu_{k}^{a}\), \(\mu_{k}^{b}\) and \(\sigma_{k}^{a}\), \(\sigma_{k}^{b}\) are the means and standard deviations of the two classes a and b, respectively. The term \(b_{k}^{a,b}\) represents the BD for a specific feature k between the two machined surfaces a and b. For the three-class problem, the Bhattacharyya distance discriminant power (BDDP) for the kth feature is given by Eq. 7.

$$BDDP_{k} = b_{k}^{t,g} + b_{k}^{g,s} + b_{k}^{t,s}$$
(7)

where \(b_{k}^{t,g}\), \(b_{k}^{g,s}\) and \(b_{k}^{t,s}\) are the BDs between the turned-ground, ground-shaped and turned-shaped classes, respectively, for the kth feature. The features with the highest FDPs, SIDPs and BDDPs form the sensitive features for the classifiers.
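A similar sketch for Eqs. 6 and 7, again with illustrative array names, is:

```python
# Sketch of the Bhattacharyya distance between two univariate class
# distributions (Eq. 6) and its three-class discriminant power BDDP_k (Eq. 7).
import numpy as np

def bhattacharyya_distance(xa, xb):
    """Eq. 6, written in terms of the class means and standard deviations."""
    mu_a, mu_b = np.mean(xa), np.mean(xb)
    sd_a, sd_b = np.std(xa), np.std(xb)
    term1 = 0.25 * np.log(0.25 * ((sd_a / sd_b) ** 2 + (sd_b / sd_a) ** 2 + 2))
    term2 = 0.25 * (mu_a - mu_b) ** 2 / (sd_a ** 2 + sd_b ** 2)
    return term1 + term2

def bd_discriminant_power(features, labels):
    """Eq. 7: BDDP_k summed over turned-ground, ground-shaped, turned-shaped."""
    bddp = np.zeros(features.shape[1])
    for k in range(features.shape[1]):
        col = features[:, k]
        for a, b in (("t", "g"), ("g", "s"), ("t", "s")):
            bddp[k] += bhattacharyya_distance(col[labels == a], col[labels == b])
    return bddp
```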

4.4 Threshold Criterion Function (TCF)

The TCF was proposed in [51] and is used for selecting the most sensitive features identified by the different methods (FC, SI and BD); it is expressed analytically in Eq. 8.

$$\theta = \frac{{\mathop \sum \nolimits_{k = 1}^{s} G_{k}^{*} }}{{\mathop \sum \nolimits_{k = 1}^{f} G_{k} }}$$
(8)

where Gk is the discriminant power of the kth feature (k = 1 to f), G* is the vector in which the elements of G are sorted in descending order, \(\theta\) is the threshold, s is the number of sensitive features and f is the total number of features (s < f). The selected value of \(\theta\) can vary in the range from 0.75 to 0.9. If \(\theta\) = 0.85, the cumulative sum of the discriminant powers of the s selected features is approximately equal to 85% of the cumulative sum of the discriminant powers of all f features. As a result, instead of all f features, only the selected s sensitive features are given as input to the classifier, viz., a dimensionality reduction from f to s is achieved [51]. The methodology for selecting the sensitive features using the TCF is given in Algorithm 1.

figure a
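The following Python sketch is one plausible reading of Algorithm 1 and Eq. 8. Because the paper reports cumulative ratios slightly below \(\theta\) in some cases (e.g., 79.8% for \(\theta\) = 0.8 in Sect. 7.1), the exact stopping rule used by the authors may differ from the simple one assumed here.

```python
# Hedged sketch of the TCF-based selection: rank features by discriminant
# power and keep the smallest set whose cumulative power reaches theta.
import numpy as np

def select_sensitive_features(G, theta=0.85):
    """Return indices of the s sensitive features given discriminant powers G."""
    order = np.argsort(G)[::-1]                      # indices by descending power
    cumulative = np.cumsum(G[order]) / np.sum(G)     # cumulative ratio of Eq. 8
    s = int(np.searchsorted(cumulative, theta) + 1)  # smallest s with ratio >= theta
    return order[:s]

# Usage (illustrative): with the Fisher discriminant powers fdp from the
# earlier sketch, select_sensitive_features(fdp, theta=0.8) would return the
# indices of the sensitive features discussed in Sect. 7.1.
```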

5 Machine Learning (ML) Models

The features selected using the three distance-based feature selection methods are fed to four ML classifiers, whose details are given below:

5.1 Decision Tree (DCT)

A DCT classifies an object by examining the values of a few of its characteristics (variables). It mainly consists of a root node, decision nodes and leaf nodes. The DCT begins at the root node, which represents the complete dataset; the dataset is then split into two or more homogeneous sets. The decision (intermediate) nodes are test nodes for the objects' characteristics, and the leaf (final) nodes are the decision or classification nodes of the DCT. The development of a DCT is a recursive process. The frequency with which an attribute appears in a tree indicates its importance. Attribute selection at the nodes depends on the information gain of each attribute; thus, the attribute with the highest gain (lowest entropy) is selected as the tree's root node. Further segmentation is performed hierarchically, usually in a binary manner, until a final node is reached [52].

5.2 K-Nearest Neighbor (KNN)

It is one of the most basic supervised ML algorithms. The KNN algorithm identifies similarities between new and existing data and assigns the new case to the group most similar to the available groupings. In other words, when new data arrive, they can easily be classified into the best-suited group by applying KNN. Although KNN can be applied to both regression and classification problems, it is most often used for classification. No assumptions are made about the underlying data because of the nonparametric nature of KNN. KNN does not build a model when the data set becomes available; instead, it stores the data and acts on them only during the classification process [53, 54].

5.3 Random Forest (RAF)

RAF is a reliable tree-based classification method that acts as an ensemble tool, constructing each DCT from a subset of the observations and variables of a data set. Each DCT in the forest is used to judge unmarked samples, and the majority vote of all the DCTs is used to predict the unmarked sample categories. The vote cast by each tree carries the same weight. The more trees in the forest, the greater the accuracy and the lower the chance of overfitting [22, 36]. The classification can be further improved by including many DCTs in the RAF built from bagged datasets. The goal of bagging is to reduce variance in a dataset while avoiding over-fitting.

5.4 Support Vector Machine (SVM)

The SVM is a supervised learning method capable of solving linearly and non-linearly separable two-class (binary) categorization problems. In n-dimensional space, independent data xi (i = 1…k) belong to the first or second class, with data labels of -1 for the first class and +1 for the second class. The main purpose of the SVM is to segregate the class data through a maximal-margin hyperplane. In other words, the SVM seeks to maximize the distance between the separating hyperplane and the data closest to it. The separating hyperplane for the training data set can be expressed mathematically as given in Eq. 9, provided the data are linearly separable [55].

$$f\left( x \right) = w^{T} x + b$$
(9)

where w represents the n-dimensional weight vector, b represents a scalar bias and x is an input feature vector. This hyperplane determines the maximum margin that can be used to distinguish the positive class from the negative class. For linearly separable training data, the margin constraint satisfied by the classified data is given in Eq. 10.

$$y_{i} \left( {w^{T} x_{i} + b} \right) \ge 1,\quad i = 1 \ldots k$$
(10)
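For illustration only, the four classifiers of this section can be instantiated with scikit-learn as shown below. The hyperparameter values are placeholders; the values actually used in this work are tuned by grid search (Sect. 8, Table 8).

```python
# Illustrative sketch, not the authors' exact configuration.
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

classifiers = {
    "DCT": DecisionTreeClassifier(criterion="entropy"),  # information-gain splits
    "KNN": KNeighborsClassifier(n_neighbors=5),          # majority vote of neighbors
    "RAF": RandomForestClassifier(n_estimators=100),     # ensemble of bagged trees
    "SVM": SVC(kernel="rbf", C=1.0),                     # maximal-margin hyperplane
}
```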

6 Image Dataset and Pre-Processing

To assess the performance of the ML techniques in classifying machined surface images and to check the feasibility of the different distance-based DRTs, a dataset consisting of images of turned, ground and shaped surfaces was considered. Each machined surface set consists of eighteen images; hence, there are three classes of machined surfaces to be classified. The dataset was provided by the Department of Mechanical Engineering, NMAMIT, Nitte. More information about the image capturing system is given in [56]. The steps followed in image pre-processing are depicted in Fig. 4. Initially, the eighteen images of each of the three classes were transformed from RGB to grayscale to simplify the classification process. The resulting grayscale images were resized from a 400 × 500-pixel resolution to a 500 × 500-pixel resolution. Then, contrast enhancement and histogram equalization were applied so that the surface texture could be correctly identified and analyzed. Finally, to improve the generalization of the classification model, each histogram-equalized image was partitioned into sixteen non-overlapping regions, as suggested in [57, 58]. This produced 288 (18 × 16 = 288) images per class and 864 (288 × 3 = 864) images for the three classes.

Fig. 4
figure 4

Image pre-processing methodology (a) RGB image (b) grayscale image (c) resized image (d) contrast-enhanced image (e) histogram equalized image (f) partitioned image
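The authors implemented this pipeline in MATLAB; the Python/OpenCV sketch below mirrors the same steps purely for illustration. Since the exact contrast-enhancement routine is not specified, min-max stretching is assumed here, and the function name is hypothetical.

```python
# Hedged sketch of the pre-processing pipeline of Fig. 4.
import cv2

def preprocess_and_partition(path, size=500, grid=4):
    img = cv2.imread(path)                                   # color image (BGR in OpenCV)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)             # grayscale conversion
    gray = cv2.resize(gray, (size, size))                    # 400 x 500 -> 500 x 500
    stretched = cv2.normalize(gray, None, 0, 255, cv2.NORM_MINMAX)  # assumed contrast enhancement
    equalized = cv2.equalizeHist(stretched)                  # histogram equalization
    step = size // grid
    # sixteen non-overlapping 125 x 125 sub-images
    return [equalized[r:r + step, c:c + step]
            for r in range(0, size, step) for c in range(0, size, step)]

# 18 images/class x 16 partitions = 288 sub-images per class (864 in total)
```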

7 Selection of Sensitive Features

From each of the partitioned images, the 21 GLCM features (depicted in Table 3) are extracted. The feature vectors are stored in a matrix of size (864 × 21). The DRTs viz., FC, SI and BD are used to select the sensitive features.

7.1 Selection of Sensitive Features using Fisher's Criterion

The Fisher's separation distance between turned-ground (\(F_{k}^{t,g}\)), ground-shaped (\(F_{k}^{g,s}\)) and turned-shaped (\(F_{k}^{t,s}\)) classes is computed using Eq. 2. The FDPs for 21 features are computed using Eq. 3. Figure 5 shows the FDPs for 21 GLCM features. It can be observed from Fig. 5 that FDPs of the features with label numbers 21, 1, 15, 6, 16, 20, 9, 19, 17 and 8 are relatively higher when compared to other features (features are numbered as per Table 3). These features are the ones with better classification separability and are suitable input candidates for the ML classifiers (as discussed in Sect. 4). Table 4 shows the use of three different threshold values, viz., \(\theta =\) 0.8, 0.85 and 0.9, in selecting the sensitive features under FC. The sensitive features are selected using TCF as per algorithm 1.

Fig. 5
figure 5

FDP of 21 GLCM features

Table 4 Features selection for different threshold values based on FC

With \(\theta =\) 0.8, the cumulative FDPs of the eight (s = 8) features 21, 1, 15, 6, 16, 20, 9 and 19 amount to 117.71, which is approximately 80% ((117.71 ÷ 147.48) × 100 = 79.8%) of the sum of the FDPs of all the 21 features. Thus, the dimensionality is reduced from 21 to 8, with a dimensionality reduction of 62% (((21–8) ÷ 21) × 100 = 61.9%). Thus, for \(\theta =\) 0.8, the DRT FC selects 8 sensitive features.

With \(\theta =\) 0.85, the cumulative FDPs of nine (s = 9) features 21, 1, 15, 6, 16, 20, 9, 19 and 17 amount to 126.81, which is approximately 85% ((126.81 ÷ 147.48) × 100 = 85.9%) of the sum of the FDPs of all the 21 features. Thus, the dimensionality is reduced from 21 to 9, with a dimensionality reduction of 57% (((21–9) ÷ 21) × 100 = 57.1%). Thus, for \(\theta =\) 0.85, the DRT FC selects 9 sensitive features.

Similarly with \(\theta =\) 0.9, the cumulative FDPs of the ten (s = 10) features 21, 1, 15, 6, 16, 20, 9, 19, 17 and 8 amount to 132.53, which is approximately 90% ((132.53 ÷ 147.48) × 100 = 89.6%) of the sum of the FDPs of all the 21 features. Thus, the dimensionality is reduced from 21 to 10, with a dimensionality reduction of 52% (((21–10) ÷ 21) × 100 = 52.4%). Thus, for \(\theta =\) 0.9, the DRT FC selects 10 sensitive features.

7.2 Selection of Sensitive Features using Separation Index

Using the similar procedure explained in Sect. 7.1, the SI distances between the three classes of the machined surfaces are computed using Eq. 4 and the SIDP is computed using Eq. 5.

Figure 6 shows the SIDP for the 21 features, and Table 5 shows the details of selecting the sensitive features for different threshold values, viz., \(\theta =\) 0.8, 0.85 and 0.9. With θ = 0.8, ten sensitive features (s = 10) out of 21 (21, 1, 15, 16, 6, 20, 9, 19, 17 and 8) are selected resulting in a dimensionality reduction of 52.4%. With θ = 0.85, twelve sensitive features (s = 12) out of 21 (21, 1, 15, 16, 6, 20, 9, 19, 17, 8, 3 and 18) are selected resulting in a dimensionality reduction of 42.85%. With θ = 0.9, thirteen sensitive features (s = 13) out of 21 (21, 1, 15, 16, 6, 20, 9, 19, 17, 8, 3, 18 and 14) are selected resulting in a dimensionality reduction of 38.1%.

Fig. 6
figure 6

SIDP of 21 GLCM features

Table 5 Features selection for different threshold values based on SI

7.3 Selection of Sensitive Features using Bhattacharyya Distance

Using the similar procedure explained in Sect. 7.1, the BDs between the three classes of the machined surfaces are computed using Eq. 6 and the BDDP is computed using Eq. 7.

Figure 7 shows the BDDP for the 21 features, and Table 6 shows the details of selecting the sensitive features for different threshold values, viz., \(\theta =\) 0.8, 0.85 and 0.9. With θ = 0.8, ten sensitive features (s = 10) out of 21 (21, 1, 15, 16, 6, 20, 9, 19, 17 and 8) are selected resulting in a dimensionality reduction of 52.4%. With θ = 0.85, eleven sensitive features (s = 11) out of 21 (21, 1, 15, 16, 6, 20, 9, 19, 17, 8 and 7) are selected resulting in a dimensionality reduction of 47.6%. With θ = 0.9, thirteen sensitive features (s = 13) out of 21 (21, 1, 15, 16, 6, 20, 9, 19, 17, 8, 7, 14 and 3) are selected resulting in a dimensionality reduction of 38.1%.

Fig. 7
figure 7

BDDP of 21 GLCM features

Table 6 Features selection for different threshold values based on BD

7.4 Selection of Sensitive Features Common to FC, SI and BD

By comparing the sensitive features selected using the three DRTs, the common feature set corresponding to each threshold is tabulated in Table 7. The percentage reduction in the dimensionality is also depicted in the table.

Table 7 The common features selected for three threshold values for all three methods

8 ML-Based Classification

The sensitive common features selected using the three different DRT (Table 7) are given as inputs to the ML classifiers. The data available is divided into two parts: 80% of the data is used for training, and the remaining 20% of data is used for testing. In this work, the classifiers are implemented using Python with the Jupyter Notebook editor on an ASUS laptop (AMD Ryzen 7 with 2.30 GHz processor, 8 GB RAM). The performance of the ML models depends on the tuning of the hyperparameters. Before the learning process can begin, several hyperparameters must be pre-configured. The Grid Search technique is used to tune the hyperparameters of the ML classifiers. Table 8 shows the hyperparameters selected for different ML classifiers and TCFs. The K-fold cross-validation technique is used to determine the performance of the ML classifier. After several trials, the number of folds is fixed as 10. In order to compare the effectiveness of the DRTs, the performance of the ML classifiers with all 21 features is compared against those with the selected sensitive feature set.

Table 8 The hyperparameters selected for different ML classifiers and TCFs
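A hedged sketch of this evaluation protocol (80/20 split, grid search, tenfold cross-validation and confusion matrix) is given below. The random data and parameter grid are placeholders only; the actual inputs are the common sensitive features of Table 7 and the actual grids are those in Table 8.

```python
# Illustrative sketch of the Sect. 8 protocol with scikit-learn.
import numpy as np
from sklearn.model_selection import train_test_split, GridSearchCV, cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, accuracy_score

rng = np.random.default_rng(0)
X = rng.random((864, 8))     # placeholder for the 864 x s sensitive-feature matrix
y = rng.integers(0, 3, 864)  # placeholder labels: 0 turned, 1 ground, 2 shaped

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

param_grid = {"n_estimators": [50, 100, 200], "max_depth": [None, 10, 20]}  # illustrative grid
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=10)
search.fit(X_train, y_train)

cv_scores = cross_val_score(search.best_estimator_, X_train, y_train, cv=10)  # tenfold CV
y_pred = search.best_estimator_.predict(X_test)
print(search.best_params_, cv_scores.mean())
print(confusion_matrix(y_test, y_pred), accuracy_score(y_test, y_pred))
```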

The details of the confusion matrices obtained for the different scenarios are provided in Appendix A. The diagonal elements of a confusion matrix indicate the number of correct classifications, whereas the off-diagonal elements give the number of misclassifications. The numbers 0, 1 and 2 along the axes of the confusion matrices denote the turned, ground and shaped surfaces, respectively. With all 21 features as input to the ML classifiers, the confusion matrices obtained by DCT, KNN, RAF and SVM are provided in Fig. 9 (a), (b), (c) and (d), respectively. With the common features corresponding to θ = 0.8 as input, the corresponding confusion matrices are provided in Fig. 10 (a), (b), (c) and (d); for θ = 0.85, in Fig. 11 (a), (b), (c) and (d); and for θ = 0.9, in Fig. 12 (a), (b), (c) and (d).

Table 9 shows the classification accuracies of the training and testing phases of the ML models using the common features and also using all 21 features. Figure 8 presents the testing accuracies of the different classification models. From Fig. 8, it can be seen that the use of DRT is beneficial in enhancing the test accuracy for all classifiers except the SVM. During training, the DCT, KNN and RAF models identify and classify the surfaces with 100% accuracy; during testing, the classification accuracy decreases slightly. The DCT model shows a decrease in classification accuracy as the feature set grows; its highest classification accuracy of 93% is noted for θ = 0.8 when compared to the other threshold values. The KNN model also shows a classification accuracy of 93% for θ = 0.8 and 0.9. Compared to all the classifiers, the RAF model performs best, with the highest testing accuracy of 95.3% at a threshold of 0.9, whereas the lowest testing accuracy of 90.75% is obtained for the KNN classifier with all the features. The SVM classifier shows a classification accuracy of 93.34% with all the features as input and for θ = 0.8; at θ = 0.85 and 0.9, it reduces to 93%. Overall, the DCT classifier performs the worst and the RAF model the best. For the same image set, wavelet-based machined surface recognition yielded a classification accuracy of 88.89% [56]; thus, a 7.2% improvement is observed over the previous study. Therefore, the proposed methodology of DRT with the TCF and the RAF classifier proves better for the classification of machined surfaces.

Table 9 Classification accuracy of ML models using the common features
Fig. 8
figure 8

Classification accuracy of ML model

From Table 7, it is clear that the dimensionality reductions of 62%, 57% and 52% for threshold values of \(\theta =\) 0.8, 0.85 and 0.9, respectively, significantly lower the computational burden. The time required for training the ML classifiers using the reduced feature sets and using all 21 features is given in Table 10. The training time clearly decreases as the number of input features decreases. Thus, the proposed dimensionality reduction methodology effectively selects a reduced set of the most sensitive features to be given as input to the ML classifiers, reducing the computational burden and time.

Table 10 Time for ML model development

Thus, the effective utilization of GLCM-based image features in the texture analysis of machined surfaces using machine vision technology is evident. Additionally, sensitive feature selection, i.e., dimensionality reduction of the feature vector, helps in improving the accuracy of the ML models by removing non-sensitive or repeated features. Verifying the proposed methodology with DRTs other than those used in the present study, and with other feature extraction techniques, is left as future work. Deep learning networks are data-hungry systems; because of the limited data set, deep learning models were not tried. With a larger set of machined surface images available, deep learning networks could be used as classifiers. The feature set may also be enhanced by considering newer image feature extraction techniques, and the classifiers' accuracy can be further improved by training with more images. Additionally, the proposed methodology can be extended in robotics to identify materials by analyzing surface textures, which will help manufacturing industries move toward the concept of Industry 4.0.

9 Conclusions

This study proposes a methodology to classify machined surface images using GLCM-based image features, accomplished with image processing and ML techniques. Images of turned, ground and shaped machined surfaces were subjected to gray-level conversion, contrast enhancement and histogram equalization. To improve the generalization of the inputs to the ML classifiers, the processed images were divided into sixteen equal non-overlapping regions. Further, distance-based DRTs, viz., FC, SI and BD, along with the TCF, were adopted to identify and select the most sensitive GLCM features. The features common to FC, SI and BD for the three threshold values of 0.8, 0.85 and 0.9 were used as inputs to the ML classifiers (DCT, KNN, RAF, SVM). Among all the classifiers, the RAF model classified with an accuracy of 95.3%, validated using the tenfold cross-validation technique. The effectiveness of the ML models in classifying the machined surfaces was found to be in the order RAF, SVM, KNN and DCT. Compared to the methodology described in a prior study, the proposed methodology showed a 7.2% improvement in classification accuracy. The overall findings show that, even with a significant reduction in the number of features, the training and testing classification accuracies of the ML classifiers with the reduced feature sets are superior to those obtained with all extracted features. Thus, it can be concluded that GLCM-based image features, along with the proposed dimensionality reduction method, can be effectively used to identify and classify machined surface images.