1 Introduction

Manufacturing, an age-old technology, evolved into ‘Intelligent Manufacturing’ (IM) with the advent of CAD, CAM, CAE and robotics. In the early twenty-first century, with the invention of smart sensors and the internet, IM systems evolved into ‘Smart Manufacturing’, providing better quality and improved productivity with reduced lead time [1]. This has been described as the fourth industrial revolution, ‘Industry 4.0’. Smart design, smart machines, smart monitoring, smart control and, finally, smart scheduling can be achieved through Industry 4.0, taking automation to the highest level [2]. To reach this level of automation in the manufacturing industry, workpiece handling time and machine downtime should also be reduced by fully automating the hardware concerned. A tool monitoring system is one such solution, reducing the downtime needed to check the tool status. At present, indirect methods of monitoring tool wear are the best alternatives to direct methods: they save a considerable amount of time, since the tool need not be removed to check its wear, and are therefore economical for industry. The cost of a tool monitoring system can be justified against the amount spent in machine downtime for checking the tool status. Artificial Intelligence (AI) plays an important role in Industry 4.0 by providing self-learning networks that use component data to predict its future state [3]. Combining indirect tool condition monitoring (TCM) methods with recent AI techniques such as the Convolutional Neural Network (CNN) can greatly improve the performance of a manufacturing system. The use of IoT and cloud systems in smart manufacturing units can thus fulfill the objectives defined by Industry 4.0.

As a cutting tool machines the workpiece surface, it leaves information on the workpiece in the form of feed marks; these feed marks form the waviness of the workpiece surface, termed surface roughness. In machine vision terms, this variation in surface roughness is known as texture, and the surface texture varies as tool wear progresses [4]. The limitations of the stylus method [5] for measuring surface roughness have been largely overcome by image-processing-based methods. Gadelmawla et al. [6] used a novel technique to extract the height of surface roughness from captured images and presented it in gray levels; a 3D view of these gray levels displayed the scratches on the workpiece, from which roughness properties were determined. The wavelet transform was used by Chen et al. [7] to evaluate the surface roughness of machined components; the authors developed a mathematical model to abstract surface roughness information and opened a new area of wavelet-based roughness measurement. A frequency-normalized wavelet transform was applied to machined surface images by Josso et al. [8] to analyze surface roughness characterization. A novel online method for estimating cutting tool flank wear using fractal properties of the machined surface was presented by Bukkapatnal et al. [9]; the authors used a set of sensors to collect cutting force, vibration and frequency information, from which fractal dimensions and, ultimately, flank wear were extracted using a recurrent neural network. The Hough transform was applied to the texture of machined surface images to detect line segments formed during machining [10, 11]; the results showed a strong correlation between the features extracted by the Hough transform and tool wear using a multilayer perceptron neural network, opening a new area of transform-based tool wear monitoring.

Datta et al. [12] analyzed the surface texture of turned surface images using Voronoi tessellation (VT), a concept from computational geometry, by constructing Voronoi diagrams. The authors extracted two texture features, namely the total void area of the Voronoi diagram and the number of polygons with zero cross moment; the correlation between the measured flank wear and the extracted features showed that the method can be applied to monitor flank wear in cutting tools. In another work, Dutta et al. [13] applied a combination of three feature extraction methods, viz. the Gray Level Co-occurrence Matrix (GLCM), VT and the discrete wavelet transform, to turned surface images and extracted textural features. These features were then fed to a support vector machine regression model to predict flank wear, which showed a good correlation between the features and the flank wear. In other novel work, a Gabor filter was implemented for tool wear analysis [14]; various statistical parameters were determined and their correlation with the measured surface roughness values of the machined surface was presented.

A few other techniques can work well on edges in machined surface images by extracting important features to determine the remaining useful life of a cutting tool. Techniques such as the contourlet transform [15, 16], used to detect lines or contours in an image, and the dual-tree complex wavelet transform [17, 18], which extracts rotation-invariant features, are still unexplored for tool wear monitoring. CNN is gaining popularity as a machine learning tool that combines feature extraction and classification. A fault in a gearbox generates vibration signals that can be used to diagnose various fault patterns: Chen et al. [19] carried out fault diagnosis experiments on a gearbox using SVM and CNN classifiers and found that CNN has better fault identification and classification rates. To identify faults in a bearing, a fault detection system based on CNN was developed by Janssens et al. [20]; the raw vibration signals were subjected to feature extraction and then classified by both a random forest classifier and a CNN, and the high accuracy established a new machine learning tool in manufacturing. Aghazadeh et al. [21] employed force, vibration and spindle current as inputs to different regression algorithms in a first work of its kind, found that CNN outperformed the other classifiers across all three signals, and concluded that CNN can be suitably applied to TCM. In similar work, Terrazes et al. [22] supplied real-time raw cutting force data as input to a CNN, which achieved an accuracy of 78% in predicting tool wear. Experiments were performed by Gouarir et al. [23] to measure cutting force and predict the flank wear of a milling cutter using CNN; a confusion matrix showed just 10% deviation between predicted and actual tool wear. In a recent paper, Martinez et al. [24] developed a tool monitoring system to classify tools into four predefined classes; the data was fetched from force and vibration sensors and provided to a CNN classifier, which achieved an accuracy above 90%.

The literature reveals that different researchers have used different methods to extract textural features and monitor the status of a cutting tool. A few studies combined two or three techniques to improve the accuracy of the monitoring system. Most of the research focuses on sensor data, which carries ambiguity in the form of noise, clarity, reliability, complexity, etc.; these are inherent features when the data is numeric. Alternatively, if the data is in the form of images, there is little chance of such ambiguity, provided proper data acquisition is implemented. Even though some work has been done on images, a few techniques have still not received researchers' attention. Although CNN has been applied to sensor data such as force, vibration and current, the application of CNN to images has not yet been studied for TCM. The CNN technique is capable of both feature extraction and classification when images are used as input, whereas other image processing approaches need at least two methods: one for feature extraction and another for feature classification. Using CNN removes the need for two separate techniques, including the need to probe for the best technique in each case to improve the accuracy of the algorithm. Therefore, in this paper, a tool monitoring system based on CNN is presented to predict the life of a cutting tool from images of machined components. This approach demonstrates that the CNN method of AI for classifying tool degradation responds sufficiently well when used as an indicator for monitoring tool wear. This machine-vision-based indirect TCM technique generates a large amount of data in the form of images; such large datasets are difficult to process on regular CPUs, and the processing time may run to hours. This issue can be treated efficiently by using cloud-hosted GPUs, which reduce the processing time considerably [25]. The techniques used for feature extraction and classification are detailed in Sect. 2. The experimentation is described in Sect. 3, whereas results and discussion are presented in Sect. 4. Finally, Sect. 5 presents conclusions and future work.

2 Feature extraction and classification

After the data is acquired, by means of sensors or in the form of images, the most important task is to analyse it. The data must be clear, unambiguous and mutually exclusive across classes. This requires a detailed study of the kind of data available, as the correct pre-processing method has to be applied before any feature extraction technique is used.

2.1 Feature extraction

Quite often the collected data contains noise or repeated information, or is unlabeled or high-dimensional. Noise in images can be due to improper illumination, inefficiency of the technologies used to collect the data, or improper capture. A huge amount of data means higher processing time and larger memory requirements, as well as a threat of overfitting. Such raw data has to be processed before any feature extraction is applied to it. Feature extraction is the process of reducing the dimension of the raw data so that it becomes easier to process further: it combines variables into important features, removing redundant data while still describing the original data set. Feature extraction techniques reported in the literature include the wavelet transform, the Gray Level Co-occurrence Matrix, fractals, Principal Component Analysis and CNN.
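As an illustration, extracting a small set of GLCM texture features takes only a few lines in MATLAB. This is a minimal sketch, assuming the Image Processing Toolbox; ‘surface.png’ is a hypothetical machined-surface image, not one of the files used in this work.

    % Minimal GLCM texture-feature sketch (Image Processing Toolbox assumed).
    % 'surface.png' is a hypothetical machined-surface image.
    img = imread('surface.png');
    if size(img, 3) == 3
        img = rgb2gray(img);                      % work on intensity values
    end
    glcm  = graycomatrix(img, 'Offset', [0 1]);   % co-occurrence at a 1-pixel horizontal offset
    stats = graycoprops(glcm, {'Contrast', 'Correlation', 'Energy', 'Homogeneity'});
    features = [stats.Contrast, stats.Correlation, stats.Energy, stats.Homogeneity];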

2.2 Features classification

The extracted features have to be classified accurately into particular classes. Feature classification is a technique used to categorize large volumes of data into different classes. Popular techniques for feature classification are the Support Vector Machine, random forest, decision tree, K-means and CNN.
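For instance, a multiclass SVM over such feature vectors can be trained with the Statistics and Machine Learning Toolbox; in this sketch, featureMatrix (one feature row per image) and labels (the wear class of each image) are assumed variables:

    % Hypothetical inputs: featureMatrix is N-by-4 (one GLCM feature row per
    % image) and labels is an N-by-1 categorical of wear classes ('A','B','C').
    svmModel  = fitcecoc(featureMatrix, labels);        % multiclass SVM (one-vs-one by default)
    predicted = predict(svmModel, featureMatrix(1, :)); % classify one feature vector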

2.3 Traditional and Convolutional Neural Network

Neural networks are data-driven AI techniques used to find the relationship between input data and output classes. In traditional neural network models, there are separate tools for feature extraction and feature classification. Deep learning integrates feature extraction, feature selection and feature classification in one model. The filters (kernels) are selected automatically by the model depending on the feature: for example, if a feature is a plus (+) sign, the filter corresponding to ‘+’ is selected; if a feature is a backward slash (\), the filter corresponding to ‘\’ is selected. The filters learn from the image, tune their parameters and reconstruct it until the result is optimized. The traditional technique instead extracts features by transforming the data into the spatial or frequency domain; feature selection is then performed to remove redundant features before feeding them to a classifier, which classifies the data into the given classes.
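The filter-matching idea can be seen with plain 2D convolution. In the illustrative sketch below (not taken from this work), a ‘+’-shaped kernel produces its strongest response where the image contains a matching ‘+’ pattern:

    % Illustrative only: a '+'-shaped kernel responds most strongly where the
    % image contains a '+' pattern.
    img = zeros(7);  img(4, 2:6) = 1;  img(2:6, 4) = 1;  % a '+' drawn in a 7x7 image
    k   = [0 1 0; 1 1 1; 0 1 0];                         % '+'-shaped 3x3 kernel
    response = conv2(img, rot90(k, 2), 'same');          % flipped kernel -> correlation
    [~, idx] = max(response(:));                         % peak lies at the centre of the '+'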

2.4 Architecture of CNN

In a basic NN model, each neuron in the (n − 1)th layer is connected to all the neurons in the nth layer, forming the dense structure shown in Fig. 1. CNN, a deep learning tool, is a feature-based approach that uses multiple layers to process data, similar to neural networks. A typical CNN consists of a convolution layer, a pooling layer and a fully connected layer between the input layer and the output layer, as shown in Fig. 2.

Fig. 1

A basic neural network

Fig. 2

Architecture of CNN [26]

The CNN approach to deep learning extracts textural features automatically: it is designed so that the kernels adjust themselves during learning. The CNN approach has been successfully applied to raw data with high classification accuracy.

The overall training process of the convolutional network may be summarized as follows (a numerical sketch of the error computation in steps 4 and 5 is given after the list):

Step 1 All the chosen filters and weights are initialized with random values.

Step 2 The input image is then taken by the network. Convolution, ReLU and pooling operations are performed on this image and a feature map is generated.

Step 3 The obtained feature map is then fed to another layer of the network, where convolution, ReLU and pooling are performed again and another feature map is generated. The number of feature maps depends on the number of layers; each feature map is stacked over the previous one.

Step 4 The set of feature maps is fed to fully connected layers that find the probability for each class, say [0.3, 0.2, 0.2, 0.3]. Since the weights are randomly assigned for the first training example, the output probabilities are also random. For example, if images of a cat, dog, tiger and leopard form four different classes and the input image is of a cat belonging to Class-A, then the target output probabilities should be [1, 0, 0, 0].

Step 5 The network then calculates the error between the target and the actual output. This error (if any) is back-propagated and all the filter weights are updated. Steps 2 to 4 are then performed again, and the output probabilities might now be [0.8, 0.1, 0.0, 0.1]. Thus the network learns to adjust its weights and filters and classify the image into the correct class.

Step 6 The above steps are then repeated with all the images in the training data set.
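As a numerical illustration of steps 4 and 5, the error for the example probabilities above can be computed as a cross-entropy loss, a common choice for classification networks (the exact loss used by a given implementation may differ):

    % Cross-entropy loss for the example probabilities of steps 4 and 5.
    target = [1 0 0 0];                    % the image truly belongs to Class-A
    p1 = [0.3 0.2 0.2 0.3];                % probabilities before weight updates
    p2 = [0.8 0.1 0.0 0.1];                % probabilities after some updates
    loss1 = -sum(target .* log(p1 + eps)); % ~1.20
    loss2 = -sum(target .* log(p2 + eps)); % ~0.22, so the network has improved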

An image given as input to the CNN is processed in turn by the convolution layer, the ReLU activation function and the pooling operation, which are described below.

Convolution layer The primary purpose of convolution in a ConvNet is to extract features from the input image. Convolution preserves the spatial relationship between pixels by learning image features over small squares of input data. The input to the convolution layer, which performs a linear operation, is the raw image with a known class, and the output is a feature map.

ReLU function The feature map so obtained is passed through the nonlinear ReLU (Rectified Linear Unit) operation. ReLU is applied to each pixel and replaces all negative pixel values in the feature map with zero, introducing the nonlinearity that real-world data exhibits. The output of ReLU is a rectified feature map.

Pooling This operation reduces the dimensionality of each feature map while retaining the important information in the input image; the step is also called subsampling or downsampling. Pooling applies a filter to the rectified feature map and reduces the size of the image by selecting the pixels that best describe it. Pooling controls overfitting and yields an almost scale-invariant representation.

Fully connected layer This is the classification layer, where every neuron in the previous layer is connected to every neuron in the next layer. The output from the pooling layers is used to classify the input image into the various classes defined by the training dataset.
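Put together, a minimal CNN of this shape can be declared in a few lines of MATLAB (Deep Learning Toolbox assumed). The layer sizes below are illustrative only; the experiments in this paper use the much deeper ResNet-50 instead:

    % Minimal conv -> ReLU -> pool -> fully connected stack for three classes.
    layers = [
        imageInputLayer([224 224 3])       % input image
        convolution2dLayer(3, 16)          % 16 filters of size 3x3
        reluLayer                          % negative activations set to zero
        maxPooling2dLayer(2, 'Stride', 2)  % downsample the feature map by 2
        fullyConnectedLayer(3)             % one output per wear class (A, B, C)
        softmaxLayer                       % class probabilities
        classificationLayer];              % cross-entropy classification output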

3 Experimentation

This section introduces the experimental setup and design for the TCM system. The experimentation encompasses the hardware and software systems and their setup, and shows the machining parameters and tool conditions considered in the experiments.

3.1 Machining setup and conditions

The experimental setup consists of an MTAB CNC machine, the component under study (diameter 32 mm, length 110 mm) mounted in the machine chuck, and a cutting tool that removes material from the component by a turning operation. Mild steel was used as the component material and the cutting tool was a right-hand coated carbide insert. Chemical analysis of the cutting tool was performed in the lab, giving the composition of the various elements shown in Table 1.

Table 1 Chemical composition of carbide insert

For machining the component, a CNC program was executed on a Fanuc controller under dry cutting conditions, without coolant. Dry machining was employed because no accelerant was available to wear out the tool quickly. The cutting parameters shown in Table 2 were kept fixed to obtain consistent results and verify the CNN classifier. Before the experiments began, a rough cut was taken on all the components to remove the surface layer; all rough cuts were taken with a separate tool that was not used during the experiments. During the experiments, the cutting speed was kept low due to the limitations of the machine. Each tool under study was used until its wear reached 300 µm and was then replaced with a fresh one.

Table 2 Experimental conditions

3.2 Tool wear measurement

Measurement of tool wear is carried out in the steps of the following algorithm (a code sketch of the classification thresholds in step 4 follows the list).

  • Step 1: Perform turning operation on CNC lathe machine as per the part program.

  • Step 2: Remove tool and component from machine.

  • Step 3: Capture component image on microscope.

  • Step 4: Capture tool image and find flank wear (FW).

    (a) If 0 ≤ FW ≤ 100 µm, save component image in class ‘A’.

    (b) If 100 < FW ≤ 200 µm, save component image in class ‘B’.

    (c) If 200 < FW ≤ 300 µm, save component image in class ‘C’.

    (d) If FW > 300 µm, the tool is worn and needs to be replaced.
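A minimal MATLAB sketch of the thresholding in step 4 is given below; the folder names, the flank wear variable fw (in µm) and the image file name are illustrative:

    % Illustrative sorting of a component image by measured flank wear fw (µm).
    if fw <= 100
        destination = 'dataset/classA';
    elseif fw <= 200
        destination = 'dataset/classB';
    elseif fw <= 300
        destination = 'dataset/classC';
    else
        destination = '';
        disp('Tool worn out: replace the insert.');
    end
    if ~isempty(destination)
        copyfile('component.png', destination);  % hypothetical image file
    end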

The tool wear measurement setup (Fig. 3) consists of a microscope for taking images of the tool and workpiece, a camera mounted on the microscope, a weighing machine to weigh the tool, and software and a computer to produce the wear images.

Fig. 3

Setup of tool wear measurement

Table 3 shows the specifications of the microscope used, as well as related information about the software.

Table 3 Microscope specifications and software information

The weight of the carbide insert was measured before machining on a weighing balance. After machining for around 5 min (depending on the part program), the cutting tool was removed from the tool holder. The insert was cleaned with acetone to remove any chips clogged in it, and its weight was then measured to determine the weight loss. The tool was then imaged under the microscope to check the amount of flank wear. The component was also removed from the chuck, cleaned with acetone, placed on the inverted microscope, and imaged at different positions; the images were stored on the computer. The tool wear was measured with the software's inbuilt function after every cut to record the amount of wear and tag it to the particular image. The wear measurement as displayed in the software is shown in Fig. 4.

Fig. 4

Direct measurement of tool wear by software

Around 1183 component images were taken using the microscope. All the images were categorised into ‘class A’, ‘class B’ or ‘class C’, the classification depending on the tool wear measured by the microscope each time a component image was taken. If the tool wear was less than 100 µm, the component image was assigned to class A; for tool wear from 100 to 200 µm, to class B; and for tool wear of 200–300 µm, to class C. Magnified images of the tool wear in the different wear zones are shown in Fig. 5.

Fig. 5

Actual tool wear by direct measurement

Variation in the images was seen as the tool wear advanced. Component images taken when the tool wear was low show the feed marks quite clearly. As the insert wear progresses, the images become more diffused: the feed marks are no longer clear and the region between two feed marks fills in, as visible in Fig. 6.

Fig. 6

Images of component belonging to different classes

4 Results and discussion

The implementation was done in MATLAB (R2019b), in which a CNN was applied to the 1183 images taken by the microscope. The ResNet-50 network (short for residual network), a 50-layer deep network, was used for training. The images were split between training and testing in a 70:30 ratio.

The training cycle had six epochs with 49 iterations per epoch, for a total of 294 iterations. The training accuracy at intermediate intervals is shown in Fig. 7a–d, where accuracy in percent is plotted on the vertical axis against the number of iterations on the horizontal axis. The training loss per iteration is shown in Fig. 8a–d, with loss on the vertical axis and the number of iterations on the horizontal axis.

Fig. 7

Accuracy in training iteration

Fig. 8

Loss in training iteration
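A minimal transfer-learning sketch consistent with this setup (pretrained ResNet-50 from the Deep Learning Toolbox, three class folders, a 70:30 split and six epochs) is given below. The folder name ‘dataset’ is an assumption, and the options shown are not the exact training options used:

    % Hedged sketch of the stated setup: ResNet-50 transfer learning on the
    % class-A/B/C image folders. 'dataset' is an assumed folder name.
    imds = imageDatastore('dataset', 'IncludeSubfolders', true, ...
                          'LabelSource', 'foldernames');
    [imdsTrain, imdsTest] = splitEachLabel(imds, 0.7, 'randomized');  % 70:30 split

    net    = resnet50;                                % pretrained 50-layer network
    lgraph = layerGraph(net);
    lgraph = replaceLayer(lgraph, 'fc1000', ...
                 fullyConnectedLayer(3, 'Name', 'fc3'));              % 3 wear classes
    lgraph = replaceLayer(lgraph, 'ClassificationLayer_fc1000', ...
                 classificationLayer('Name', 'output'));

    augTrain = augmentedImageDatastore([224 224], imdsTrain);  % resize to ResNet input
    augTest  = augmentedImageDatastore([224 224], imdsTest);

    options = trainingOptions('sgdm', 'MaxEpochs', 6, ...
                              'Plots', 'training-progress');
    trainedNet = trainNetwork(augTrain, lgraph, options);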

The metrics used to quantify the performance of the CNN are precision, recall, F-score and, finally, accuracy, determined using Eqs. (1) to (4). Here TP is the number of true images classified as true, TN the number of false images classified as false, FP the number of false images classified as true, and FN the number of true images classified as false. A good classifier aims to maximize both precision and recall, so that images are assigned only to their actual class. Precision and recall are combined directly in the F-score, whereas accuracy gives a direct interpretation of the classifier, being the ratio of the number of correctly classified images to the total number of images provided to the classifier.

$${\text{Precision}} = \frac{TP}{TP + FP}$$
(1)
$${\text{Recall}} = \frac{TP}{TP + FN}$$
(2)
$${\text{F}}\_{\text{score}} = \frac{2 \times {\text{Precision}} \times {\text{Recall}}}{{\text{Precision}} + {\text{Recall}}}$$
(3)
$${\text{Accuracy}} = \frac{TP + TN}{TP + TN + FP + FN}$$
(4)
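These per-class metrics can be computed directly from a confusion matrix. The following sketch continues from the training code above (trainedNet, augTest and imdsTest are the variables assumed there):

    % Evaluate the network on the held-out images and compute Eqs. (1)-(4).
    predicted = classify(trainedNet, augTest);
    C = confusionmat(imdsTest.Labels, predicted);  % rows: actual, columns: predicted

    tp = diag(C);                                  % true positives per class
    precision = tp ./ sum(C, 1)';                  % Eq. (1): TP / (TP + FP)
    recall    = tp ./ sum(C, 2);                   % Eq. (2): TP / (TP + FN)
    f_score   = 2 * (precision .* recall) ./ (precision + recall);  % Eq. (3)
    accuracy  = sum(tp) / sum(C(:));               % Eq. (4), multiclass analogue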

The results achieved after running all the iterations are summarized in Table 4.

Table 4 Performance result of CNN training

To check the accuracy of the developed model, it was tested on unused images belonging to the different classes. Six component images from each class, identified by direct measurement of tool wear as described in Sect. 3.2, were used as input to the CNN model. The results of the validation phase are given in Table 5 and plotted in Fig. 9; the results shown depict the maximum value from each class.

Table 5 Performance result of CNN for fresh images
Fig. 9

Classification of fresh images in respective classes
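Classifying a single fresh image with the trained model takes only a couple of lines; in this sketch, ‘fresh.png’ is a hypothetical unseen component image:

    % Classify one previously unseen component image ('fresh.png' is hypothetical).
    img = imresize(imread('fresh.png'), [224 224]);   % match the ResNet-50 input size
    [wearClass, scores] = classify(trainedNet, img);  % predicted class and probabilities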

Once the CNN model has been created and tested on images, it can be applied practically in real manufacturing. The developed CNN-based model can be implemented to monitor the status of a cutting tool; the flow chart in Fig. 10 represents the steps to be followed in its implementation.

Fig. 10

Flow chart representing implementation of TCM using CNN

5 Conclusions

The most tedious job in any machine learning system is to select a suitable filter to remove redundant data, apply a feature extraction technique suited to the selected data, and choose a classification system that provides high accuracy. Many solutions can perform this process, but not all of them fit well, which shows up as poor classification accuracy. Moreover, selecting and classifying features manually is less efficient than an automatic method such as CNN. The CNN model used in this work provides a better solution for classifying the tool into different wear classes. A CNN learns by itself and corrects its errors iteratively, which is why it achieves high accuracy. The CNN has the ability to extract features, select the required features from those extracted, and classify the data into the required number of classes.

The confusion matrix gives an accuracy of 87.26%, which shows that CNN can be used effectively for image classification. The tool wear images and values obtained by the microscope show a gradual increase in wear, and the tool also loses weight during each set of machining. Thus a complete tool monitoring system that predicts the remaining useful life of a cutting tool is possible with the CNN technique of image processing.

The method explained in this paper is an offline technique, since a microscope was used for capturing images; if a camera were mounted on the CNC machine to capture images, the proposed method could be implemented as an online technique for monitoring tool wear. Further, the accuracy of the CNN model can be improved if each class contains more than 2000 images. Additionally, the computational time required to train the model can be reduced drastically, from around 280 min to a few minutes, if a graphics processing unit (GPU) is used for computation.
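In MATLAB this is a one-line change to the training options, given a supported GPU and the Parallel Computing Toolbox:

    % Train on a GPU instead of the CPU (supported GPU and Parallel Computing
    % Toolbox required).
    options = trainingOptions('sgdm', 'MaxEpochs', 6, ...
                              'ExecutionEnvironment', 'gpu');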

The results obtained by the above methodology can be implemented in real machining as follows. For a particular set of cutting conditions, the algorithm stated in Fig. 10 may be implemented. To make the process online, a camera could be fitted on the CNC machine to take photos in real time, along with a computer for processing. After a preset interval of time, the mounted camera would capture an image of the component and save it on the computer; the computer would then process the image using the CNN, and the trained model would give the tool status. This can happen in real time, and tool life can be monitored continuously.
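A sketch of such an online monitoring loop is given below; it assumes the MATLAB Support Package for USB Webcams, the trainedNet model from the earlier sketches, and an illustrative 60 s sampling interval:

    % Illustrative online monitoring loop (webcam support package assumed).
    cam = webcam;                                  % camera mounted on the CNC machine
    while true
        img = imresize(snapshot(cam), [224 224]);  % capture and resize component image
        wearClass = classify(trainedNet, img);     % A, B or C
        fprintf('Tool wear class: %s\n', char(wearClass));
        if wearClass == 'C'
            disp('Tool approaching wear limit: plan replacement.');
        end
        pause(60);                                 % illustrative sampling interval
    end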