1 Introduction

Artificial intelligence (AI) technologies such as machine learning and deep learning have enabled a rapid transition to a new industrial era called Industry 4.0. The idea of applying these technologies to make industries self-aware and self-reliant is called Smart Manufacturing, a broad term that describes the convergence of Information Technology and Operation Technology. Terms such as Industry 4.0, Industrial Internet of Things, and Factories of the Future are also used to describe Smart Manufacturing. With recent advances in sensors and AI technologies, Smart Manufacturing has entered a phase of rapid growth and development.

Computer numerical control (CNC) machine tools are widely prevalent in many manufacturing industries because of their ability to automate the control of machine tools using a computer. Because these tools are so widespread, monitoring their energy consumption has become a topic of interest. Several studies have been carried out to track and monitor the energy consumption of these machine tools and to develop energy saving strategies [1,2,3,4,5,6]. Energy consumption monitoring can also be used as an approach to detect equipment anomalies, or at the very least to complement anomaly detection on machines. Studies have been carried out to monitor the equipment’s condition using the signals from sensors that were retrofitted to the machines [7,8,9]. These studies, even though they can predict the equipment’s condition with remarkable performance, cannot determine the working status of the equipment. The working status of a piece of equipment includes, but is not limited to, the operating status of its auxiliary components and its operating parameters such as feed rate and spindle speed. A large portion of the working status information is embedded in the equipment’s energy consumption. By identifying the related patterns in the energy consumption data, we would be able to identify the equipment states, predict energy consumption, and augment the signals used in anomaly detection.

There are two approaches to obtaining real-time information from CNC machines, and the choice depends on the versatility of the machine. The first approach is to interface directly with the machine through its controller. He et al. [10] used an OPC UA (Open Platform Communications Unified Architecture) client to interact with the machine and obtain data on feed rate, spindle speed, instantaneous axis loads, and depth of cut to build a deep learning model to predict the energy consumption. The model achieved good accuracy, but the main drawback of this approach is the limited universal applicability of the models. The developed models require the data to be presented to them in exactly the same manner as during development, meaning that access to the equipment’s controller is required to procure the data. This limits the model’s application. The second approach is to retrofit the machine with sensors that can procure data at the required sampling rate and transfer it to IWSNs (Industrial Wireless Sensor Networks) for remote monitoring. This approach is not specific to a machine type and can be generalized. Furthermore, due to the flexible nature of retrofitting, the developed AI models can be made robust and modular, enabling universal applicability.

Retrofitting a legacy machine is not a novel concept; various studies have explored its prospects. Eoi et al. applied k-nearest neighbor (K-NN) algorithms to the energy data to identify the different operating statuses of the machine, such as idle mode, run mode, and production mode [11]. Han et al. retrofitted the machine with additional sensors (an accelerometer, a dynamometer, and servo encoders) to predict the cutting power [12]. Both studies achieved reliable results, but the main drawbacks were lower prediction accuracy in the former and increased system complexity in the latter. In addition, the models developed in both studies lack the ability to be generalized. Johannes et al. proposed a sensor-reduced machine learning approach in which the number of sensors used to predict the equipment’s energy consumption is reduced, and the model developed provided good results [13]. Predicting equipment states with fewer sensors is preferred, as it brings down the cost of AI implementation. In the work presented in this paper, we aim to develop models that achieve reasonable prediction accuracy with a limited number of sensors retrofitted to the machine.

The energy input to the machine contains a lot of valuable information whose underlying patterns can be explored to obtain information on axis feed rate, spindle rotation speed, and cutting force. Manual feature extraction is difficult and cannot fully exploit all the patterns associated with the signal. Hence, deep learning techniques are adopted to automatically extract the features of interest. This paper proposes an intelligent energy consumption and equipment state monitoring system using a one-dimensional convolutional neural network (1D-CNN). The model uses the equipment’s energy consumption data, obtained using retrofitted sensors and external power analyzers, to extract the patterns corresponding to different equipment operating states. These patterns are then used to identify the different operating states of the machine and will later be used to detect anomalies in the equipment’s operation. The paper has the following structure: In Sect. 2, the energy characteristic of the machine tool is introduced. In Sect. 3, the methodology used to identify the different working states of the machine is presented. This is followed by the development of the 1D-CNN model and hyper-parameter optimization in Sect. 4. In Sect. 5, a case study of the 1D-CNN model is presented along with a comparison of the model performance with different machine learning algorithms. Finally, in Sect. 6, the conclusion and future work are presented.

2 Energy Characteristic of Machine Tool

In a machine tool, the fundamental elements of energy consumption comprise the motors that control the axes, the main spindle, the controller, and the accessories such as the cutting fluid pump and the mist collector. The power consumed by the machine can be broadly classified into three segments based on their dependence on the cutting process. The first is the power consumed by the accessories and the machine’s controller. The second is the power consumed by the drives controlling the axes, spindle, etc., which generally depends on the operation process parameters. The third is the cutting energy consumption, which is mainly affected by the cutting parameters and the material type. The three segments of energy consumption on a typical CNC machine are shown in Fig. 1.

Fig. 1 CNC sub-components and their associated energy consumption

The different components within a CNC machine, in terms of their power consumption, can be broadly classified as shown in Fig. 2. In the case of ultra-precision machines, the power consumption depends on the position of each axis with respect to the datum. Since these machines are highly sensitive to minute variations, the position of each axis has an impact on the power consumption.

Fig. 2 Energy flow of a typical CNC machine tool

Both active and reactive power are considered in our analysis. Active power, also called real power, is the net power transferred in one direction. The power consumed while cutting the material is a part of the active power. The energy input to the machine can take different forms, such as pneumatic, electrical, mechanical, and hydraulic. Each component that does the conversion can be thought of as a resistance load in series with the energy input. By determining the instantaneous energy input and the instantaneous energy output, the energy consumed by the equipment can be computed. Since the equipment can take different states, it is imperative to identify the states to determine the power consumed by the machine.
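As an illustration of the power quantities mentioned above, the following minimal sketch computes active, reactive, and apparent power for a single phase from sampled voltage and current. It assumes sinusoidal waveforms and a record covering a whole number of cycles; the function name `power_components` and the synthetic 50 Hz signal are assumptions made for this example and do not describe the measurement chain of the power analyzers used in this work.

```python
import numpy as np

def power_components(v, i):
    """Return (active P [W], reactive Q [var], apparent S [VA]) for one phase.

    Assumes v and i are sampled over a whole number of cycles. Q is returned
    as a magnitude, derived from S^2 = P^2 + Q^2 (sinusoidal assumption).
    """
    v = np.asarray(v, dtype=float)
    i = np.asarray(i, dtype=float)
    p = np.mean(v * i)                                        # mean instantaneous power
    s = np.sqrt(np.mean(v ** 2)) * np.sqrt(np.mean(i ** 2))   # Vrms * Irms
    q = np.sqrt(max(s ** 2 - p ** 2, 0.0))
    return p, q, s

# Synthetic example: 10 full cycles of a 50 Hz supply, current lagging by 60 degrees.
fs, f = 10_000, 50
t = np.arange(2000) / fs
v = 325.0 * np.sin(2 * np.pi * f * t)
i = 10.0 * np.sin(2 * np.pi * f * t - np.pi / 3)
P, Q, S = power_components(v, i)
```

For this synthetic signal the apparent power is 325 × 10 / 2 = 1625 VA and the active power is half of that, since cos 60° = 0.5.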

3 Methodology

The first step in the AI model development process involves establishing the operational baseline of the machine under study. To achieve that, the power consumption profile was determined for different equipment operating states. These states include the operation of each of the five axes of the machine (X, Y, Z, B, and C) and the operation of equipment accessories such as the cutting fluid pump and the mist collector. Once the baseline was established, combinations of equipment operations were explored to augment the model development process. The power consumed by the accessories was assumed to be constant for our study, even though changes in parameters such as temperature can affect it, whereas the power consumed by the axes was dependent on the equipment’s feed rate. Hence, additional experiments were conducted in which the power consumption of the machine was determined for different feed rates. The increment in feed rate for each experimental run depends on the range of feed rates offered by the CNC machine, and the data generated was used to train the AI model. The model would enable us to understand the relationship between the power consumed and the equipment’s working condition.

As shown in Fig. 3, the input to the AI model is the power consumption data sampled at 10–20 Hz. The model is trained such that it can predict the different working components of the machine as well as the feed rate at which a particular axis was moving. The time series power consumption data corresponding to the axis motion was segmented using a sliding window. The segmented data was then sent to the 1D-CNN model such that each input node in the input layer of the 1D-CNN corresponds to a single instance of the data point.

Fig. 3 Model development process for equipment’s working status identification
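The sliding-window segmentation described above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the function name `sliding_windows` and the `step` parameter are assumptions, since the amount of overlap between consecutive windows is not specified.

```python
import numpy as np

def sliding_windows(signal, window, step):
    """Cut a 1D power series into fixed-length windows.

    Each row of the result is one input instance; each element of a row
    corresponds to one input node of the network's input layer.
    """
    signal = np.asarray(signal, dtype=float)
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if signal.size < window:
        return np.empty((0, window))
    starts = range(0, signal.size - window + 1, step)
    return np.stack([signal[s:s + window] for s in starts])

# Toy series of 10 samples, windows of 4 samples advancing by 2 samples.
power = np.arange(10, dtype=float)
segments = sliding_windows(power, window=4, step=2)
```

With data sampled at 20 Hz, a 5 s window would hold 100 samples; at 10 Hz it would hold 50.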

The 1D-CNN model was trained such that it could predict the working component. The time series data along with the model prediction result was then used to determine the time a particular component was on. The power consumption data along with the time was later used to determine the energy consumed by the working component.
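A minimal sketch of this last step is given below, assuming one predicted label per power sample and a fixed sampling interval. The function name `energy_per_state`, the state labels, and the power values are hypothetical, and the energy is approximated by a simple rectangular sum of power × time. The result is the energy drawn by the machine while each state is active; subtracting a baseline to isolate a single component is not shown.

```python
def energy_per_state(power_w, labels, dt_s):
    """Accumulate energy [J] and on-time [s] for each predicted state."""
    energy_j, on_time_s = {}, {}
    for p, label in zip(power_w, labels):
        energy_j[label] = energy_j.get(label, 0.0) + p * dt_s
        on_time_s[label] = on_time_s.get(label, 0.0) + dt_s
    return energy_j, on_time_s

# Hypothetical data: power in watts sampled at 10 Hz (dt = 0.1 s).
power_w = [100.0, 100.0, 250.0, 250.0, 250.0, 100.0]
labels = ["idle", "idle", "X-axis", "X-axis", "X-axis", "idle"]
energy_j, on_time_s = energy_per_state(power_w, labels, dt_s=0.1)
```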

3.1 Model Development Process

The model development process can be split into three segments: (1) data preprocessing, (2) model creation, and (3) model selection. The process is illustrated in Fig. 4.

Fig. 4 Process flow for the model development

Data preprocessing: The voltage and current signals at each input phase of the machine were sampled at 3 MS/s (mega samples per second). The sampled signal was then passed through an anti-aliasing filter. The power consumption of the machine was then determined, followed by averaging the data over every 50 or 100 ms interval. The data were then passed through a median filter to remove the burst noise caused by the electric grid. The computational requirement of the median filter was low, thereby enabling real-time data filtering. The data were then segmented every 5 s, and each segment constitutes a single input instance to the machine learning/deep learning models. The segmentation time of 5 s was a hyperparameter and was determined considering the speed at which predictions are required from the system. In the case of machine learning model development, the segmented data was further processed to extract time domain features, followed by Principal Component Analysis (PCA) to reduce the dimension of the data and identify the important features that preserve the variance in the dataset. In the case of the deep learning model, the segmented data was concatenated with different power components of the machine before being used in the training process. For all the models developed in this study, the data was categorized into training, validation, and testing sets. The results shown in the upcoming sections were obtained by applying the models to the testing data sets, which were not seen by the models during training.
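The preprocessing chain above (interval averaging, median filtering, time domain feature extraction, and PCA) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the pipeline used in this work: the function names, the median kernel length, and the particular features (mean, standard deviation, RMS, peak-to-peak) are all assumptions, since the exact feature set is not listed here.

```python
import numpy as np

def block_average(x, n):
    """Average consecutive blocks of n samples (e.g. one 50 ms interval each)."""
    x = np.asarray(x, dtype=float)
    usable = (x.size // n) * n          # drop an incomplete trailing block
    return x[:usable].reshape(-1, n).mean(axis=1)

def median_filter(x, kernel=3):
    """Running median with edge padding; suppresses short bursts."""
    if kernel % 2 == 0:
        raise ValueError("kernel must be odd")
    padded = np.pad(np.asarray(x, dtype=float), kernel // 2, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, kernel)
    return np.median(windows, axis=1)

def time_domain_features(segment):
    """Assumed feature set: mean, standard deviation, RMS, peak-to-peak."""
    seg = np.asarray(segment, dtype=float)
    return np.array([seg.mean(), seg.std(),
                     np.sqrt(np.mean(seg ** 2)), seg.max() - seg.min()])

def pca_project(features, n_components):
    """Project mean-centered feature rows onto the leading principal axes."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Constant 200 W signal with one burst sample.
raw = np.full(40, 200.0)
raw[17] = 5000.0
averaged = block_average(raw, 4)        # the burst inflates one averaged value
filtered = median_filter(averaged, 3)   # the median filter removes it
feats = time_domain_features(filtered)

rng = np.random.default_rng(0)
reduced = pca_project(rng.normal(size=(6, 4)), n_components=2)
```

In the toy example, the single burst raises one averaged value to 1400 W, and the three-point median restores it to 200 W.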

Model creation: The prime reason behind using deep learning was to enable the model to extract the features of interest on its own. The segmented 1D signal was sent to the model, which applies convolution along the time dimension. The convolved signal passes through the layers of the 1D-CNN until it is classified among the several working states of the machine at the softmax layer. The softmax layer is typically used in a multi-class model, and the operation behind this layer is the softmax function. The softmax function takes in a vector of N real values from the previous fully connected layer and converts it to N real values between 0 and 1 that sum to 1, so that they can be interpreted as probabilities. The hyper-parameters involved in the model development process, such as layer count, learning rate, batch size, and kernel size, are optimized using an iterative process and a Bayesian Optimization technique. This was done to ensure that the model is not stuck at a local minimum and explores the gradient space better. The structure of the proposed 1D-CNN model is shown in Fig. 5.

Fig. 5 Structure of the proposed 1D-CNN model
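To make the data flow from the convolution to the softmax layer concrete, the following sketch implements a single forward pass in NumPy. It is not the architecture of Fig. 5: the single convolution layer, the global average pooling, the layer sizes, and the random untrained weights are all assumptions chosen to keep the example short.

```python
import numpy as np

def softmax(logits):
    """Map N real values to N values in (0, 1) that sum to 1."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max()                     # shift for numerical stability
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

def conv1d(x, kernels, bias):
    """Valid 1D convolution along time (cross-correlation form).

    x: (T,), kernels: (F, K), bias: (F,)  ->  output: (F, T - K + 1)
    """
    windows = np.lib.stride_tricks.sliding_window_view(x, kernels.shape[1])
    return kernels @ windows.T + bias[:, None]

def forward(segment, params):
    """Conv -> ReLU -> global average pooling -> dense -> softmax."""
    h = np.maximum(conv1d(segment, params["kernels"], params["conv_bias"]), 0.0)
    pooled = h.mean(axis=1)
    logits = params["dense_w"] @ pooled + params["dense_b"]
    return softmax(logits)

rng = np.random.default_rng(0)
n_filters, kernel_size, n_states = 8, 5, 4      # hypothetical sizes
params = {
    "kernels": rng.normal(size=(n_filters, kernel_size)),
    "conv_bias": np.zeros(n_filters),
    "dense_w": rng.normal(size=(n_states, n_filters)),
    "dense_b": np.zeros(n_states),
}
segment = rng.normal(size=100)                  # e.g. 5 s of data at 20 Hz
probs = forward(segment, params)
predicted_state = int(np.argmax(probs))
```

Because the weights are random, the predicted state carries no meaning here; the point is only that the output is a valid probability vector over the working states.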

Model selection: Several different model architectures were developed and tested to identify the model that performs best for our scenario. In addition to the 1D-CNN, the other model that performed well in our case is a 2D-CNN model. The input to the 2D-CNN is the segmented time series plotted as an image, with time on the x-axis and the power value on the y-axis. The performance of the 2D-CNN model was comparable with that of the 1D-CNN, but the 1D-CNN model was computationally more efficient. In addition to the deep learning models, several machine learning models were also developed. For machine learning model development, feature extraction is particularly important, as the features have a large impact on the model’s prediction accuracy. Hence, domain expertise was used to identify the important features, which were extracted from the active power, reactive power, apparent power, and three-phase input current. A feature importance study was conducted using feature permutation, a method in which individual input features to the model are randomly shuffled to determine their impact on the model’s performance. This study showed that the features with the most impact were the active power and the three-phase input current. The ML models that were developed for this study are listed in Table 5. As the final stage of the model development process, the best performing model in terms of prediction accuracy and loss on the testing data was selected.
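The feature permutation study described above can be sketched as follows. The nearest-centroid classifier and the synthetic two-feature data set are assumptions made so that the example is self-contained; they stand in for the machine learning models and power features used in this work.

```python
import numpy as np

def fit_centroids(X, y):
    """Toy classifier: store one mean feature vector per class."""
    classes = np.unique(y)
    return classes, np.stack([X[y == c].mean(axis=0) for c in classes])

def predict(model, X):
    classes, centroids = model
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dist.argmin(axis=1)]

def permutation_importance(model, X, y, rng, repeats=10):
    """Mean drop in accuracy when one feature column is shuffled."""
    baseline = np.mean(predict(model, X) == y)
    drops = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops[j] += baseline - np.mean(predict(model, X_perm) == y)
    return drops / repeats

# Synthetic data: feature 0 separates the classes, feature 1 is pure noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
informative = y * 5.0 + rng.normal(0.0, 0.5, size=200)
noise = rng.normal(0.0, 0.5, size=200)
X = np.column_stack([informative, noise])

model = fit_centroids(X, y)
importance = permutation_importance(model, X, y, rng)
```

Shuffling the informative column destroys most of the accuracy, whereas shuffling the noise column leaves it nearly unchanged, which is the signal used to rank features.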

4 Experiment Setups and Design

The 5-axis ultra-precision CNC machine tool whose power consumption data was used is shown in Fig. 6. The five axes (X, Y, Z, B, and C) are driven by servo motors. The linear axes X, Y, and Z can achieve a resolution of 1 nm, whereas the rotation axes B and C have a resolution of 1 micro degree. The specification of the machine is given in Table 1. The experiments involved attaching power analyzers to the power input of the machine. To ensure model robustness, two power analyzers of different models (WT5000 and WT333E from Yokogawa) with different resolutions were used to measure the input power to the machine. The averaging intervals of the power analyzers were set such that the WT5000 outputs data at 20 Hz and the WT333E outputs data at 10 Hz. The resolution of the WT5000 was also higher than that of the WT333E. Detailed information on both power analyzers can be found in Table 2. The input power of the machine was measured using the 3P3W (three-phase, three-wire) method, and the connection layout is shown in Fig. 7. The data was acquired through serial communication for both units.

Fig. 6 5-axis ultra-precision CNC machine used in the experiment.

Table 1 Specification of FANUC ROBONANO α-0iB
Table 2 Specification of power analyzers
Fig. 7 3P3W interface with the CNC machine.

In ultra-precision machining, cutting power accounts for only a small part of the total power consumed. This is primarily because the material removal rate (MRR) is small in ultra-precision machining. The working components of the 5-axis FANUC ROBONANO α-0iB comprise the controller, the five axes (X, Y, Z, B, and C), the mist collector, and the pump. These components were chosen because they are crucial to the machine’s operation and are primarily used in a cutting operation by the machine. Based on the preliminary work conducted, the above-mentioned components consume a major portion of the energy involved in the equipment’s operation. In this work, the power consumed by the controller, the cutting fluid pump, and the mist collector was assumed to be constant over a cutting cycle, whereas the power consumed by the five axes was variable and dependent on the travel feed rate. Two different sets of experiments were conducted to establish the power consumption baseline for the equipment. The first set of experiments was conducted to determine the power consumed by the working components of the machine without considering their dependency on the feed rate, whereas the second set of experiments considers the impact of feed rate. Information on the two sets of experiments can be found in Tables 3 and 4. The data collection process for the experiments was automated. An application was developed as a part of this work that interprets the machine g-codes and the serial data from the power analyzers simultaneously, thereby synchronizing them. This enabled us to associate the different segments of the power consumption data, sampled by the power analyzers, with the machine states. The synchronized data was then categorized into training, validation, and testing sets for the model training process.

Table 3 Identified working components
Table 4 Experiments conditions

5 Results and Discussion

Since the experiments were conducted in a controlled environment and the noise was minimized, the data set was split into 70%, 20%, and 10% for training, validation, and testing, respectively. The data used for model development was the time series power and current consumption for all three phases of the machine. Depending on the type of algorithm used, either feature extraction or feature learning was implemented for the model development process. The model was trained using the training data, and at the end of every epoch the model’s performance was tested using the validation data. An early-stopping technique was adopted during the training process, meaning that the training was stopped when the model’s performance on the validation data started to degrade. This prevented the model from over-fitting. The model was developed to predict the operation status of the axis and the feed rates from the energy consumption data. Once the training process was completed, the model’s prediction performance was tested on the testing data, and the results are listed in Table 5. As can be seen from the same table, the model’s prediction performance was also compared across two different power analyzers, WT5000 and WT333E from Yokogawa. The comparison was made to determine the impact that different types of data acquisition devices have on the model’s performance.
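The 70/20/10 split and the early-stopping rule described above can be sketched as follows. The function names, the random shuffling of whole segments, and the `patience` parameter are assumptions for illustration; neither the shuffling strategy nor the exact stopping criterion is specified here.

```python
import numpy as np

def split_indices(n, rng, fractions=(0.7, 0.2, 0.1)):
    """Shuffle n segment indices and split them into train/validation/test."""
    order = rng.permutation(n)
    n_train = int(round(fractions[0] * n))
    n_val = int(round(fractions[1] * n))
    return (order[:n_train],
            order[n_train:n_train + n_val],
            order[n_train + n_val:])

def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best_loss, best_epoch = float("inf"), -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

rng = np.random.default_rng(0)
train_idx, val_idx, test_idx = split_indices(100, rng)

# Hypothetical validation losses: the minimum is reached at epoch 2.
val_losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
```

In practice the weights from `best_epoch` would be restored after stopping, so that the later, worse epochs do not affect the selected model.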

Table 5 Model performance

The classification accuracy of the best performing machine learning model was around 56.0%. On the other hand, the classification accuracy of the 1D-CNN was close to 95.7% (see Table 5) and its confusion matrix is shown in Fig. 8. The difference in performance between the machine learning and deep learning models can be attributed to the fundamental nature of these models. In the case of machine learning models, the features used in the model development process were handcrafted time domain features, whereas, in the case of the deep learning models the feature learning is done by the model. Hence the deep learning models are better able to identify and fit the patterns in the dataset. In addition to the inherent advantage in the process of feature learning compared to feature extraction, the 1D-CNN model, due to its architectural design, tends to process the time series signal more effectively compared to other deep learning models. This can be attributed to the fact that the temporal resolution is being maintained due to the 1D convolution operation carried out in a 1D-CNN. Hence, the time dependence of the input data is better preserved as it passes through different layers of the model. Since the input data in our case has a strong dependence on time, we believe 1D-CNN would be the better architecture for this problem domain.

Fig. 8 Confusion matrix for working components identification

The results presented so far concern the identification of the working status of the machine, which makes it possible to determine the components that are operational at any point in time from the input energy data of the machine. The analysis that follows validates the ability of the models to identify the different feed rates for each axis from the energy data. The loss and accuracy of the developed model on the training and validation data are shown in Figs. 9 and 10, respectively. The training loss tends to decrease with the epochs, whereas the validation loss remains approximately constant after 300 epochs. The same can be said about the validation accuracy, which remains constant beyond 300 epochs. This indicates that training the model for more than 300 epochs will lead to overfitting and eventually a decline in performance. The loss and accuracy plots were used to obtain insights into the model’s training process, thereby helping us tune the hyper-parameters for model development. The confusion matrix in Fig. 11 shows the classification results for each class listed in Table 4. Since the feed rate identification study was a precursor to the development of energy prediction models, it was treated as a classification problem at this stage. The entire range of feed rates for each axis of the CNC machine was divided into 7 bins. Data were collected for each of the bins and used in the development of the classification model, for a total of 34 classes. In Fig. 11, the horizontal axis is the label predicted by the trained 1D-CNN model and the vertical axis is the true label of the input data instance. The diagonal of the confusion matrix gives the prediction accuracy for each class. From the confusion matrix in Fig. 11, it can be inferred that the model can extract and identify the patterns corresponding to different feed rates along with identifying the axis in operation. This provides further evidence that the energy consumption data contains valuable information that could be exploited to better understand the machine condition at any time during its operation. The 1D-CNN model was trained for a total of 600 epochs, and the hyper-parameters used for the model development were optimized using a Bayesian Optimization technique.
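The binning of feed rates and the reading of per-class accuracy from the diagonal of a confusion matrix can be sketched as follows. The feed-rate range of 0 to 700 and the equal-width bins are hypothetical values chosen for the example; the actual ranges are those of Table 4. The small three-class label set is likewise only illustrative.

```python
import numpy as np

def feed_rate_bin(feed_rate, f_min, f_max, n_bins=7):
    """Map a feed rate to one of n_bins equal-width bins (0 .. n_bins - 1)."""
    edges = np.linspace(f_min, f_max, n_bins + 1)
    # Only interior edges are passed, so out-of-range values fall in the end bins.
    return int(np.digitize(feed_rate, edges[1:-1]))

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true labels, columns are predicted labels."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def per_class_accuracy(cm):
    """Diagonal of the row-normalized confusion matrix."""
    row_sums = cm.sum(axis=1)
    return np.divide(np.diag(cm), row_sums,
                     out=np.zeros(len(cm)), where=row_sums > 0)

bins = [feed_rate_bin(f, 0.0, 700.0) for f in (50.0, 100.0, 650.0, 700.0)]

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
accuracy = per_class_accuracy(cm)
```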

Fig. 9 Training and validation loss of 1D-CNN

Fig. 10 Training and validation accuracy of 1D-CNN

Fig. 11 Confusion matrix for working component’s feed rates classification

The results show that the 1D-CNN model performs better than the other machine learning and deep learning models studied. The patterns in the power consumption data can be used to identify the working status of the machine as well as the feed rate of each axis.

Even though this study focuses on ultra-precision machining, the methods developed could potentially be applied to conventional CNC machining. The requirement of this method is the ability to procure and interpret the g-codes to synchronize machine states with the input energy data. It is particularly valuable for ultra-precision machining because of its ability to identify and characterize imperceptible patterns within the energy data that can easily be overlooked.

6 Conclusions

The work presented in this paper primarily focuses on identifying the working status of the machine. In this study we developed a novel 1D-CNN architecture that can identify and classify the working components of the machine with an accuracy of 95.7%. The conclusions drawn from this study are given below:

  1. In the case of energy monitoring of ultra-precision machines, machine learning and deep learning technologies are beneficial for identifying and extracting the patterns of interest that reveal the equipment states.

  2. In the processing of time series data, the 1D-CNN performed better than the traditional machine learning approach. The 1D-CNN achieved a classification accuracy of 95.7%, whereas the best performing machine learning model achieved a classification accuracy of 56%. The 1D-CNN model also performed better than the other deep learning model architectures. The difference in performance can be attributed to the difference between the feature extraction and feature learning processes and to the method of data processing involved in a 1D-CNN.

  3. The 1D-CNN model developed as a part of this study was not only able to identify the different working components of the machine but was also able to identify and classify the variations in power consumption associated with the equipment’s axis feed rates.

In terms of detecting machine states, the current study was limited to identifying a single state at any point in time. In general, a CNC machine can exist in a combination of states, and segregating and identifying the different machine states from the energy data is challenging. One potential solution is to generate data for different combinations of states and use them in the model development process, but this is not a practical solution. Further work needs to be conducted to overcome this limitation.

The work conducted so far has set the stage for real-time equipment condition monitoring. The future work of this project will involve development of regression-based models to predict the feed rate from the energy consumed by the machine. This will be followed by the development of a real-time working status identification platform that can predict the working status of the machine in real-time and can interpret the g-code to predict the energy consumed by the machine. The final stage of this project will be anomaly detection, where an unsupervised model trained on data corresponding to the normal operation state of the machine will be used to detect the equipment operation anomalies.