Cutting tool condition monitoring using eigenfaces

Effective monitoring of the tool wear condition within a machining process can be very challenging. Depending on the sensors used, often only a part of the relevant wear information can be detected. In the case of milling processes, data acquisition is made even more difficult by the fact that the process working point is inaccessible for sensor applications due to the physical tool, the machining process itself, the chips and the cooling lubricants used. By using a variety of sensors and different measuring principles, sensor data fusion strategies can counteract this problem. One such approach is the eigenface algorithm. This face recognition technique is tested for its suitability for tool condition monitoring in milling processes using multi-sensor process data.


Introduction
In the field of process monitoring, many approaches exist to detect process stability, tool conditions and related variation of the workpiece quality. Their aim is to ensure manufacturing reliability by choosing suitable machine settings and suitable tools, as well as to increase the tool life. A worn tool causes higher friction and a worse cutting performance than a new one. From an economic perspective, it is desirable to detect and predict wear as it develops within a cutting process; the use of a monitoring system therefore becomes necessary. In addition, process monitoring should be able to detect unstable states in the process control and thereby help to counteract this problem. In general, one can distinguish between offline, atline, online and inline monitoring [1]. In the field of metal cutting machine tools, it is more common to distinguish between online monitoring and offline (intermittent) monitoring like grab sampling. Different process monitoring systems, sensor principles and signal processing strategies are used herein. A comprehensive overview of the technologies used in process monitoring until 2010 is provided by Teti et al. [2]. The review considers tool condition monitoring (TCM) systems, process parameters and related sensor principles for measuring, as well as the procedure of data acquisition and processing.
However, the more specific subdomain of digital image processing has so far hardly been considered. Dutta et al. give an overview with a focus on tool wear [3]. The review deals with optical images and processing strategies for the detection of several kinds of tool wear. Some of the reviewed papers face the challenging task of extracting data from machine vision systems by using signal processing algorithms. An intermittent production step becomes necessary for light settings, illumination and the visual monitoring. Moreover, the processing of the images and the identification of wear-correlated effects can take some time, so online monitoring can only be realized with restrictions. An example of semi-automated visual online monitoring is given by Karthik et al. [4]. They use a stereoscopic camera system to measure the wear of turning inserts; the classification and evaluation of the obtained three-dimensional image data is less precise and depends on human care. More advanced approaches to fully automated wear monitoring are given by Kassim et al. [5][6][7][8]. For automated measurement of the wear of milling tools, they use and combine different approaches, such as a multilayer perceptron neural network, the Canny filter for edge detection and the Hough transform [5], run-length statistics, a Mahalanobis distance classifier and the Hough transform [6], as well as Hidden Markov Models [7,8]. Other approaches to tool wear monitoring were applied by Datta et al. [9,10] using higher-order statistics (grey level co-occurrence matrix) in turning, and by Tsai et al. [11], Dhanasekar et al. [12] and Niola et al. [13] using Fourier or wavelet transforms and other techniques in shaping, turning, milling and grinding. All approaches have benefits and obstacles, and all are confronted with the same problem of attached chips, lubricants and coolants, i.e. optically contaminated tools.
Moreover, these techniques are not able to relate process stability to tool wear.
Besides optical instrumentation, MEMS sensors, piezoelectric accelerometers and linear variable displacement transducers (LVDT) are used to measure vibrations. Piezoelectric load cells and different kinds of strain gauges are commonly used to measure process forces.
Dynamometers are favorable sensors for cutting force measurements due to their sensitivity and high reliability. Since the sensor is situated under the cutting zone, even small load changes can be detected. Additional process information can be tracked by pyrometers (temperature) and derived from the chip form. Displacement measurements at the workpiece surface also give further information [2].
In addition, it can be shown that the process forces measured on the workpiece side are directly related to the progress of tool wear. There is also a direct correlation between tool wear and tool deflection amplitudes; the cutting force generally increases with tool wear in milling processes [35]. A combined investigation of the measured variables was carried out by Möhring et al. [27]. They also found that by capturing vibrations at the tool tip, sensitive information about the tool condition and tool chipping during operation can be obtained. However, not all of the previously mentioned methods are suitable for online measurements. A combined consideration of the different sensor applications and different signal processing methods results in a highly complex issue, the so-called curse of dimensionality. Depending on the machine itself and the machining process to be examined, it is often advisable to make a limited sensor selection. The question of how to select process-correlated key features persists beyond sensitivity analysis. Sensor data fusion approaches attempt to address this question.
Sensor data fusion approaches can be classified under various aspects. Due to their multidisciplinary nature, it is difficult to make strict distinctions. Durrant-Whyte distinguished them by the relations between the input data sources into complementary, redundant and cooperative [13]. Dasarathy distinguished between data, feature and decision systems according to the input/output data types and their nature [14]. Other classifications differentiate based on the abstraction level of the employed data, on the data fusion levels defined by the Joint Directors of Laboratories / Data Fusion Information Group (JDL/DFIG), and on algorithm architecture types [15]. Moreover, under the topic of machine learning as well as in general, a broad range of statistical methods is used. There are many different applications in the field of production engineering. An example is given by Tiwari et al. [16]. They fused texture features of optical images of a workpiece with milling cutting force data, using a Kalman filter to build a model for the prediction of milling tool flank wear. Chen and Jen [17] fed an artificial neural network with several features of the fused dynamometer force signals from a CNC table and acceleration data of a spindle housing to create an online system for tool wear monitoring in a milling process. Further applications were analysed by Duro et al. [18] using a multi-sensor data fusion framework for sensor validation depending on machining parameters, Cai et al. [19] using fusion methods to create a "digital twin" of a milling machine, Simeone et al. [20] using sensor fusion techniques for estimating the residual stress in turning, and many more. A review of applications of data fusion processing strategies, focusing on modern connective and statistical methods and advanced frequency analysis, is given by Diez-Olivan [21].
Methods primarily used for source separation and denoising of sensor channels within process monitoring are principal component analysis (PCA) and independent component analysis (ICA) [22][23][24]. These algorithms find widespread application in the field of face and object recognition under the name "eigenface algorithm" [25,26]. The application of the eigenface algorithm for sensor data fusion has not been investigated so far.

The eigenface algorithm
The eigenface algorithm is a method of multivariate statistics. Herein, eigenfaces are a set of eigenvectors derived from a training set of images and a considered image. It was developed by Turk and Pentland in 1991 for the task of face recognition. The method is real-time capable and allows one to identify and distinguish human faces within a set of images. As an image processing algorithm, it works on other objects as well. For best classification results, the images need to fulfil several general conditions. The state space representation of the considered images makes them distinguishable from other images.

Preliminaries
Digital image processing methods are based on the discretisation of images into pixels. Here, images are fundamentally represented as matrices, in which each matrix element is a number that encodes a color value. Likewise, measured values can also be arranged in a matrix by placing the measurement vectors next to each other row by row or column by column. This is a parallel between visual images and measured values. With the approach of arranging data sets in a matrix and interpreting them as images, the data acquired during a process can be seen as a process image. Taking this reasoning further, the image resolution determines a finite number of rows and columns. In a process image, the number of columns is equal to the number of measuring channels; the number of rows can be seen as the number of measurements per channel. The individual measurement values, i.e. the pixel values, determine their brightness. Since in an 8-bit gray scale image a 0 stands for black and a 255 for white, with all numbers in between representing gray levels, measured values, which can also assume larger values, negative values or floating point numbers, must be normalized to the range 0-255 in order to be represented as a gray scale image. However, if such matrices are to be processed by algorithms, this normalization step is usually not necessary. By arranging measurement data in matrices, the so-called process images, image processing algorithms become accessible for (multi-channel) measurement data. This approach can also be used for single-channel measurements. For a better visual representation, the next step is mandatory: the measurement signal, which is a temporally encoded vector of sampled measurement information, must be segmented into equidistant sets, i.e. into partial measurements of equal length.
The line-by-line rearrangement of these sets into a matrix then also produces a process image, comparable to a photograph. The temporal information determined by the sampling frequency of a measurement is encoded in the image depending on the segmentation. Such a process image can be processed using signal processing strategies, such as the eigenface algorithm. The described method is shown in Fig. 1 for measurement data from an accelerometer acquired during a milling process. The upper part of the figure also shows the changes in the tool cutting edges before and after the machining sequence; these are used to determine the wear development.
Fig. 1 Basic idea of adapting process data for image processing methods
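The segmentation described above can be sketched as follows. This is a minimal illustration (assuming NumPy and a synthetic single-channel signal, not the authors' measurement data): a 1-D signal is cut into equal-length parts, stacked row by row into a matrix, and optionally normalized to the 0-255 gray range for display.

```python
import numpy as np

def build_process_image(signal, n_segments):
    """Segment a 1-D measurement vector into equal-length parts and
    stack them row by row into a matrix (the 'process image')."""
    samples_per_row = len(signal) // n_segments
    trimmed = signal[:n_segments * samples_per_row]
    return trimmed.reshape(n_segments, samples_per_row)

def to_grayscale(image):
    """Normalize arbitrary measurement values to the 0-255 gray range
    (only needed for visual display, not for algorithmic processing)."""
    lo, hi = image.min(), image.max()
    if hi == lo:
        return np.zeros_like(image, dtype=np.uint8)
    return ((image - lo) / (hi - lo) * 255).astype(np.uint8)

# hypothetical single-channel accelerometer signal
signal = np.sin(np.linspace(0, 40 * np.pi, 10000))
img = build_process_image(signal, n_segments=100)
gray = to_grayscale(img)
print(img.shape)               # (100, 100)
print(gray.min(), gray.max())  # 0 255
```

For multi-channel data, one column per measuring channel would be used instead of segmenting a single signal, as described above.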
In pictures of real-world objects, like faces in photographs, many features (brightness, contrast, movement of shadows, changes in gesture and facial expression, distance to the object, object orientation) can differ even in the same local setting within moments. On the one hand, this brings statistical variance and robustness for the eigenface algorithm, especially for the training data set. On the other hand, variations that are meaningful in the sense of the algorithm can lead to misinterpretations. In order to avoid classification problems, images of faces should be taken under almost identical conditions. For measurement data, the requirements are similarly stringent. A position change of an object would correspond to a temporal shift of the measurement data; when considering reproducible CNC-controlled operations, this would be equivalent to a change in the process characteristic, which is exactly what one wants to capture for use in process monitoring. A change in the illumination would be equivalent to an undesired sensor drift. However, under the same process conditions for machining processes, similar process images can always be expected with unimpaired sensors. It can therefore be expected that the gradual progress of tool wear development can be measured. This can also be assumed for complex machining processes, as long as a basis for comparison of process images is given.

Basic concept
The eigenface algorithm will be applied to process data from a milling process. According to Turk and Pentland [25], the algorithm is structured as follows: In the first step, an averaged image Ψ is determined on the basis of M comparison images Γ_i:

$$\Psi = \frac{1}{M}\sum_{i=1}^{M} \Gamma_i \qquad (1)$$
Here, all images are treated as column vectors obtained by stacking the individual image columns. The process images that were initially viewed as descriptive images or matrices were used for illustration purposes only; the eigenface algorithm treats them as vectors. The averaged image is used in the next step to determine the deviation of each individual image from it:

$$\Phi_i = \Gamma_i - \Psi \qquad (2)$$
The individual deviations then serve to form the covariance matrix

$$C = \frac{1}{M}\sum_{n=1}^{M} \Phi_n \Phi_n^{T} = A A^{T}, \qquad (3)$$

where the matrix A is given as

$$A = [\Phi_1\ \Phi_2\ \dots\ \Phi_M]. \qquad (4)$$

The covariance matrix C contains information about the joint probabilities of the statistical variables describing the process. Based on its eigenvalues and eigenvectors, an underlying process Ω_M can be characterized compactly by the M measurements describing it. The direct calculation of the eigenvectors u_i of the covariance matrix C is impracticable; even modern computer systems quickly reach their limits with regard to memory usage for measurements with more than a few measurement points. To determine the eigenvectors u_i, a substitution problem needs to be solved:

$$A^{T} A\, v_i = \mu_i v_i, \qquad (5)$$
where v_i are the eigenvectors of A^T A and μ_i are the scalar eigenvalues. By left multiplication with A and back substitution, the original eigenvectors u_i follow as

$$u_i = A v_i. \qquad (6)$$

The eigenvalues λ_i are calculated by solving the eigenvalue problem

$$\det(C - \lambda E) = 0.$$

The determination of the eigenvectors of the substitution problem works analogously:

$$\det(A^{T} A - \mu E) = 0.$$

Thus, according to Eq. 6, the eigenvectors u_i of the covariance matrix C can also be determined.
This determination method delivers the N (N ≤ M) largest eigenvalues and the associated eigenvectors u_i, depending on the number of measurements used for the averaged image Ψ. This approach is feasible because the largest eigenvalues contain the most significant information of the covariance matrix C.
Each of the M process faces Φ_i out of the training data set of the set Ω_M can be represented as a mean-value-free approximation Φ̂_i by a linear combination of the N best eigenvectors:

$$\hat{\Phi}_i = \sum_{j=1}^{N} \omega_j u_j \qquad (7)$$

The eigenvectors u_j of the covariance matrix are called eigenfaces. Each normalized process face Φ_i used for training can be represented in its basis Ω_i with components ω_{i1}, ω_{i2}, ..., ω_{iN} as

$$\Omega_i = [\omega_{i1}, \omega_{i2}, \dots, \omega_{iN}]^{T} \qquad (8)$$

Moreover, the eigenvectors form the foundation for the projection of a new comparison image Γ* into the state space. For this, a process image is normalized to Φ* according to Eq. 2. The projection into the state space takes place according to

$$\omega_k = u_k^{T}\, \Phi^{*} \qquad (9)$$

and is represented by its basis

$$\Omega^{*} = [\omega_1, \omega_2, \dots, \omega_N]^{T} \qquad (10)$$

as a point in the eigenplane (face space). The eigenvalues μ_i indicate the variance along the i-th principal component [32,33].
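The steps of Eqs. 1-10 can be condensed into a short sketch (assuming NumPy; the toy data and the function names `eigenfaces`/`project` are illustrative, not from the paper): mean image, deviations, the M x M substitution problem A^T A, back substitution u_i = A v_i, and projection into the face space.

```python
import numpy as np

def eigenfaces(images, n_components):
    """Turk-Pentland eigenface computation on flattened process images.
    images: array of shape (M, P), i.e. M process images with P data points."""
    psi = images.mean(axis=0)            # averaged image, Eq. (1)
    Phi = images - psi                   # deviations Phi_i, Eq. (2)
    A = Phi.T                            # columns are the Phi_i, Eq. (4)
    L = A.T @ A                          # M x M substitution problem, Eq. (5)
    mu, v = np.linalg.eigh(L)            # eigenpairs, ascending order
    order = np.argsort(mu)[::-1][:n_components]
    u = A @ v[:, order]                  # back substitution u_i = A v_i, Eq. (6)
    u /= np.linalg.norm(u, axis=0)       # normalize the eigenfaces
    return psi, u

def project(image, psi, u):
    """Project a normalized image into face space:
    omega_k = u_k^T (Gamma - Psi), Eq. (9)."""
    return u.T @ (image - psi)

rng = np.random.default_rng(0)
imgs = rng.normal(size=(3, 400))         # 3 toy process images, 20x20 points each
psi, u = eigenfaces(imgs, n_components=2)
omega = project(imgs[0], psi, u)         # point in the eigenplane, Eq. (10)
print(omega.shape)                       # (2,)
```

Solving the small M x M problem instead of the P x P covariance matrix is exactly the memory-saving substitution described above.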
An application of this method to process data can give a low-dimensional representation of a whole cutting tool life cycle from its sensor data, for example within a milling process. The face space representation of the process images of the comparison database of an unworn tool (here, process images 1 and 2) and the images of the continuous tool wear progress (here, process images 3 and N) are shown in Fig. 2.
Here, O is the number of blocks used for rescaling the data to visual images.

Interpretation and application of the eigenface algorithm
The eigenface algorithm offers a statistical tool which was developed for the recognition of objects and faces. In the first part of the algorithm, the covariance matrix (the variances and covariances), i.e. a measure of the linear relationship between the image contents (pixels or data points), is calculated. In the second part, the covariance matrix is used to determine the largest variances (largest eigenvalues) between the data sets (comparison images and current image) along the principal components using algebraic operations. These values are then projected into the eigenface space. Since the algorithm, in contrast to artificial neural networks or Bayesian networks, does not have to be trained, it does not evaluate any objective abstractions; its significance is always given in relation to the basis of comparison, the reference images. When applied to process data and measurement channels combined into images, this means that a statement is made about the common information in the data. If the measurement data are used as time signals, unlike traditional images, additional time information is created. This is also expressed in the eigenvalues and the associated eigenvectors; in amplitude-frequency spectra, however, this additional information is lost. Because common information recorded via different sensor channels enters the processing, the application of the eigenface algorithm to the measurement data is to be regarded as sensor data fusion. The projection of the process images takes place in an eigenplane, since the eigenface algorithm yields eigenvectors for multiple eigenvalues. Since these are linearly independent, they span a linear subspace of the face space. However, this assumes that any processed image is part of the comparison set. A change in this set can, in spite of overlapping content, result in projections onto different eigenplanes. With similar process images, however, these are hardly inclined towards one another and form a manifold.
This topological manifold can be projected back into a single eigenplane. In relation to an advancing process, the form of the manifold characterizes its non-linear dynamics.
One obtains a deeper insight by considering the eigenvalue problem within the eigenface algorithm and its interpretation. The eigenvalues and eigenvectors of the covariance matrix C cannot be determined directly due to its size, which is given by the number of data points in a process image. Therefore, the substitution problem based on A^T A is solved in order to determine the N largest eigenvalues and the corresponding eigenvectors according to the number of data sets N (N ≤ M). The associated eigenvectors are orthogonal to each other by definition and are therefore linearly independent. An eigenvalue can be interpreted geometrically: μ_i equals the squared length of the unnormalized eigenvector u_i = A v_i. In addition, it can be shown that the eigenvalues of the eigenvalue problem det(C − λE) = 0, for the non-trivial solution, reflect the variance along the principal component v (eigenvector of the covariance matrix C) [23]. The variance is an important signal feature in general. Since the largest eigenvalue λ_1 represents the largest variance (along the first principal component v_1), it is of great importance for the practical data evaluation here. The variance λ_1 changes due to the approach of using two constant process images and one process image that changes with each new measurement. If the three process images are similar to one another (same tool condition), the variance is low. However, if the third process image deviates significantly from the first and second (a change in the tool condition that can be "seen" in the measurement data), the variance is large. It can be assumed that the variance correlates with the wear mark width; furthermore, a correlation between the dynamic development of these features can be assumed, too. The classical use of the eigenface algorithm suggests that the consideration of the respective process images in their basis, i.e.
the projection of process images onto the principal components, can reveal further details about similarities and differences in the underlying processes, comparable to face tracking [32,33]. The eigenface algorithm can thus be used on several levels: on the one hand, to represent process images in the face space and make them comparable within it; on the other hand, the variance along the principal components can be considered and compared analogously for the state space representations of each process image. Furthermore, with these features, the dynamic courses can be compared over multiple measurements (multiple process images) along a series.
However, the linear approach of the eigenface algorithm also has restrictions. Nonlinear relations in process images can only be described vaguely. The consideration of the principal components expresses the nonlinear character only in the sense of the variance; information about the type of nonlinearity cannot be reproduced by the eigenface algorithm. The reference process images of the comparative basis are also affected by this. Thus, the representations in face space only have significance for linear changes. This also applies to the consideration of the variance dynamics, even if their dynamic course can be nonlinear: it only expresses how the process images change against each other in the linear sense. In addition, the eigenface algorithm generates common information that does not reproduce how many sensor channels and measurement points were included in a process image. Sensor-specific effects can only be detected by a separate consideration of the measuring channels. Furthermore, only measurements with the same number of measuring points, taken with the same sensor arrangement and sampling rate, are comparable to each other. An application of the algorithm thus requires a fixed measuring arrangement; a change of sensors or sensor locations can lead to restrictions in comparability. Furthermore, one cannot detect from the signal features when a change or effect has occurred within a measurement (i.e., within a process image) without examining this in more detail. The eigenface algorithm thus only has a specific meaning with respect to the measuring setup and the measuring interval. It is to be understood as a statistical instrument. Nevertheless, there are various options for creating a reference set of process images for the application of the eigenface algorithm. The data recorded over the tool life can only be used once the respective measurement sequence has been completed.
The reason for this is that a process image must have a minimal size for its processing by the eigenface algorithm. Processing in batches thus limits the usability of the algorithm for an online monitoring system. Another option is to use a set of ongoing process windows. Here, however, it can be foreseen that gradually occurring changes do not lead to any noticeable change in variance between the process images. As shown in this work, it proves useful for the tool monitoring task to use both at the same time: measurement data (and the resulting process images) that reflect the unworn or slightly worn condition of the tool, and measurement data (and the resulting process images) that are recorded with the progressive wear of a tool. The use of process images which reflect the new condition of a tool provides a reference basis for comparison with other process images. Assuming that similar process conditions result in similar process images (i.e. measurement data), it is to be expected that the algorithmic processing will result in small variances (i.e. large similarities) between the process images and the calculated basis. Under the further assumption that different process conditions (i.e. unworn versus worn) are expressed in different process images (i.e. measurement data), it is to be expected that the algorithmic processing will result in large variances (i.e. significant differences) between the process images and the calculated basis. Furthermore, the practical application of the eigenface algorithm to process data should show that the variances compared to the basis (tool condition new or as good as new) change gradually as the tool wear condition progresses. For this purpose, two data sets (process images) that show the run-in of the tool under consideration are used to form the reference basis. This basis is retained for the algorithmic processing.
In addition, for each further measurement, with which the degree of tool wear increases, a single additional process image reflecting the current wear state is used to form the averaged process image Ψ.
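Under the assumptions stated above, the monitoring scheme (two fixed reference process images of the fresh tool plus one current process image) reduces to tracking the largest eigenvalue of a 3 x 3 substitution problem. A hypothetical sketch (assuming NumPy; the synthetic signals stand in for real process images):

```python
import numpy as np

def wear_indicator(ref1, ref2, current):
    """Largest eigenvalue of the 3-image covariance: two fixed reference
    process images of the fresh tool plus the current process image.
    A small value means the current image resembles the references;
    it grows as the process image drifts with tool wear."""
    images = np.stack([ref1, ref2, current])
    Phi = images - images.mean(axis=0)    # deviations from the mean image
    L = Phi @ Phi.T                       # 3x3 substitution problem A^T A
    return np.linalg.eigvalsh(L).max()    # largest eigenvalue = largest variance

rng = np.random.default_rng(1)
base = rng.normal(size=1000)                   # stand-in for the process signature
ref1 = base + 0.01 * rng.normal(size=1000)     # reference images (fresh tool)
ref2 = base + 0.01 * rng.normal(size=1000)
fresh = base + 0.01 * rng.normal(size=1000)    # similar condition
worn = base + 1.0 * rng.normal(size=1000)      # drifted condition
print(wear_indicator(ref1, ref2, fresh) < wear_indicator(ref1, ref2, worn))  # True
```

The fixed reference basis is reused for every new measurement, matching the approach described above.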
The overall approach must be considered more broadly. It consists of the merging of the measurement data into process images and the application of the eigenface algorithm. The combination of measurement data of different sensor channels into process images opens access not only to the eigenface algorithm, but to a whole class of image classification algorithms. Furthermore, the use of process images also permits the processing of transformations of individual measurement series (i.e. sensor channels) into images, such as amplitude-frequency spectral images. The eigenface algorithm forms the second part of the approach. Since it essentially performs a principal component analysis (PCA), its application also inherits the strengths and weaknesses of PCA. Among its advantages is that the myriad of signal features that a measurement series possesses is reduced to the dominant information. Correlating this myriad of signal properties with one another brings with it the curse of dimensionality; with PCA, however, only the dominant characteristics are correlated with each other, and the computational effort is thus reduced. The resulting principal components are linearly independent (uncorrelated) characteristics and can therefore be considered separately from each other. Another advantage is that high-dimensional data can be visualized and compared, whereby primarily the dominant properties are reproduced. For the investigation of relationships to further quantities, a simple and descriptive access is given. On the other hand, there are disadvantages. Independent components in the data sets are attenuated or lost by the application of the eigenface algorithm. That is, if an effect can only be detected on one measurement channel, its influence on the dominant principal components is minimal, and important information can be lost in this case.
However, this can also be interpreted in such a way that if this effect is a signal disturbance, it has hardly any negative effect on the central information. Only by standardizing the data can it be ensured that the actually dominant properties are fully represented. Another disadvantage is that by maximizing the variance and enforcing orthogonality of the principal components, properties of the measurement data can be lost. If, for example, a data set is distributed in a crescent shape in R², only a cross of two principal components is projected into the data set, describing it in this way. While the variance maps further information about the data set, it does not provide information about the kind of nonlinearity present in the data. The eigenface algorithm in its basic form thus has a linear nature [32,33]. Compared to network-oriented and adaptive machine learning algorithms, the method is learning-free. Thus, it does not need to be trained and deterministically produces the same output for the same input. Unlike network-oriented approaches, its application is numerically reproducible, and at the same time it is analytically interpretable. The approach of process images can also be applied to independent component analysis (ICA). In contrast to PCA, ICA does not compress data; its goal is to separate information. ICA extracts hidden factors within data by transforming a set of variables into a new set that is maximally independent. Other image processing approaches can also be considered for processing process images. However, these extract signal properties in an increasingly complex manner, or output the signal properties in a less constant form, i.e. as estimated values.
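The crescent-shaped example can be made concrete with a small sketch (assuming NumPy; the data are synthetic): PCA finds two orthogonal directions and their variances, but the curvature of the crescent itself is not reflected in this description.

```python
import numpy as np

rng = np.random.default_rng(2)
theta = rng.uniform(0, np.pi, 500)
# noisy points on a half circle: a crescent-shaped data set in R^2
crescent = np.column_stack([np.cos(theta), np.sin(theta)])
crescent += 0.05 * rng.normal(size=(500, 2))

centered = crescent - crescent.mean(axis=0)
cov = centered.T @ centered / len(centered)
eigvals, eigvecs = np.linalg.eigh(cov)     # 2 principal components

# The two variances describe the spread along an orthogonal cross,
# but say nothing about the curvature of the crescent itself.
explained = eigvals[::-1] / eigvals.sum()  # sorted descending
print(explained)
```

The variance ratios summarize the linear spread only; recovering the nonlinear shape would require a different method, as the text notes.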

Numerical effort
In order to guarantee real-time capability, it is necessary to determine the numerical effort of the algorithm's computations. The time in which the algorithm can generate a result based on the measurement data depends on the processing speed and computing power of the signal processing computer used, but also on the accuracy used for data acquisition.
In order to be able to record even the smallest changes in processes, a measurement resolution is necessary that does not introduce quantization errors beyond the sensor accuracy. The measuring range must be completely covered. At the same time, the data storage costs must be kept low. As a compromise between performance and costs, measuring systems with accuracies of 12 to 16 bit are often used; special analogue-to-digital converters (ADC), however, achieve measurement resolutions of up to 32 bit. For the data processing of a single measuring channel, quantization of the analogue values using 16-bit ADCs results in 2^16 = 65536 gradations per time unit on the measuring scale. The bit width, summed over the number of measuring channels and multiplied by the sampling rate, gives the data stream to be processed by the signal processing algorithm. For example, for 10 sensor channels with 16 bit each and a sampling rate of f = 10 kHz, this results in 1.6 Mbit per second (Mbps).
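The back-of-the-envelope data-rate figure can be checked with a one-liner (plain Python; the parameter names are illustrative):

```python
def data_rate_mbps(channels, bits, sample_rate_hz):
    """Raw data stream produced by the acquisition system, in Mbit/s."""
    return channels * bits * sample_rate_hz / 1e6

# 10 sensor channels, 16-bit ADC, 10 kHz sampling rate
print(data_rate_mbps(channels=10, bits=16, sample_rate_hz=10_000))  # 1.6
```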
This incoming data stream needs to be processed. Since the individual steps of the eigenface algorithm can be parallelized, the processing frequency must at least correspond to the sampling rate; e.g. the mean process face (step 1) is formed while the mean deviations of the previous measurement (step 2) are determined, and so on. Here, the choice of the process window size is of essential importance, since it determines the amount of data that has to be stored temporarily in main memory or on the hard disk. The number of images in the comparison basis is also important for the numerical costs, since the number of required operations increases with its size.
In the case of an idealized, sufficiently large arithmetic unit that processes all algorithm steps in parallel, a single run with a 16-bit comparison basis of 3 process images, a window size of 10 seconds, 10 control channels and a sampling rate of 10 kHz results in 32 million floating point operations per second (Mflops) for step (1) with a time offset of 0.3 ms; for the second step (2) the costs are 16 Mflops and 0.1 ms, for the third step (3) 48 Mflops and 0.4 ms, and so on. In this way, the entire algorithm sums up to significantly less than 0.5 Gflop and less than 10 ms per process image. For real computer systems, however, the numerical costs rise further, since the memory management, i.e. the register accesses and the arithmetic operations, has to be distributed over a few cores. As of 2020, even CPU-integrated extremely-low-voltage graphics units achieve a computing power of over 2 Tflops at 32-bit accuracy, and extremely-low-voltage CPUs themselves over 1 Tflop; they are clocked significantly higher than 10 kHz and are available for general-purpose programming (GPGPU). The resulting data stream can therefore be processed sequentially by them in real time, with a resulting time offset of a few hundred milliseconds. The size of the data stream can vary significantly depending on the number of sensor channels, their bandwidth and the selected sampling rate. The computers used for signal processing and their programming should be adapted to the data rates so that real-time capability is guaranteed; otherwise, the sampling rates must be reduced. Furthermore, the process data can be processed offline or intermittently in processing breaks. The use of data from the frequency amplitude spectra can significantly reduce the computing effort. On the one hand, the number of reliable data points in the frequency domain is halved, as follows from the Nyquist-Shannon sampling theorem.
On the other hand, process-critical information may lie in smaller frequency bands, to which the data records can be restricted for further processing by the eigenface algorithm. Due to its comparatively low numerical costs, the learning-free algorithm is superior to most machine learning methods, especially since it does not require time-consuming and computationally intensive training on a particular process. In addition, it represents a strictly deterministic transformation: the same data entries always lead to the same representations in the face space.
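The algorithm steps referred to above, forming the mean process face, computing the deviations from it, and determining the eigenvectors for the projection into the face space, can be sketched in a few lines of NumPy. This is a minimal illustration of the classical Turk-Pentland eigenface procedure as applied to process images, not the authors' implementation; all names are ours:

```python
import numpy as np

def eigenface_projection(base_images, new_image, n_components=2):
    """Sketch of the eigenface steps: (1) mean process face, (2) deviations
    from the mean, (3) eigen decomposition and projection into face space."""
    # stack the M base process images as columns: pixels x M
    A = np.stack([img.ravel().astype(float) for img in base_images], axis=1)
    psi = A.mean(axis=1, keepdims=True)          # (1) mean process face
    Phi = A - psi                                # (2) mean deviations
    # (3) eigenvectors of the small M x M surrogate matrix instead of the
    #     huge pixels x pixels covariance matrix (Turk & Pentland trick)
    vals, vecs = np.linalg.eigh(Phi.T @ Phi)
    order = np.argsort(vals)[::-1][:n_components]  # largest eigenvalues first
    U = Phi @ vecs[:, order]                       # eigenfaces
    U /= np.linalg.norm(U, axis=0)                 # orthonormal basis
    # project a new process image onto the eigenfaces: a point in face space
    return U.T @ (new_image.ravel().astype(float)[:, None] - psi)

rng = np.random.default_rng(0)
base = [rng.standard_normal((5, 100)) for _ in range(3)]   # 3 images, 5 channels
coords = eigenface_projection(base, rng.standard_normal((5, 100)))
print(coords.shape)  # (2, 1): coordinates along the first two principal components
```

With mean-centred data, a basis of M images spans at most M - 1 informative directions, which is why only two components are extracted from three base images here.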

Experimental setup and practical application
To validate the method for tool condition monitoring, line milling tests were carried out with a copy milling cutter and TiAlN-coated milling inserts in cold work steel (1.2379).
The milling tests were executed on a 5-axis milling machine of the type DMG HSC 55 linear and were carried out under cutting conditions comparable to [27]. To measure the process variables, a three-axis Kistler Type 9257A force measuring platform and two Micro-Epsilon eddyNCDT 3100 eddy current sensors, measuring the tool displacement in the x- and y-directions, were used (Fig. 3). A sampling rate of 30.3 kHz was selected for the data acquisition. In addition to the milling tests, the tool wear progress was recorded intermittently with a Keyence VHX5000 microscope; the development of the wear mark width was recorded over the entire tool life as a reference for the eigenface algorithm. For the application of the eigenface algorithm, the data of the five sensor channels (forces in x, y, z and tool displacements in x, y) for three milling paths each were combined to form a process image. The first three process images generated in this way are used to form the first reference point in the face space by applying the eigenface algorithm; this point reflects the new condition of a tool. The first two process images are then recurringly processed together with the data of the other individual measurements (process images), independently of the reference point, to form further representations (points in the face space). The representation of the measurement data in the face space makes changes in the process data easier to understand; the next section covers this in more detail.
A National Instruments PXIe-8133 industrial PC and the LabView software, version 2017 SP1, were used to collect the sensor data. After acquisition, the data was ported into a more common TDMS format using the National Instruments DIAdem software and processed further using the Mathworks MATLAB R2016 software and the MATLAB Parallel Computing Toolbox to produce the data presented here.
Tool wear depends on the cutting materials, the tool geometry, the workpiece material, the cutting values and the underlying wear mechanisms. For the case examined, the wear mark width is used as the wear criterion: in the investigations carried out here, a tool is considered worn out as soon as the wear mark width exceeds 500 µm. The line milling tests were carried out with largely identical cutting values: the cutting speed v_c was 220 m/min, the feed speed v_f 2380 mm/min, the cutting depth a_p 1.2 mm and the cutting width a_e 0.5 mm. Only the spindle speed varied between the tests: it was n = 5300 rpm for the first tool and n = 5825 rpm for the second tool. The entering angle for both tools was 45°. Copy milling cutters with TiAlN-coated carbide indexable inserts and a diameter of 12 mm were used as tools.
For each of the two milling tests, 3600 milling paths were marked out over a length of 100 mm each. For the algorithmic processing of the recorded data, a process image was created for every three milling paths. For the following figures, this yields 1198 representations in the eigenface space, respectively 1198 representations for the consideration of the variance along the first principal component; no representations exist for the first and second process images, since these are used as the basis for the algorithmic processing. The data of the 5 sensor channels are combined in each process image.
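The bookkeeping behind the 1198 representations can be checked directly from the values given in the text:

```python
# Test series bookkeeping: 3600 milling paths, 3 paths per process image,
# and the first two process images reserved as the comparison basis.
milling_paths = 3600
paths_per_image = 3
process_images = milling_paths // paths_per_image   # 1200 process images
representations = process_images - 2                # basis images yield no point
print(representations)  # 1198
```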

Results and discussion
The development of the wear mark width corresponds to the usual behavior for the dominant formation of flank wear [29], as shown in the following figures. After running in, the wear initially increases slowly over a wide range, after which it increases progressively towards the end of the tool's life. The application of the eigenface algorithm leads to a representation of the measured value sequences within the state space (Fig. 4).
The colour coding of the displayed data corresponds to the chronological sequence of the data sets, from blue through green to red. Blue dots thus correspond to process images created while using a new or nearly new tool. All points lie in an eigenplane, and the blue points are close to the coordinate origin, which corresponds to the new state. Points corresponding to moderate tool life, i.e. the use of a moderately worn tool, are green; these scatter more widely and move further away from the centre. Red dots, which correspond to long tool life and thus progressive wear, also show a large scatter and lie even further from the origin.
The overall trend becomes clearer if the Euclidean distance to the coordinate origin, as used for example by Mitchel [30], is compared with the wear mark width, or if the consideration is limited to the first, most meaningful eigenvalues of each representation. Figure 5 reflects this: it clearly shows that the variance increases with increasing wear. Furthermore, one can see that the variance in the area of the interrupted cut is visibly lower, although outliers with larger values also occur in these areas. Another way to show that the eigenface algorithm can be used for wear monitoring is to normalise each measuring channel to the respective largest measured value contained therein. The arrangement of the representations in the state space can then be seen even more clearly in Fig. 6. In particular, the representations of a short tool life, the blue dots, are now concentrated around the origin, while the moderate and long tool life states are projected in the form of ellipses around them. Again, the more advanced the wear state, the further out the points move; likewise, the longer the tool life, the larger the scatter of the representations. It must also be mentioned that the areas and ellipses overlap, or are slightly displaced from each other. This can be seen as an indication of a possible nonlinear relationship, which the eigenface algorithm in its simplest form cannot adequately map into the face space by its kernel [9]; it is possible to expand the approach using other kernel functions [28]. Figure 7 also clearly shows that the variance along the first principal component, the first eigenvalue, is related to the increase in the wear rate. Compared to the observation of the non-normalised time signals, no major outliers are recognizable any more.
Furthermore, it can be seen that the variance curve in the area of the interrupted cut shows a slight reduction in the calculated values. Compared to the variance profile resulting from the processing of the non-normalised raw data over time, however, the interrupted cut is expressed to a much lesser extent due to the normalisation.
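The per-channel normalisation discussed above can be sketched as follows; this is an illustrative NumPy fragment (names are ours), which normalises each channel to its largest absolute measured value:

```python
import numpy as np

def normalise_channels(image: np.ndarray) -> np.ndarray:
    """Normalise each measuring channel (row) of a process image to the
    largest absolute measured value contained in that channel."""
    peaks = np.abs(image).max(axis=1, keepdims=True)
    peaks[peaks == 0] = 1.0            # guard against all-zero channels
    return image / peaks

img = np.array([[0.5, -2.0, 1.0],      # e.g. a force channel, large amplitudes
                [0.01, 0.02, -0.04]])  # e.g. a displacement channel, small amplitudes
out = normalise_channels(img)
print(out)                             # every row now peaks at magnitude 1.0
```

This removes the amplitude disparity between strong channels (forces) and weak channels (displacements) before the covariance matrix is formed.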
A further improvement is shown by applying the approach to the data transformed into the frequency domain (Fig. 8). By normalising the amplitude frequency responses to the maxima occurring per measuring channel, even the crossing of the clamping screw holes by the milling tool during the line milling process (interrupted cut), near the machining lines 90, 190, 390, 480, 680 and 770, can be detected for moderate wear conditions. In general, the variance and the variance change during machining increase with the wear state. The variance curves thus develop in a similar way as can be expected from the underlying measured variables: process force and tool deflection.
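The frequency-domain pre-processing described above can be sketched with NumPy: each sensor channel of a process window is transformed into its one-sided amplitude spectrum and normalised to the per-channel maximum. As noted earlier, only N//2 + 1 reliable data points per channel remain, per the Nyquist-Shannon theorem. Window size and naming are illustrative assumptions:

```python
import numpy as np

def normalised_amplitude_spectra(window: np.ndarray) -> np.ndarray:
    """window: channels x samples. Returns the per-channel one-sided amplitude
    spectra, each normalised to its own maximum amplitude."""
    spectra = np.abs(np.fft.rfft(window, axis=1))   # one-sided spectrum
    peaks = spectra.max(axis=1, keepdims=True)
    peaks[peaks == 0] = 1.0                         # guard all-zero channels
    return spectra / peaks

rng = np.random.default_rng(1)
window = rng.standard_normal((5, 4096))             # 5 channels, N = 4096 samples
spec = normalised_amplitude_spectra(window)
print(spec.shape)   # (5, 2049): N//2 + 1 frequency bins per channel
```

Restricting the further processing to a process-critical frequency band then amounts to slicing the bin axis of `spec`.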
It should be mentioned that the variance in the area of the interrupted cut increases slightly, in contrast to the processing of the time signals. This means that, on the frequency basis, there is a greater deviation of the measured signals from the basis (process images 1 and 2): the interrupted cut reduces the dependence of these signals on each other. This makes it clear that the cutting and engagement conditions have a non-negligible influence on the assessability of wear development using the eigenface algorithm in the frequency domain.
The variance for the consideration of the normalised process images is smaller than for the non-normalised data (Fig. 5), and likewise smaller in the frequency domain than in the time domain. Furthermore, the channel-normalised and the frequency signals show clearer dependencies on cutting conditions not correlated with wear, such as the interrupted cut. The inclusion of the NC code and machining simulations may provide further information here.
All these findings were confirmed for a second tool under almost identical conditions, with the eigenface algorithm used in an identical way; the first two process images were generated separately for each tool. Figures 9 and 10 illustrate the findings for the second tool by showing the normalised time signals (Fig. 9) and the normalised amplitude frequency response signals (Fig. 10). Furthermore, first investigations into drilling and tapping processes indicate that the application of the eigenface method leads to similar results.
When comparing the tools, however, the clear difference in the course of the wear curves should be emphasized. This is due to the fact that at milling path 841 a clear cutting edge breakout of the first tool occurred due to crater wear. Apart from this event, the wear mechanisms were the same for both tools, dominated by flank wear.
To interpret the results, the structure of the measurement data and its effect on the eigenface algorithm, in particular on the formation of the covariance matrix within it, must be taken into account. When processing the non-normalised measurement data, the data of the individual measurement channels show significant differences in amplitude. As a consequence, the elements of the covariance matrix differ greatly in magnitude: the product of two comparatively weak measurement signals is significantly smaller than the product of two strong measurement signals. Therefore, the variance based on the time signals is also significantly higher. For the channel-normalised data, however, the elements of the covariance matrix approximate one another, and the representations of the data records in the eigenface space deviate from one another accordingly. As mentioned, the channel-normalised data show a good relationship to the machining process; this applies to both the time domain and the frequency domain data. The resulting differences between the time domain data and the frequency domain data must be seen in connection with the information they contain. In the time domain, information about the sequential order of process events plays a central role; in the frequency domain, this information is largely lost. Singular events are averaged over the measurement window and are only partially expressed through their contribution to the amplitude of the frequencies concerned. Changes and disruptions in the process and its temporal sequence, as well as sensor drifts, can therefore significantly influence the results of the algorithmic processing of time domain data.
On closer examination, however, the strength of the correlation between the wear curves and the variance courses depends on both the channel normalisation and the considered domain. This is expressed through the determination of the correlation coefficient. In the case of the cross-correlation between the respective variance curves and the wear curves, the correlation maxima are in all cases at a lag of 0; the algorithm thus ensures the largest possible correspondence between these variables. For the non-normalised time signals, the correlation coefficients of r = 0.779 (first tool) and r = 0.891 (validation tool) are above r = 0.7 and thus, according to [31], indicate a strong relationship between the data. For the normalised time signals, values of r = 0.898 and r = 0.907 are obtained; the correspondences are thus also clear, with only a very small improvement due to the normalisation for the validation tool. For the frequency signals, there are values of r = 0.810 and r = 0.817 for the non-normalised sensor channels and values of r = 0.760 and r = 0.778 for the normalised data; for the application of the algorithm to frequency data, the normalisation thus even leads to a slight deterioration in the correlation. Overall, the time domain data prove to be more suitable for tool wear monitoring.
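The two quantities used above, the Pearson correlation coefficient and the lag of the cross-correlation maximum, can be computed as follows. This is an illustrative sketch with synthetic stand-in curves, not the paper's data; function names are ours:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    x = np.array(x, dtype=float); y = np.array(y, dtype=float)
    x -= x.mean(); y -= y.mean()
    return float((x @ y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def best_lag(x, y):
    """Lag at which the cross-correlation of the mean-free series peaks."""
    x = np.array(x, dtype=float) - np.mean(x)
    y = np.array(y, dtype=float) - np.mean(y)
    c = np.correlate(x, y, mode="full")
    return int(np.argmax(c) - (len(y) - 1))   # index len(y)-1 is lag 0

wear = np.linspace(0.0, 0.5, 200) ** 2        # stand-in wear mark width curve
variance = 3.0 * wear + 0.01                  # linearly related variance course
print(pearson_r(wear, variance))              # ~1.0 for a perfect linear relation
print(best_lag(wear, variance))               # 0: maxima coincide, as in the text
```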
In further investigations, other pre-filterings and transformations than those used here should be applied in order to find better correlations with the wear development, and to eliminate outliers in the data for a qualitatively better correlation statement. For pre-processing, the raw data can be transformed into, for example, amplitude frequency responses, signal entropy, wavelet spectra or power density spectra and assembled into process images before being processed by the eigenface algorithm. Furthermore, it needs to be examined to what extent the window size, i.e. the size of the process images, influences the results. This is particularly important when significant disturbances occur in the time signals.
Without having examined these aspects, a statement about the tool wear in the face space, or based on the variance along the first principal component, is only possible to a limited extent. However, the comparison between the variance and the shown wear profiles suggests that significant tool wear conditions are associated with an increase and larger scatter of the variance profiles. From the increase in the variance curves over the tool life and the increase in their scatter, a wear criterion can possibly be derived that can be evaluated in real time and without reference measurements. In subsequent work, the tool wear profiles of further tools need to be examined for validation.
It must also be taken into account that the cross-correlation, respectively the Pearson correlation, is only a measure of the linear dependence between the variables under consideration. Nonlinear behaviour can lead to misinterpretations of this key figure.
Since the resulting curve progressions can only be roughly approximated linearly, a further comparison criterion should be used in addition to the linear correlation measure, i.e. the Pearson correlation. The distance correlation represents a similar measure, but one that qualitatively takes the nonlinearity of the curve progressions into account. In order to obtain a statement on the correlation between the measurement data and the wear mark width, taking into account the nonlinearities in the variance curves, the data processed with the eigenface algorithm (raw and normalised time signals as well as normalised and non-normalised frequency signals) are considered again. The results of the distance correlation between the measured wear mark widths and the algorithmically calculated variances are shown for tool number 1 in Table 1 and for tool number 2 in Table 2. The results are given both for the direct determination and for variance values smoothed over 50 values; the correlation values based on the smoothed data series are marked with *.
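The sample distance correlation used here can be computed directly from the pairwise distance matrices (Székely's formulation). The following is an illustrative NumPy implementation with synthetic series, not the paper's data; it also shows the measure picking up a purely nonlinear dependence:

```python
import numpy as np

def distance_correlation(x, y):
    """Sample distance correlation between two 1-D series: zero only under
    independence, and sensitive to nonlinear dependence."""
    x = np.asarray(x, dtype=float)[:, None]
    y = np.asarray(y, dtype=float)[:, None]
    a = np.abs(x - x.T)                      # pairwise distance matrices
    b = np.abs(y - y.T)
    # double-centre both distance matrices
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    dcov2 = (A * B).mean()                   # squared distance covariance
    dvar_x, dvar_y = (A * A).mean(), (B * B).mean()
    return float(np.sqrt(dcov2 / np.sqrt(dvar_x * dvar_y)))

t = np.linspace(-1.0, 1.0, 101)
print(distance_correlation(t, t))        # 1.0: identical series
print(distance_correlation(t, t ** 2))   # clearly nonzero for y = x^2,
                                         # where the Pearson correlation is ~0
```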
The comparison of the correlation coefficients recorded in the tables shows the strength of the non-linear relationship. Overall, for the time domain data, the distance correlation coefficient indicates a poorer or at best equivalent relationship compared to the linear dependency. In particular, for the non-normalised time domain data of the first tool, the non-linear relationship is weaker than the linear correlation, with a distance correlation of r = 0.454. In the frequency domain, however, the distance correlation shows consistently better relationships. The additional signal smoothing prevents the frequent fluctuations in the variance curves from entering the correlation excessively; the use of the smoothed signals generally shows very strong relationships for all domains, with correlation coefficients of more than 0.85. In the frequency domain in particular, there is therefore a very strong non-linear relationship for both tools.
The two measures used are common for making initial statements about correlations. When evaluating the correlation figures, however, it should be noted that neither Pearson's correlation nor the distance correlation is robust to outliers; in particular, if outliers cluster in one direction, the results may be biased. Nonetheless, the distance correlation provides the more reliable measure, because, unlike the Pearson correlation, it can qualify nonlinear correlations. Considering the lack of robustness to outliers, the evaluations of the normalised measurement data can be regarded as more reliable than those of the non-normalised measurement data, since the normalised data contain fewer strong outliers. For the same reason, the evaluation of the frequency-based data is more reliable than that of the time-based data. There is no universal evaluation criterion that is robust against outliers; however, other correlation approaches besides signal filtering have been the subject of research [34].
Nevertheless, the consideration of the Pearson correlation and the distance correlation should suffice at this point as a qualifying approximation for the existing correlations.

Further investigations
The results of processing the process data with the eigenface algorithm exhibit some characteristic properties. The averaged image Ψ influences all results.
Since Ψ is initially formed from only two data records (process images) recorded at the beginning of processing, plus one continuous process image, initial influences have a significant effect on the results. Despite the almost identical initial process images, their selection can result in characteristic deviations. To limit this, Ψ should be formed on the largest possible basis of good processes; characteristic effects such as disturbances would thus be weighted less heavily. The noise appearing in the process images of the basis would then also have less impact on the position (deviation) of the representation of the current process in the face space. This must also be compared with the application of the algorithm to signals that are usually denoised by low-pass filtering, in order to be able to recognize possible high-frequency signal characteristics within the measurement noise. The exact effect of the size of the comparison basis and the influence of its properties must be examined in subsequent work.
In addition, there is the option of methodically expanding the eigenface algorithm using kernel functions. It must be examined whether the projections that result from different kernel approaches in the face space show better correlations with wear. More fundamentally, benchmarking based on various wear correlations such as the RMS value and statistical considerations is necessary in order to better understand the approach and, if necessary, to expand it.
Furthermore, it has to be determined to what extent the eigenface algorithm can be applied to complex machining geometries. It must be assumed that application is possible only with restrictions. Foreseeably, the observation horizon, i.e. the size of the time window, must be related to a complete machining process; several good parts must therefore be produced in order to make a comparison data basis available. By subdividing the process into several sequences, i.e. smaller time frames, process changes should, however, be recognized more quickly. An examination of the influence of the window size must also be carried out for simple processing sequences.

Conclusion
It could be shown that the eigenface algorithm can successfully be used to fuse multi-sensor process data of a cutting process and to map the tool wear condition. The high-dimensional data used for this purpose could thus be reduced to a few signal features: the principal components and the variances along them. This also demonstrated that the fusion of measurement data into process images is suitable for processing process data with the eigenface algorithm. The consideration of the normalised time domain sensor data proved more suitable than that of the non-normalised data or of the frequency domain in general. In contrast, the channel-normalised sensor data in the frequency domain reflected changes in the cutting conditions not caused by tool wear more clearly and meaningfully. It is foreseeable that the determination of the largest eigenvalue, the variance along the first principal component, is sufficient for approximate wear referencing. This is particularly evidenced by the strong non-linear dependence between the frequency domain data and the principal component course.
The fact that fusing sensor data and interpreting them as images gives access to a large number of image processing algorithms is to be regarded as significant. Owing to its low computational cost, the eigenface algorithm enables a learning-free, real-time capable tool wear monitoring system, which under comparable process conditions only requires a temporal start window to form a database of comparable work steps; subsequently, process data can be referenced against this basis sequentially or continuously. As idealized NC paths were created for these investigations, while the machining of complex components involves more diversity with regard to direction and load changes as well as infeed, further investigations are required. It must be assumed that complex machining operations have no or only a few comparable machining sequences within a machining process; therefore, the use of the method under consideration may only make sense for repetitive manufacturing. It is to be expected that an extension by kernel methods and the use of the independent component analysis (ICA) method will be able to map nonlinear relationships appropriately. Further investigations should also include the consideration of process stability in order to correlate statements about critical process conditions with the method.
Funding Open Access funding enabled and organized by Projekt DEAL.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.