1 Introduction

Industrial artificial intelligence (AI) and machine learning can impact nearly every aspect of a manufacturing business, from design to production to maintenance. Across the spectrum of product manufacturing, machine learning models have been applied to real-time process monitoring [1,2,3,4,5]. Real-time validation of process performance is critical to upgrading a traditional manufacturing unit into one that not only meets the manufacturing requirements of products but also improves smart factory efficiency in a self-organized way.

Despite laudable advances in the AI programs of many manufacturing industries [6, 7], implementation of such technologies has been limited within aircraft manufacturing due to the far lower product volumes involved, high complexity, and the frequent need for very high levels of precision [8]. The same slow adoption of automation and robotics technologies is apparent in aerospace composite manufacturing [9]. Molded fiber reinforced plastic (FRP) composite constituents need to be bolted together; therefore, the generation of fastening holes is one of the most frequent processes in FRP manufacturing [10,11,12,13]. Drilling of FRPs is a complex process, which is far different from that of monolithic aerospace metals such as aluminum or stainless steel. Chip formation of FRPs during conventional drilling depends on a greater number of variables, such as fiber type, matrix type, fiber orientation at the point of contact, composite part thickness, matrix hardness and heat sensitivity, cutting tool geometry, and lubrication conditions [14,15,16,17]. A number of studies have been conducted to correlate FRP machining input parameters, such as cutting speed, feed, tool geometry, and lubrication condition, with the output quality of machined composite holes, such as surface roughness, hole size, and delamination [13, 18,19,20,21]. Efforts have also been made to model drilling forces and their relations to the hole quality outputs; however, many of these studies did not use in-process data, which are required for real-time monitoring and validation.

Applying machine learning techniques to manufacturing processes has been studied more extensively for metallic materials than for FRPs. A self-organizing map, or self-organizing feature map, is a type of artificial neural network trained by unsupervised competitive learning. These maps use multidimensional scaling to provide a low dimensional representation of high dimensional data [22, 23]. Random forest algorithms use training data to evaluate features and predict accordingly [4]. Bayesian networks have been used by Bustillo and Correa [24] to predict surface roughness in drilled steel components. Random forest algorithms have been used to monitor process variables and predict manufacturing defects in welded tubes [25]. These algorithms were also used to predict the quality of drilled and reamed bores in cast iron [26]. Extreme learning machines are feed-forward neural networks that pass inputs through hidden nodes to produce outputs; once trained, they can be used to predict future results. Mustafa [27] used this method to predict the quality of laser-drilled holes in Ti-6Al-4V.

Within FRP composite manufacturing, there has been some exploration of using AI methods to predict salient out-of-process measurements. Fuzzy logic has been used to examine the relationship of spindle speed, feed rate, and drill diameter to surface roughness, with each parameter assigned a severity label of low, medium, or high. Comparing the parameters to the surface roughness, both individually and in combination, revealed that spindle speed and drill diameter have the most significant impact on surface roughness [28]. Kim and Ramulu used autoregressive coefficients to discriminate the cutting signals of autoclaved and induction heat-processed graphite/PIXA-M composites during the drilling process [29].

However, there are currently far fewer research papers applying machine learning, as opposed to more general AI methods, to the manufacturing of FRP composite materials. Caggiano et al. [29] applied artificial neural networks when drilling into FRP materials to evaluate tool wear. The inputs were thrust force and torque that were being monitored through sensor signals. It was found that by applying fractal analysis to machine learning, a more accurate diagnosis of tool life was obtained as opposed to conventional statistical methods [30]. Teti et al. [31] also applied an artificial neural network based machine-learning paradigm to monitor tool wear when drilling CFRP/CFRP laminate stack. They used multiple signal processing techniques to extract diverse features, including time domain, frequency domain, and fractal analysis signal features, from the in-process thrust and torque signals. Through artificial neural network (ANN) training with 60 input–output vectors, the fractal analysis and the time domain feature pattern vectors provided the best ANN performance in the classification of tool wear level.

In this study, well-established machine learning algorithms are applied to the drilling process of quasi-isotropic carbon fiber reinforced plastics (CFRP). Our work brings together unsupervised and supervised machine learning techniques and applies them to high fidelity data captured from a computer numerical control (CNC) machine tool, resembling aircraft production. This study aims to: (1) explore how dimensionality reduction and unsupervised learning techniques may allow in-process measurements to map onto critical hole quality measurements usually made in post-production; and (2) explore how supervised learning methods can predict unusually low hole quality from in-process data. The underlying assumption in our method is: artifacts (e.g., local material variation, unusual machine vibration, or tool defect) that give rise to abnormal or low-quality holes will also produce some visible signature in the data collected via in-process measurements. If true, this hypothesis suggests that once we identify “normal” operating signatures for a particular drilling setup, we can identify anomalous holes from the outliers. The primary contribution of this study is to demonstrate that well-established machine learning techniques can make useful predictions about out-of-process measurements in the CFRP manufacturing domain using only in-process data easily obtained during manufacturing. The secondary contributions of this paper are jointly to compare the performance of multiple machine learning methods and demonstrate how prediction performance changes when data is viewed in the time domain or, conversely, in the frequency domain. Thus, this study will contribute to developing applied machine learning systems to monitor, validate, and predict the conventional machining processes of CFRP in the aircraft product/structure manufacturing environment.

The paper is organized as follows. Section 2 describes the experimental setup for the drilling process and how raw data is processed to prepare for machine learning. Section 3 describes the machine learning methods and the workflow we employ, specifically Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and k-Nearest Neighbors (kNN). Section 4 describes the out-of-process hole quality measurements and assessment results. In Sect. 5, we examine unsupervised dimensionality reduction using PCA on the in-process data. Our goal is to determine if data tends to form naturally distinct clusters and whether these clusters correspond to out-of-process variables of interest (e.g., diameter, roundness, or delamination length), addressing the first aim of our work described above. We examine the data in both the time and frequency domains and highlight differences. Critically, with the unsupervised methods in Sect. 5, all knowledge of the out-of-process variables is withheld from the learning algorithms themselves and is only used post-hoc to validate whether relationships uncovered during learning relate to variables of interest. In Sect. 6, we bridge the gap between unsupervised machine learning methods, which simply explore the relationship of individual datapoints to each other, and supervised machine learning methods, which use out-of-process measurements during learning. Here, we employ LDA using measurements of our out-of-process variables. The aim is to determine if this additional knowledge provides qualitatively better clustering than PCA. In Sect. 7, we employ the relationships learned from the data to predict out-of-process measurements. We find that PCA of frequency-domain features, obtained using the fast Fourier transform (FFT), in combination with kNN performs best for prediction. Finally, in Sect. 8, we summarize our conclusions.

2 Experimental Setup and Feature Engineering

2.1 Drilling Setup

The workpiece used for this study is a 3 mm cross-ply CFRP laminate (Mitsubishi, Japan) consisting of T300/3K carbon fibers (Toray, Japan) and epoxy resin with a fiber volume content of 56%.

As shown in Fig. 1a, a 3-axis CNC mill (M643, CSCAM, South Korea) equipped with a hall sensor (LA-25P, LEM, South Korea) for current measurement was used for the drilling experiments. A dynamometer (9272, Kistler, Switzerland) was fixed to the CNC mill and mechanically fastened in a jig with a mid-column to support a CFRP coupon. The CFRP coupon is fastened to the jig at each of the corners. Three 3-axis accelerometers (356A32, PCB Piezotronics, Germany) were adhered to three locations: the dynamometer, the jig, and the CFRP coupon (Fig. 1b).

Fig. 1 Drilling experimental setup

A fresh physical vapor deposition (PVD) diamond-coated tungsten carbide drill bit (CoroDrill 854, Sandvik, US) with a 6 mm diameter was used for each jig setup. During the drilling experiments, three types of data were recorded by the data acquisition system (CDAQ-9174, National Instruments, USA): the current data from the hall sensor in the CNC machine, the z-axis thrust force data from the dynamometer, and the vibration signal data from the 3-axis accelerometers, as shown in Fig. 1f. Drilling was performed dry (no coolant or lubrication) at a constant spindle speed of 6500 RPM and a feed of 0.05 mm/rev. The drilling condition chosen for the study is within the range of CFRP drilling conditions published in past studies [21, 32, 33] and similar to CFRP aircraft final assembly practices [12, 34]. Figure 1e shows the coupon with eighty holes after the drilling experiment. The eighty holes were drilled in the coupon without any interruptions. Table 1 shows a summary of the signals collected, location, and hardware.

Table 1 Summary of the raw signals recorded from the sensors
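The drilling condition stated above fixes a few derived quantities that help in reading the in-process signals. The following is a minimal sketch of that arithmetic, using only the spindle speed, feed, drill diameter, and laminate thickness given in the text; the function and constant names are illustrative. The traverse time is a nominal value for advancing by the laminate thickness and does not account for the drill point geometry, which is not specified here.

```python
import math

# Values stated in the text.
SPINDLE_SPEED_RPM = 6500       # rev/min
FEED_PER_REV_MM = 0.05         # mm/rev
DRILL_DIAMETER_MM = 6.0        # mm
LAMINATE_THICKNESS_MM = 3.0    # mm


def feed_rate_mm_per_min(rpm, feed_per_rev_mm):
    """Linear feed rate of the drill along the z-axis."""
    return rpm * feed_per_rev_mm


def cutting_speed_m_per_min(rpm, diameter_mm):
    """Peripheral cutting speed at the outer corner of the drill."""
    return math.pi * diameter_mm * rpm / 1000.0


def traverse_time_s(thickness_mm, feed_mm_per_min):
    """Nominal time for the drill to advance by the laminate thickness."""
    return thickness_mm / feed_mm_per_min * 60.0


feed = feed_rate_mm_per_min(SPINDLE_SPEED_RPM, FEED_PER_REV_MM)
speed = cutting_speed_m_per_min(SPINDLE_SPEED_RPM, DRILL_DIAMETER_MM)
duration = traverse_time_s(LAMINATE_THICKNESS_MM, feed)

print(f"feed rate: {feed:.0f} mm/min")
print(f"cutting speed: {speed:.1f} m/min")
print(f"nominal traverse time: {duration:.2f} s")
```

Under these values the drill advances through the laminate thickness in well under one second, which indicates how short the per-hole signal window is relative to the continuous recording.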

2.2 Feature Engineering

The drilling experiments produced a series of collated raw digital signals that are directly measurable by the sensors integrated into the drilling setup shown in Fig. 1. Although all the measurement instruments in Fig. 1 were validated via calibrations, the random error or the unpredictable variations in the measured raw signals were not artificially filtered after the experiments. We aimed to exclude signal post-processing in the methods to test the usability of the developed anomaly detection method in the production environment with as little manipulation of the raw signal as possible.

During the experiments, the raw digital signals (the collated in-process data summarized in Table 1) were recorded continuously while drilling the eighty holes; therefore, the in-process data contain the signals between drilling operations as well as during them (see Fig. 2a). We conducted feature engineering to extract features from the raw data to improve the performance of machine learning algorithms. Our process of feature engineering begins by examining in-process data (i.e., current, force, and vibration) to determine when the drill cutting lips are fully engaged in the CFRP coupon (the highlighted region in Fig. 2b). From this region of data, we extracted “high-level features.” In this study, the high-level features include twenty-two values from each hole: average and minimum current; average and maximum force; and average and maximum vibration for each axis (x, y, and z) from three locations. In addition to the high-level features, we obtained the frequency-domain sensor data for each hole by computing a fast Fourier transform (FFT) of the data from each vibration sensor during the full engagement of the drill cutting lips. As a result, we had approximately 1,800 FFT features from each of the nine sensor-axes, or fewer than 16,000 FFT features in total.
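The extraction described above can be sketched for a single signal as follows. This is an illustration under stated assumptions rather than the processing code used in the study: the signal is synthetic, the sampling rate and the indices of the full-engagement window are hypothetical, and a naive discrete Fourier transform stands in for the FFT (it yields the same coefficients, only more slowly) so that the example needs nothing beyond the standard library.

```python
import cmath
import math


def high_level_features(window):
    """Mean and maximum of one signal over the full-engagement window."""
    return {"mean": sum(window) / len(window), "max": max(window)}


def dft_magnitudes(window):
    """One-sided DFT magnitude spectrum (naive O(n^2) transform)."""
    n = len(window)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(window[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc) / n)
    return mags


# Synthetic stand-in for one vibration axis: a 50 Hz tone.
SAMPLE_RATE_HZ = 400  # hypothetical, not the rate used in the study
signal = [math.sin(2 * math.pi * 50 * t / SAMPLE_RATE_HZ) for t in range(400)]

# Hypothetical sample indices where the cutting lips are fully engaged.
start, stop = 100, 300
engaged = signal[start:stop]

features = high_level_features(engaged)   # time-domain features
spectrum = dft_magnitudes(engaged)        # frequency-domain features

# Locate the dominant frequency bin and convert it to Hz.
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE_HZ / len(engaged)
print(features, peak_hz)
```

Each entry of `spectrum` would correspond to one FFT feature for the given sensor-axis, so the number of frequency-domain features per axis follows directly from the length of the engagement window.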

Fig. 2 Feature engineering process to obtain both time-domain and frequency-domain data

2.3 Out-of-Process Hole Quality Measurement

After the drilling experiments, all the CFRP holes were investigated to obtain quality parameters, including hole diameter, roundness, surface roughness parameters (average surface roughness or Ra), and maximum delamination length and uncut fiber area of hole entry and exit. Table 2 introduces the instruments used for CFRP hole quality measurement and the measurement result examples.

Table 2 Summary of hole quality measurement methods, instruments, and examples

3 Machine Learning Methods

3.1 Machine Learning Pipeline

The machine learning pipeline begins with features collected directly during the drilling process. Initially, this means raw data collected from the machining setup and subsequently segmented into data for individual holes for which we calculate high-level aggregate values such as mean and maximum force as well as vibration FFT as described in Sect. 2. At this stage, represented by the left two boxes in Fig. 3 below, we are left with data from eighty individual holes drilled in the 3 mm material. The first two holes exhibit significant deviations across many signals as compared to the remaining seventy-eight holes. This can be due to the settlement of experimental setups, such as the coupon settling in the jig. Because the scale of these deviations is immense compared to differences between the remaining seventy-eight holes, we drop holes 1 and 2 from the dataset at this stage, and all results from subsequent stages are reported using the data from holes 3–80.

Fig. 3 Processing stages for machine learning

The workflow then proceeds toward two experiments. For the first set of experiments (Fig. 3, right-most top box), we seek qualitative validation in which we manually examine: (a) whether the data easily separates into inliers and outliers; and (b) whether inliers/outliers identified with in-process measurements correlate with inliers/outliers with respect to out-of-process variables of interest. These experiments are described in Sects. 5 and 6. The second set of experiments, described in Sect. 7 and illustrated in Fig. 3 (right-most bottom box), aims to use the same basic processing pipeline and the k-Nearest Neighbors (kNN) algorithm to predict the values of out-of-process variables, which showed reasonable qualitative validation in the prior experiments.

Between feature engineering and the experiments, we perform an optional, but typically useful, step of dimensionality reduction. Coming out of the feature engineering stage, we are left with twenty-two high-level aggregate features (means, maximums, etc.) and thousands of FFT features from the three vibration sensors. Given our experimental endpoints described above, working with thousands of features can be computationally intensive and can also give rise to unsatisfactory results [35]. Thus, we employ two versions of dimensionality reduction to bring the number of features down from thousands per drilled hole to ten or fewer features per drilled hole. We examine the use of both Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) in this context.

In the following subsections, we describe the background for each of the machine learning techniques applied in our workflow: PCA and LDA for dimensionality reduction; and kNN for predicting out-of-process measurements using only in-process measurements from newly drilled holes.

3.2 Principal Component Analysis or PCA (Unsupervised Dimensionality Reduction)

Principal component analysis (PCA) seeks to find a reduced dimensionality space in which to represent the original dataset while retaining the most information in the data [36]. This method has been widely used in machine learning and has shown value in transforming high-level features similar to those described in Sect. 2.2 into machine learning inputs to predict tool wear in CFRP drilling [37]. PCA is unsupervised in that it only uses in-process data to perform its function. It has no knowledge of the out-of-process variables (e.g., diameter, roughness, roundness) that we may be interested in predicting. Thus, PCA attempts to retain information in the data by maximizing variance in the reduced dimensionality space.

Generally, PCA is used for dimensionality reduction, and so only the first few principal components are used to represent the space. This means that in most non-trivial settings such as this one, the reduced dimensional space provides an approximation of the original coordinates. In Sect. 5, we use PCA to reduce the dimensionality of the data for visualization and for prediction. Prior to performing dimensionality reduction, in-process training data is normalized to ensure all signals have similar ranges. This helps guarantee that differences between units of measurement do not substantively influence the choice of the lower dimensional space.
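The normalization and projection steps can be illustrated with a minimal sketch. It assumes z-score scaling as the normalization (the text does not specify the scheme), uses hypothetical feature values, and finds only the first principal component by power iteration on the covariance matrix; a full analysis would normally rely on a linear algebra library and retain several components.

```python
import math


def standardize(rows):
    """Scale every column to zero mean and unit variance (z-score)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of data whose columns are already centered."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iterations=500):
    """Leading eigenvector of a covariance matrix, found by power iteration."""
    d = len(cov)
    v = [1.0 / (j + 1) for j in range(d)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical per-hole features: mean force (N), max vibration (g),
# mean current (A). The units differ by orders of magnitude on purpose.
holes = [
    [50.0, 0.10, 2.0],
    [52.0, 0.12, 2.1],
    [54.0, 0.13, 1.9],
    [56.0, 0.16, 2.0],
    [58.0, 0.17, 2.1],
    [60.0, 0.20, 1.9],
]

scaled = standardize(holes)
cov = covariance(scaled)
component = first_component(cov)

# Project every hole onto the first principal component.
scores = [sum(x * c for x, c in zip(row, component)) for row in scaled]

# Fraction of the total variance retained by this single component.
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = sum(s * s for s in scores) / (len(scores) - 1) / total_variance
print(component, explained)
```

Without the scaling step, the force column would dominate the covariance matrix simply because its numerical range is the largest, which is the unit-of-measurement effect the normalization is meant to prevent.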

3.3 Linear Discriminant Analysis or LDA (Supervised Dimensionality Reduction)

Similar to PCA, linear discriminant analysis (LDA) also seeks to reduce the dimensionality of the data. In contrast to PCA, LDA is a supervised method, meaning that it requires knowledge of the out-of-process target variable. Further, LDA expects the target variable to be categorical, as opposed to numeric. LDA then finds the projection that maximizes discrimination between clusters of data from each category. Here, the target class refers to one of three subsets discretized into ‘oversized,’ ‘average,’ and ‘undersized’ categories in each out-of-process measurement. For example, from our measurements, we found the average hole diameter to be 6.0168 mm with a standard deviation of 0.0028 mm across the eighty holes. We separated these holes into three subsets: oversized holes with a diameter greater than or equal to 6.019 mm, average holes with diameters 6.015–6.019 mm (approximately 70%), and undersized holes with a diameter less than or equal to 6.015 mm. Given these three classes, LDA models the distribution of the data given the class as a multivariate Gaussian and finds the two basis vectors that maximize the discrimination between the classes. This is in contrast to PCA, which has no knowledge of the class when dimensionality reduction takes place and so can only seek projections that maximize variance in the data, irrespective of actual class boundaries.
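The discretization step, and the kind of class separation that LDA seeks, can be sketched as follows. The thresholds are those quoted above; the sketch assumes that a diameter exactly on a boundary goes to the outer class, as the "greater than or equal to" and "less than or equal to" wording suggests. The diameters and the two feature columns are hypothetical, and the ratio of between-class to within-class scatter is computed for a single feature only, as a one-dimensional illustration of the criterion that LDA maximizes over projections.

```python
UNDERSIZED_MAX_MM = 6.015   # "less than or equal to 6.015 mm"
OVERSIZED_MIN_MM = 6.019    # "greater than or equal to 6.019 mm"


def diameter_class(diameter_mm):
    """Map a measured diameter to one of the three categorical labels."""
    if diameter_mm >= OVERSIZED_MIN_MM:
        return "oversized"
    if diameter_mm <= UNDERSIZED_MAX_MM:
        return "undersized"
    return "average"


def fisher_ratio(values, labels):
    """Between-class scatter divided by within-class scatter (one feature)."""
    overall = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for cls in set(labels):
        members = [v for v, lab in zip(values, labels) if lab == cls]
        mean = sum(members) / len(members)
        between += len(members) * (mean - overall) ** 2
        within += sum((v - mean) ** 2 for v in members)
    return between / within


# Hypothetical measured diameters (mm) for six holes.
diameters = [6.012, 6.015, 6.016, 6.018, 6.019, 6.021]
labels = [diameter_class(d) for d in diameters]

# Two hypothetical in-process features for the same six holes.
feature_a = [1.0, 1.2, 2.0, 2.1, 3.0, 3.3]   # tracks the classes closely
feature_b = [1.0, 3.0, 2.0, 1.1, 3.1, 2.1]   # largely unrelated to class

print(labels)
print(fisher_ratio(feature_a, labels), fisher_ratio(feature_b, labels))
```

A feature (or projection) with a high ratio places the class means far apart relative to the spread inside each class, which is what makes the classes visually separable in a low-dimensional plot.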

3.4 K-Nearest Neighbors or kNN (Supervised learning)

Sections 3.2 and 3.3 describe two methods for performing dimensionality reduction on the data to help with visualization and reduce the feature space from thousands of FFT features to ten or fewer features. These methods can help us determine, qualitatively, whether the in-process measurements separate the data in a manner that corresponds to an out-of-process measurement we are interested in. For example, in Sects. 5 and 6, we will show figures indicating how our in-process measurements map onto different out-of-process variables. Looking at the figures, we can qualitatively judge whether the in-process measurements separate the out-of-process measurements into cohesive groups or clusters. To take the analysis further, we determine “quantitatively” how the in-process measurements can be used to predict out-of-process measurements. In this paper, we do this using cross validation in conjunction with the k-nearest neighbors algorithm to predict out-of-process measurements and analyze the accuracy of the results.

The k-Nearest Neighbors algorithm aims to predict unseen data by using similar entries in a catalog of known example cases. In this paper, we use data from the k-neighbors in two ways: if the out-of-process measurement is categorical, the k-neighbors vote; if the out-of-process measurement is numeric, the k-neighbors each provide input into a weighted average. Further details and experimental results are described in Sect. 7.
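Both uses of the neighbors can be sketched in a few lines. The points, labels, and diameters below are hypothetical, the distance is assumed to be Euclidean, and inverse-distance weights are assumed for the weighted average; the text does not state which weighting scheme is used.

```python
import math
from collections import Counter


def nearest(train_points, query, k):
    """Indices of the k training points closest to the query (Euclidean)."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    return order[:k]


def knn_classify(train_points, train_labels, query, k=5):
    """Categorical target: the k neighbors vote and the majority wins."""
    votes = Counter(train_labels[i] for i in nearest(train_points, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train_points, train_values, query, k=5):
    """Numeric target: inverse-distance weighted average of the neighbors."""
    idx = nearest(train_points, query, k)
    # Small constant avoids division by zero when the query coincides
    # with a training point.
    weights = [1.0 / (math.dist(train_points[i], query) + 1e-12) for i in idx]
    return sum(w * train_values[i] for w, i in zip(weights, idx)) / sum(weights)


# Hypothetical holes in a two-dimensional reduced feature space.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.2, 0.4),
          (3.0, 3.0), (3.2, 3.1), (3.1, 2.9), (2.9, 3.2)]
labels = ["not-oversized"] * 5 + ["oversized"] * 4
diameters = [6.016, 6.017, 6.016, 6.018, 6.017,
             6.020, 6.021, 6.020, 6.021]

query = (3.0, 3.1)  # a newly drilled hole, projected into the same space
predicted_class = knn_classify(points, labels, query)
predicted_diameter = knn_regress(points, diameters, query)
print(predicted_class, predicted_diameter)
```

With an odd k and two classes the vote cannot tie; for more classes or an even k, a tie-breaking rule would have to be chosen.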

We employ cross validation in our experiments to fully leverage our available data. Cross validation begins by partitioning the dataset into “folds,” where each fold contains data to train the model and validation data to evaluate the model’s predictions. Our experiments use leave-one-out cross validation (LOOCV), which takes all n holes in the dataset (in our experiments n = 78) and creates two groups. The first group, the training set, contains n-1 holes that are used to train the model. The remaining hole forms the testing set (or testing instance). The model trained on the n-1 holes is then used to predict the out-of-process measurement for the one hole that was left out of the training set. This process is repeated n times so that each hole is used for prediction exactly once. Performance results from the n predictions are then reported.
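The leave-one-out procedure can be sketched as a short loop. The data are hypothetical, a one-nearest-neighbor rule on a single feature stands in for the actual model, and accuracy is taken here simply as the fraction of correct predictions, which is an assumption of this sketch.

```python
def leave_one_out(samples, targets, fit_predict):
    """Hold out each sample once; train on the rest and predict the held-out one."""
    predictions = []
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = targets[:i] + targets[i + 1:]
        predictions.append(fit_predict(train_x, train_y, samples[i]))
    return predictions


def nearest_neighbor(train_x, train_y, query):
    """Hypothetical stand-in model: copy the label of the closest sample."""
    best = min(range(len(train_x)), key=lambda j: abs(train_x[j] - query))
    return train_y[best]


# Hypothetical one-dimensional feature values and class labels.
samples = [0.1, 0.2, 0.3, 0.4, 2.1, 2.2, 2.3]
targets = ["not-oversized"] * 4 + ["oversized"] * 3

predictions = leave_one_out(samples, targets, nearest_neighbor)
accuracy = sum(p == t for p, t in zip(predictions, targets)) / len(targets)
print(predictions, accuracy)
```

Any preprocessing that learns from the data, such as normalization or dimensionality reduction, would be fitted inside `fit_predict` on the n-1 training holes only, so that the held-out hole never influences the model that predicts it.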

4 CFRP Hole Quality Assessment Results

Recall that our primary aim is to predict out-of-process measurements. Figure 4 presents the out-of-process hole quality measurement data for the machined CFRP holes. As mentioned in Sect. 3.1, we excluded the first two holes’ features due to their significant deviations from the remaining seventy-eight holes. However, the hole quality parameters of holes 1 and 2 are not pronounced outliers when compared to those of holes 3 to 80. Hole diameter ranges from 6.009 to 6.021 mm with an average of 6.017 mm. Of the holes, 74% had a diameter between 6.015 and 6.019 mm. Only six holes (holes 13, 20, 23, 25, 26, and 35) were identified as relatively oversized holes with a diameter of 6.020 mm or higher. A set of four holes (holes 31, 32, 34, and 36) was relatively undersized with a diameter of less than 6.010 mm. The same set of four holes has relatively large roundness values, which exceed 0.025 mm. Approximately 92% of the eighty holes have average Ra values lower than 0.5 µm. Figure 5 presents the machined CFRP hole surface profiles from holes 4, 29, and 79, representing the group of seventy-four holes with an average Ra of less than 0.5 µm. In this group of holes, there is no sign of noticeable fiber pullouts, which occur when carbon fiber bundles oriented at 135° from the cutting direction are pulled away due to fiber bending, fiber-matrix debonding, and matrix stripping [14]. This indicates that the PVD diamond-coated drill could cut carbon fibers effectively without causing deep fiber pullouts on the machined surface. Of the eighty holes, six (holes 28, 37, 44, 56, 58, and 66) have average Ra values that exceed 0.6 µm but remain below 1.0 µm. Figure 5c shows that the surface profile of hole 44 contains significant fiber pullouts approximately 6 µm deep, as well as protruded fibers on the machined hole surface, which increased the average Ra value. Such noticeable fiber pullouts were observed in the group of six holes 26, 36, 44, 56, 58, and 66, and the occurrence of deep fiber pullouts appears to be random during the eighty-hole drilling process.

Fig. 4 Out-of-process hole quality measurements vs. hole number

Fig. 5 Surface roughness profiles of four holes

Maximum delamination lengths of both hole entry and exit were measured. The distance between two holes was 4 mm; therefore, the largest possible maximum delamination length was 4 mm, reached when the first or last ply of the CFRP laminate at one hole peeled off as far as the next hole. The mean maximum entry delamination length is 913 µm with a standard deviation of 552 µm. Twelve holes (holes 22, 24, 30, 31, 34, 36, 41, 43, 49, 64, 79, and 80) exceed 2 mm in maximum entry delamination length, and they are randomly distributed across the eighty holes drilled. Both the mean and the standard deviation of maximum exit delamination length are lower than those of entry delamination, at 426 µm and 488 µm, respectively. Only five holes (holes 17, 27, 36, 45, and 64) resulted in a maximum exit delamination exceeding 2 mm. Some holes showed incomplete removal of the first and last plies, although no uncut fibers were observed in 44 holes at the entry and 54 holes at the exit. Holes 2, 11, 20, 31, and 42 had more than 0.5 mm² of uncut fibers at the hole entry, while holes 39 and 64 had a hole exit uncut fiber area exceeding 0.5 mm². In conclusion, we could not identify a set of abnormal holes possessing relatively poor quality across all out-of-process measurements. Most holes possessed minimal defects, owing to the favorable machinability obtained with the PVD diamond-coated carbide drill when machining only up to eighty CFRP holes. This also confirms the complex nature of the CFRP drilling process, which is generally viewed as producing defective holes randomly, with a wide range and variation of defects.

5 Anomaly Detection Results via PCA (An Unsupervised Method)

PCA, as described in Sect. 3.2, is an unsupervised method in that no out-of-process data is used for training. That is, PCA has no knowledge that we are interested in differentiating with respect to hole diameter, delamination, or roughness. PCA is simply trying to explain variance in the data overall.

Figure 6 shows PCA applied to the twenty-two high-level features and to the approximately 1,800 FFT features. Figure 6a shows the data projected onto two dimensions (the first two principal components). Here, there are no apparent outliers. Points are generally well distributed across the space plotted, although the region to the lower right may contain four outlying points (approximate coordinates (−0.5, −4.5), (4, −3.5), (5.2, −2), (6.5, −1)), and the upper center may contain one (approximate coordinates (1.5, 6)). However, variance amongst the data points does not help to clearly distinguish any of these points as outliers. Figure 6b, in contrast, shows more distinct clustering. The primary group of points lies roughly in the bottom half of the figure, sweeping down and to the right from approximately (−25, 8) to (30, −12). A secondary group lies in the upper right quadrant of the graph.

Fig. 6 Unsupervised dimensionality reduction results

The next step is to examine how out-of-process measurements map onto data projections created by PCA. Figure 7 shows how PCA applied to the twenty-two high-level features (the data from Fig. 6a) maps onto each of the seven out-of-process variables of interest (diameter, roundness, surface roughness, entry/exit delamination, and entry/exit uncut fiber area). None of these plots contains a simple boundary that distinguishes relatively above-average or below-average outliers. Indeed, orange points (high value measurements) are distributed across the range in most of the plots. Discrimination is further hampered by the extremely high level of consistency during machining: there is very little variance in the out-of-process variables to scrutinize. The sole exception is the plot of diameter (Fig. 7a), where we can see some grouping of orange points toward the center of the graph. While there is no clear boundary to distinguish oversized holes from others, there does seem to be some correlation between the high-level features and hole diameter.

Fig. 7 PCA of twenty-two high-level features and relation to out-of-process measurements

Figure 8 shows how PCA of vibration FFT measurements (data from Fig. 6b) maps onto the same out-of-process variables. We can clearly observe that the best fit occurs with the hole diameter data (Fig. 8a) and that, generally, holes with larger diameter (orange in the figure) occur in the cluster toward the upper right of the plot, while medium and smaller diameter holes tend to occur in the primary cluster (in the lower half of the plot sweeping down and to the right). Other out-of-process variables show little relationship between the out-of-process measured value and the clusters formed by PCA.

Fig. 8 PCA of vibration FFT features and relation to out-of-process measurements

The results from PCA indicate that, amongst the out-of-process variables examined, PCA is most likely to be useful in distinguishing oversized holes from average or undersized holes. Further, we have learned that while the relationship may be visible using high-level features alone, PCA of the vibration FFT signal appears to separate oversized holes more clearly and with a greater margin.

6 Supervised Dimensionality Reduction via LDA (A Supervised Method)

In Sect. 5, we showed qualitatively that by processing vibration FFT data with PCA, the first two principal components reveal two clusters within our dataset: one mapping to holes with small to average diameter, the other mapping to holes with a large diameter. This section examines whether the data supports distinguishing three clusters instead of the two found by PCA. For this task, we employ LDA for dimensionality reduction. Recall from Sect. 3 that LDA is a supervised method, in that it takes as input both the in-process data (here, FFT features) and the out-of-process data we are interested in (here, diameter classified as either “undersized,” “average,” or “oversized”) and then seeks the lower dimensional space that produces the best separation between the classes. Figure 9 shows the FFT data with dimensionality reduced by LDA. Both panels show the same set of points. Figure 9a colors these points by hole diameter. LDA, however, does not use the hole diameter measurements directly and instead relies on the class membership shown in Fig. 9b. Similar to PCA, we can see that “oversized” holes separate out reasonably well (they form a reasonable cluster of orange points). However, blue and gray points (representing “undersized” and “average” holes, respectively) are distributed in a fashion that makes it difficult to see a clear margin between those classes. Thus, analysis with both PCA and LDA suggests that: (a) it is possible to separate large diameter holes from small and average diameter holes using the FFT of vibration data; and (b) within this dataset, these methods do not provide a clear justification for further distinguishing “undersized” holes from “average” holes using the FFT of vibration data.

Fig. 9 LDA of vibration FFT features and relation to hole diameter

7 Predictive Performance for Hole Diameter via kNN (A Supervised Method)

We now aim to quantitatively validate the predictive performance of the clusters obtained via previous dimensionality reduction. Previously we looked qualitatively at the shape of the data clusters and observed that they provided some margin for distinguishing large diameter holes from small or average diameter holes. Here, we employ kNN to predict a hole’s class and its diameter using data from its neighbors. Specifically, we separate data into groups of training and testing data as per leave-one-out cross validation (LOOCV) discussed in Sect. 3.4. For each group of training data, we perform PCA or LDA to reduce the dimensionality of the in-process data. Next, we store the location of each training example along with an out-of-process measurement of its diameter. Then, for each hole in our testing data, we perform the same dimensionality reduction as applied to the training data and then use this to determine its five nearest neighbors from the training set. Finally, we predict the class of each datapoint in the test set. Figure 10 below illustrates results from two folds of the cross-validation.

Fig. 10 Two folds of validation: circles, squares, and diamonds represent training data, test instance, and nearest neighbors to the test instance, respectively; color represents oversized hole diameter (orange) and non-oversized hole diameter (blue). (Color figure online)

In Fig. 10, the data have been separated into a training set (represented by circular and diamond markers) and a testing set (represented by a single square marker in each plot). Each fold thus uses all the data except one hole for training, and the hole that is left out of the training set is tested. PCA is calculated using the training data (circular points). The test point (square marker) is projected onto the axes calculated from PCA and then labeled by searching for the k = 5 nearest neighbors (diamond markers) in the training data and taking the majority class among these points. The class label is denoted by orange vs. blue coloring. The class is provided for training instances and predicted for testing instances. The figure represents two folds, or two iterations, of the leave-one-out cross-validation process. In all, seventy-eight folds are performed such that every point takes one turn as test data. In Fig. 10a, five neighbors (diamond markers) are used to predict the class of the test point (square marker). Two neighbors have the class label ‘not-oversized’ (blue diamonds), and three neighbors have the class label ‘oversized’ (orange diamonds). The class of the test instance is given by the majority class of its neighbors. Thus, the test instance (square marker) is correctly predicted to belong to the class ‘oversized’ (orange). In Fig. 10b, the same process is carried out. Here, all five neighbors belong to the class ‘not-oversized’ (blue diamonds). The test instance is thus also correctly predicted to be ‘not-oversized.’
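The fold-by-fold procedure described above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the code used in this study: the helper names (`pca_fit`, `knn_vote`, `loocv_predict`) and the synthetic data are hypothetical, and PCA is implemented here through a plain SVD of the centered training data.

```python
from collections import Counter

import numpy as np


def pca_fit(X_train, n_components=2):
    """Return the training mean and the leading principal axes."""
    mean = X_train.mean(axis=0)
    # Rows of Vt are the principal axes of the centered training data.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, Vt[:n_components].T


def knn_vote(Z_train, y_train, z_test, k=5):
    """Majority class among the k training points nearest to z_test."""
    dists = np.linalg.norm(Z_train - z_test, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(y_train[nearest].tolist()).most_common(1)[0][0]


def loocv_predict(X, y, n_components=2, k=5):
    """Leave-one-out predictions; PCA is refitted inside every fold."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    preds = np.empty_like(y)
    for i in range(len(X)):
        train = np.arange(len(X)) != i          # every hole except hole i
        mean, W = pca_fit(X[train], n_components)
        Z_train = (X[train] - mean) @ W         # training points in PCA space
        z_test = (X[i] - mean) @ W              # held-out hole, same projection
        preds[i] = knn_vote(Z_train, y[train], z_test, k)
    return preds


# Synthetic stand-in data: 1 = oversized, 0 = not oversized (made-up values).
rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal(0.0, 1.0, size=(60, 10)),
                    rng.normal(2.5, 1.0, size=(18, 10))])
y_demo = np.array([0] * 60 + [1] * 18)
preds_demo = loocv_predict(X_demo, y_demo)
print("fraction correct on synthetic data:", (preds_demo == y_demo).mean())
```

Fitting the projection on the training rows only, and then applying it to the held-out row, keeps the test hole from influencing the axes it is evaluated on.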

Figure 10 shows the results of predicting the class of two data points using PCA on the FFT of vibration data. Table 3 shows the aggregate prediction results on all the data using leave-one-out cross validation. Our main concern in this experiment is to find oversized holes, and prior results from Sects. 5 and 6 have not provided a solid foundation for separating “average” and “undersized” holes. Thus, we focus on discriminating oversized holes from all others. We calculate accuracy in Table 3 as

$$ \text{accuracy} = \frac{\sum_{i} C_{i,i}}{\sum_{i,j} C_{i,j}} $$

where \({C}_{i,j}\) represents the number of elements that were predicted to belong to class i while actually belonging to class j. Thus, we divide the number of correct predictions by the total number of predictions made.
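For concreteness, the accuracy expression can be evaluated from a confusion matrix as in the short Python sketch below. The function name `accuracy_from_confusion` and the counts in the example matrix are invented purely for illustration; they are not results from this study.

```python
def accuracy_from_confusion(C):
    """Accuracy = sum of diagonal entries / sum of all entries.

    C[i][j] counts elements predicted as class i that actually
    belong to class j, following the definition in the text.
    """
    correct = sum(C[i][i] for i in range(len(C)))
    total = sum(sum(row) for row in C)
    return correct / total


# Made-up 2x2 example: rows are predicted class, columns are actual class.
C_example = [
    [3, 1],  # predicted "oversized":     3 correct, 1 wrong
    [0, 6],  # predicted "not oversized": 0 wrong,   6 correct
]
print(accuracy_from_confusion(C_example))
```

Because only the diagonal and the grand total enter the formula, the result is the same whether rows index predictions or actual classes.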

Table 3 Accuracy of each machine learning technique in hole diameter prediction

Table 3 shows the accuracy of five prediction methods. The first three rows show accuracy for methods in which in-process data, either high-level (time domain) features in row 1 or FFT (frequency domain) features in rows 2 and 3, are processed with PCA or LDA and then used with kNN to predict the hole diameter as “oversized” or “not oversized.” These results are obtained as discussed for Fig. 10 and differ only in the lower-dimensional projection of the data points. The fourth row of Table 3 differs slightly: PCA is first run on the FFT features, and the five nearest neighbors are identified just as for Fig. 10 or row 2 (PCA of FFT features). However, instead of simply using the most common class among these five neighbors as the prediction, we use the average of the five neighbors’ hole diameters, weighted by their relative distance from the data point we want to predict. In this fashion, nearer neighbors contribute more to the prediction, and the prediction takes into account the specific diameters of those neighbors. It is notable that all of the methods that use frequency-domain features outperform PCA on the high-level time-domain features. It is also notable that PCA of FFT features with weighted averaging (last row of Table 3) performs best. This method takes the most advantage of the spatial information provided by PCA when making a prediction about an unseen data sample. Although frequency-domain features are more computationally expensive to calculate than high-level features, this work is still easily performed in real time and is well justified by the 5% performance improvement.
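The weighted-averaging variant can be sketched as below. The text does not state the exact weighting scheme, so this example assumes inverse-distance weights, which is one common choice that lets nearer neighbors contribute more; the function name `weighted_knn_diameter`, the small constant `eps`, and all numeric values are hypothetical.

```python
import numpy as np


def weighted_knn_diameter(Z_train, d_train, z_test, k=5, eps=1e-12):
    """Distance-weighted mean diameter of the k nearest training holes.

    Z_train: (n, p) training points in the reduced (e.g. PCA) space.
    d_train: (n,) measured diameters of the training holes.
    z_test:  (p,) reduced-space location of the hole to predict.
    """
    Z_train = np.asarray(Z_train, dtype=float)
    d_train = np.asarray(d_train, dtype=float)
    z_test = np.asarray(z_test, dtype=float)
    dists = np.linalg.norm(Z_train - z_test, axis=1)
    nearest = np.argsort(dists)[:k]
    # Assumed weighting: inverse distance; eps guards against division by zero.
    weights = 1.0 / (dists[nearest] + eps)
    return float(np.sum(weights * d_train[nearest]) / np.sum(weights))


# Made-up reduced-space coordinates and diameters (arbitrary units).
Z_demo = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                   [2.0, 2.0], [3.0, 1.0], [5.0, 5.0]])
d_demo = np.array([6.00, 6.01, 6.02, 6.05, 6.06, 6.10])
d_hat = weighted_knn_diameter(Z_demo, d_demo, z_test=[0.5, 0.5])
print(d_hat)
```

A predicted diameter obtained this way could then be compared against an oversize threshold to yield the “oversized” or “not oversized” label; the threshold itself would come from the class definitions used elsewhere in the study.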

8 Conclusion

This study applied both unsupervised learning and supervised learning algorithms to sort abnormal holes from average holes when drilling CFRP. Using the data captured directly during the machining process, we examined how the machine learning pipeline of feature engineering, dimensionality reduction, and finally prediction with k-Nearest Neighbors could be applied to discriminate oversize holes from undersize or average diameter holes. The following conclusions were drawn from the experimental results presented here.

  1. The PVD diamond-coated carbide drill produced minimal defects when machining eighty CFRP holes. A set of abnormal holes possessing relatively poor quality across all out-of-process measurements did not exist; however, holes at random positions in the drilling order possessed defects of wide range and variation.

  2. Across measurements from seven out-of-process variables (diameter, roundness, surface roughness, entry/exit delamination, and entry/exit uncut fiber area), within our dataset, in-process measurements were most able to find outliers with respect to diameter. However, the low variance of this particular dataset may have hidden some relations between the in-process measurements and other out-of-process variables.

  3. The ability to distinguish oversize holes from undersize or average holes could be qualitatively demonstrated using PCA of high-level features, PCA of vibration FFT, and LDA of vibration FFT. PCA of vibration FFT produced a better margin for separating oversize holes from average or undersize holes than PCA of the twenty-two high-level features engineered from the in-process data. LDA did not provide clear justification for attempting to distinguish between undersize and average diameter holes.

  4. Predictive performance shows that our machine learning pipeline can predict oversized/non-oversized holes with over 85% accuracy in this dataset. Although hole diameter remains a somewhat random process, prediction performance improves when the data are viewed in the frequency domain (FFT) as opposed to the time domain (high-level signals) and when prediction incorporates information about the relative distance of the k nearest neighbors.