1 Introduction & literature review

Neither composite materials nor acoustic emission is new in the engineering world. However, acoustic emission used during experimental research on composite materials can produce large datasets, and the challenge is to understand them. The material used in the research presented in this paper is additionally interesting in this respect, as it contains both metal and composite layers, which broadens the variety of damage mechanisms that can occur in the material.

Acoustic emission has been used to describe the behavior of materials since ancient times; the usual example is the plastic deformation of tin, which can be heard with the naked ear. The first systematic observations were made in the twentieth century, when audible emissions were noted for metals such as tin, zinc, and cast iron [1]. The first scientific experiment was conducted in 1933 by F. Kishinoue in Tokyo [2]. These experiences led to the establishment of acoustic emission as a practical damage-assessment method by J. Kaiser in the 1950s [2]. The idea of acoustic emission is connected with the stress wave produced by the energy released when crack surfaces separate. The elastic wave is recorded by sensors attached to the tested object; usually, the sensors utilize the piezoelectric effect (thanks to PZT, lead zirconate titanate) [3]. Stress waves are registered during the life of the object or during the course of an experiment in the form of time-based signals. As part of post-processing, the wave parameters are usually obtained directly from the acoustic emission system. A simple presentation of the wave and its standard parameters is shown in Fig. 1. The abbreviations and the meanings of particular characteristics are explained in Table 1. These types of features are usually described as basic, as they are calculated directly from the signal obtained during experiments. It is also possible to obtain derived features, calculated as a ratio, sum, or function of the basic ones.

Fig. 1
figure 1

Example of an elastic wave registered with an acoustic emission system and typical features of such a wave

Table 1 Description of basic features derived from the signal
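
To make the meaning of these basic features concrete, a minimal sketch of how they could be computed from a digitized transient is given below. The fixed detection threshold and the returned feature set are illustrative assumptions; in the presented research, the basic features were taken directly from the acoustic emission system rather than recomputed this way.

```python
import numpy as np

def basic_ae_features(signal, fs, threshold):
    """Illustrative extraction of basic AE hit features from a transient waveform.

    signal    -- 1-D array of recorded voltage samples
    fs        -- sampling rate in Hz
    threshold -- detection threshold in the same units as the signal
    """
    above = np.abs(signal) > threshold
    if not above.any():
        return None                                   # no hit detected
    first = np.argmax(above)                          # first threshold crossing
    last = len(above) - 1 - np.argmax(above[::-1])    # last threshold crossing
    peak = np.argmax(np.abs(signal))                  # index of the peak amplitude
    counts = int(np.sum((signal[:-1] <= threshold) & (signal[1:] > threshold)))
    return {
        "A": np.max(np.abs(signal)),                  # peak amplitude (linear units)
        "D": (last - first) / fs,                     # duration [s]
        "RT": (peak - first) / fs,                    # rise time [s]
        "Counts": counts,                             # positive threshold crossings
        "E": float(np.sum(signal ** 2)) / fs,         # simple energy measure
    }
```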

In the presented research, experimental tests were conducted on composites belonging to the group of fiber metal laminates (FMLs). This type of composite contains both metal and fiber-reinforced composite layers. FMLs were originally designed as aircraft materials, as they are characterized by excellent fatigue properties. The most well-known materials from this group are GLARE (glass-reinforced epoxy with aluminum), CARALL (carbon-reinforced epoxy with aluminum), and ARALL (aramid-reinforced epoxy with aluminum). In recent years, a significant increase in work on FMLs has been observed, and different modifications have been proposed. In some research, aluminum is substituted with other metals such as titanium [4], steel, or magnesium [5]. Another possibility is to use natural fibers, such as flax [6], hemp [7], kenaf [8], and sugar palm [9], instead of synthetic fibers. Finally, the matrix material can be changed from epoxy resin to another one, for example, thermoplastic materials such as PEEK [10], polyamide [11], or polypropylene [12]. The latter approach is relevant here, because the fiber metal laminates used in the experimental part are made of aluminum and fiber-reinforced polyamide. Fiber metal laminates are tested in various ways; the focus is often on fatigue testing, as this type of material has historically shown good properties in this area. Previous research carried out by the authors on similar thermoplastic fiber metal laminates covered different loading conditions: mixed-mode loading (three-point bending tests) [11], the double cantilever beam (DCB) test for mode I [13], and the end-notched flexural test for mode II [13], and focused primarily on the experimental and numerical aspects. Acoustic emission was used as an auxiliary method with some success, but it was clear that it could provide much more information with proper methods and approaches, which motivated the presented research on new configurations of the material.

Acoustic emission is a popular measurement method; nevertheless, there is not much research on AE applied to FML tests. Al-Azzawi et al. [14] applied acoustic emission to detect the initiation and propagation of damage under quasi-static and fatigue loading. Another exemplary work is the Ph.D. thesis by John McCrory, who focused on GLARE material and also tested it using AE [15]. A short summary of the papers on acoustic emission used in FML research is presented in Table 2.

Table 2 Short review of research of FMLs with acoustic emission analysis

In recent years, artificial intelligence, and machine learning in particular, has gained a lot of traction in both scientific and general communities. Naturally, such methods are also considered as tools in mechanical engineering. Jenis et al. [20] present a review of various machine learning techniques with recommendations on how they can be used in the design of mechanical components. Patange and Pandya analyze the opportunities and threats of machine learning (ML) techniques in the area of mechanical engineering. Nasiri and Khosravani [21] present a review of data-driven approaches used to predict the fatigue life and fracture of various components, including ML approaches such as artificial neural networks (ANN), Gaussian regression, or neuro-fuzzy methods. The authors of that paper note that acoustic emission is often considered a potential solution for detecting damage; however, they also point to its limited ability to identify defects and propose using machine learning instead of acoustic emission. We propose to use both methods simultaneously for synergetic gains. In [22], the authors present a review on predicting the mechanical behavior of additively manufactured parts, including process parameters, the porosity of printed components, and defects in printed parts. A frequent problem with applying machine learning techniques to mechanical engineering is the size of the dataset, as the number of experimental tests is usually limited. This can be offset by using numerical methods such as finite element analysis to generate more data based on experimental results. For example, Smolnicki [23] used automatically generated FEA models to obtain a dataset with 15,000 records to predict reaction forces in a plate. Another possibility is to obtain data via other techniques; in this paper, we use data registered with an acoustic emission system.

The main challenge in research with acoustic emission is the processing of the data. The classic approach used in most research is to investigate amplitude and frequency, as they are usually the two most informative features of the signal. However, by reducing the data to only these two parameters, we lose a lot of the obtained information, and in more complex cases other features may be more important. The main issue with analyzing more features is their visualization, which makes it hard to manually decide on the clustering of the data points. This issue can be solved by applying artificial intelligence methods, more precisely machine learning techniques. Machine learning is intended to be used with larger datasets and has proven very successful with them.

Machine learning techniques can be divided into three main categories [24]:

  • Supervised learning, which is focused on the mapping between input (a set of features) and output (result). The results obtained during the learning process are compared with the known correct answers. The parameters of the method are changed based on the comparison between these results and the true answers until a satisfactory level of accuracy is achieved or further progress is not possible. These techniques are used mainly in problems of classification (determination of affiliation) and regression (prediction of resultant values).

  • Unsupervised learning, which is used when the data is not labeled or has no 'true' value. In this case, we use only the input data, and the model is focused on revealing interesting structures in it. We can distinguish clustering (grouping data points into clusters) and association (detecting rules in the dataset).

  • Reinforcement learning, in which a trial-and-error approach is utilized: by testing different sets of parameters, the model gets a response (feedback) from the environment.

In the case of composites, the causes of failure can be various: matrix cracking, delamination of fibers or layers, fiber kinking or breaking, etc. Therefore, one of the goals is to split the obtained dataset into groups (clusters) that can reflect one of these causes [25]. Because we usually do not know the right answers, the natural choice is to use unsupervised methods. Dieng et al. [16] used a machine learning approach to group signals and assign them to different types of damage, i.e., matrix cracking, de-bonding, delamination/de-cohesion, and fiber cracking. The most popular approach is to utilize the so-called K-means method [26,27,28]; its simplicity and low computational cost make it popular. Different attempts to improve these methods are present in the literature. Godin et al. [29] used not only K-means but also assumed labels of the data and applied K-nearest neighbors, which is similar to K-means but is a supervised technique. The addition of harmony search was proposed by Pashmforoush et al. [30] in their classification of failure in glass-fiber-reinforced polyester composites in the DCB test. The ideas behind the methods used in this research are discussed in the methods section, as they are important for the presented results and the formulated conclusions.

A big part of analyzing acoustic emission datasets from experimental testing of composite materials is pre-processing focused on choosing the right features to include in the clustering process. It is possible either to use classic dimensionality-reduction methods like PCA (principal component analysis) [26, 27, 31, 32] or to follow other approaches like Laplacian feature selection, based on the Laplacian score proposed by He [33], which utilizes the fact that observations within one cluster tend to be close to each other. Another issue is determining the right number of clusters. For composites, it is usually assumed that the number of clusters should not be higher than 5 (assuming the data is clustered according to the failure cause).

In this paper, the results of research conducted on thermoplastic fiber metal laminates are presented. This composite material was tested under mode I and mode II loading conditions in DCB (double cantilever beam, Fig. 2a) and ENF (end-notched flexural, Fig. 2b) tests, respectively.

Fig. 2
figure 2

The schematic idea of the DCB test (left) and the ENF test (right). Both tests are realized on initially pre-cracked specimens

During these tests, elastic waves were registered using an acoustic emission system. Some features of these waves were obtained directly from the system, and others were calculated in the post-analysis of the data. The investigation in this paper is focused on analyzing the obtained data and distinguishing the causes of damage as clusters of data points. Instead of manual analysis following only chosen parameters, such as frequency and amplitude, the data is processed using machine learning techniques. First, the best features are determined utilizing the Laplacian score. Then, three different clustering techniques are used and compared, namely K-means, fuzzy C-means, and spectral clustering. The paper concludes by pointing out the strengths and weaknesses of the chosen methods and formulating recommendations for future research on composite materials utilizing acoustic emission analysis and machine learning techniques.

2 Materials and methods

2.1 Materials

In this paper, experimental research was realized on a composite material from the group of fiber metal laminates. The composite consisted of layers of metal (aluminum alloy AW-6061 T6) and fiber-reinforced composite: Celstran® CFR-TP PA6 GF60-01 (a polyamide 6 continuous unidirectional fiber-reinforced thermoplastic composite tape with 60% E-glass by weight). Two configurations of the composite layers were included, with \(0^\circ\) and \(90^\circ\) fibers. A graphical illustration of the used material is presented below in Fig. 3. The material was manufactured in a process typical for fiber metal laminates based on thermoplastic matrices. The metal plates and composite prepregs were cut, laid in the mold, and then bonded under high pressure and temperature using a heated Collin hydraulic press (Labor Plattenpresse P 300 PM). In order to create the initial crack in a specimen, a polytetrafluoroethylene (PTFE) film was placed at the designed location, which prevented bonding there during the technological process.

Fig. 3
figure 3

The schematic idea of the material used in the research. On the left, additionally, the location of the PTFE film introduced for the initial crack is shown

2.2 Experimental setup

All tests conducted to obtain acoustic emission data were realized on an Instron 5944 machine with a maximum force of 2 kN. This is sufficient for the researched materials, as the pre-crack is introduced at the interface between the metal and composite layers. All specimens were cut from the manufactured plates using a circular saw. The dimensions of the specimens were 160 mm × 20 mm for the ENF tests and 180 mm × 20 mm for the DCB tests, as the DCB specimen needs additional mounting. In the case of the DCB specimens, auxiliary aluminum blocks were glued on both sides of one end of the specimen. For both types of specimens, two acoustic emission sensors were mounted with hot glue at the two ends of the specimen. The experimental setup for the DCB test is presented in Fig. 4.

Fig. 4
figure 4

Proper DCB tests were conducted using INSTRON 5944

Two types of tests were realized during the presented research:

  • Loading of the interface in opening mode (mode I). This test was based on the ASTM standard [34] with necessary changes, as the standard is not meant for FML materials and does not account for the different stiffnesses of the layers or the character of the metal-composite interface.

  • Loading of the interface in shearing mode (mode II). This test was based on the ASTM D7905 standard [35] and the ESIS ENF test protocol [36] with necessary changes required by the character of the material.

Important parameters of both tests are presented in Table 3.

Table 3 Parameters of DCB and ENF tests

To monitor elastic waves, an acoustic emission system produced by Vallen (AMSY-6) was used. In both tests, two passive piezoelectric sensors were used, capable of monitoring signals with frequencies between 100 and 450 kHz. To amplify the signals, two wide-band preamplifiers (AEP5) were used. Two sensors were used to enhance the quality of the obtained dataset as well as to enable location analysis.

2.3 Acoustic emission & ML techniques

The first step in the classification of damage causes utilizing acoustic emission is to obtain the data during experiments. Acoustic emission signals were registered during the experimental procedures and saved in the form of databases. The Vallen system stores the main data in *.pridb files (Vallen primary data) and *.tradb files (Vallen transient data). After applying a fast Fourier transform (FFT), it is also possible to obtain data about the frequencies of the signals in the form of another database. These databases contain basic features of acoustic events, such as amplitude, duration, rise time, etc. However, it is also possible to define derivative features calculated as arithmetic operations on basic features. The list of features used in this study with their abbreviations and explanations is presented below in Table 4.

Table 4 Features (basic and derivative) used in the presented research

In order to process the data more easily and create the derivative features mentioned above, the databases were combined into a resultant database using DB Browser and scripts written in SQL (SQLite).
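
A minimal sketch of this merging step in Python is shown below, using sqlite3 and pandas instead of DB Browser. The table and column names (hits, fft_results, trai, amp, dur, rt, cnts, freq) are hypothetical placeholders chosen for illustration and do not reproduce the actual Vallen database schema; the derived features mirror those listed in Table 4.

```python
import sqlite3
import numpy as np
import pandas as pd

# Hypothetical table/column names -- the real Vallen *.pridb schema differs.
HITS_QUERY = "SELECT trai, amp, dur, rt, cnts FROM hits"
FFT_QUERY = "SELECT trai, freq FROM fft_results"

def build_feature_table(pridb_path, fft_path):
    """Join basic hit features with FFT-based frequencies and add derived features."""
    with sqlite3.connect(pridb_path) as con:
        hits = pd.read_sql_query(HITS_QUERY, con)
    with sqlite3.connect(fft_path) as con:
        fft = pd.read_sql_query(FFT_QUERY, con)
    df = hits.merge(fft, on="trai", how="inner")    # match events by transient index
    # Derived features analogous to those listed in Table 4
    df["counts_per_dur"] = df["cnts"] / df["dur"]
    df["log_dur"] = np.log10(df["dur"])
    df["log_rt"] = np.log10(df["rt"])
    return df

# df = build_feature_table("test.pridb", "test_fft.sqlite")
```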

The preparation for the final analysis requires limiting the dimensionality to decrease the computational cost. This can be done by choosing the features that best explain the differences in the data. To achieve this, we apply the Laplacian score to our data. This method was proposed by He [33] and is based on the assumption that data points from one cluster should be close to each other. The data from the database were read into a Python script, and we followed the proposed algorithm by constructing an affinity matrix and performing the calculations. Many of the methods used to investigate data are influenced by the magnitudes of the values along different axes. Thus, a standard approach is to rescale the data from their initial ranges to, for example, the range (0, 1). In this research, after checking popular approaches, we decided to use the so-called min–max scaler. It transforms the data \(X\) into \({X}_{new}\) in the following manner:

$${X}_{new}=\frac{X-\mathrm{min}\left(X\right)}{\mathrm{max}\left(X\right)-\mathrm{min}(X)}$$
(1)

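A compact sketch of this step is given below. It scales the features with a min–max scaler (Eq. 1) and then computes the Laplacian score of each feature following the definition of He [33], using an RBF-weighted k-nearest-neighbour affinity matrix; the neighbourhood size and kernel width are illustrative assumptions rather than the exact settings of our scripts.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.neighbors import kneighbors_graph

def laplacian_scores(X, n_neighbors=5, t=1.0):
    """Laplacian score of each column of X; lower values indicate features
    that better preserve the local structure of the data (He [33])."""
    X = MinMaxScaler().fit_transform(X)                      # Eq. (1): scale to (0, 1)
    # RBF-weighted affinity matrix on the k-nearest-neighbour graph
    dist = kneighbors_graph(X, n_neighbors, mode="distance").toarray()
    W = np.exp(-dist ** 2 / t) * (dist > 0)
    W = np.maximum(W, W.T)                                   # symmetrize the graph
    D = np.diag(W.sum(axis=1))                               # degree matrix
    L = D - W                                                # graph Laplacian
    ones = np.ones(X.shape[0])
    scores = []
    for r in range(X.shape[1]):
        f = X[:, r]
        f_tilde = f - (f @ D @ ones) / (ones @ D @ ones) * ones   # remove weighted mean
        num = f_tilde @ L @ f_tilde
        den = f_tilde @ D @ f_tilde
        scores.append(num / den if den > 0 else np.inf)
    return np.array(scores)

# scores = laplacian_scores(features_array)   # keep the features with the lowest scores
```
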
The next step after standardization of the data and choosing the best features is to decide on the number of clusters. Some clustering algorithms, such as OPTICS, can skip this step. Nevertheless, several methods exist for determining the best number of clusters; in this paper, we analyze three of them: the Davies–Bouldin index, the Silhouette approach, and the Calinski–Harabasz score.

The Davies–Bouldin index (DB index, see [37]) is calculated for \(k\) clusters and is based on the inter-cluster separation \(\delta\) between clusters (where \({X}_{n}\) denotes cluster number \(n\)) and the intra-cluster dispersion \(\Delta\) (the average distance between the centroid of a cluster and its points). The formula to calculate the index is presented below:

$$DB\left(u\right)=\frac{1}{k}\sum_{i=1}^{k}\max_{j\ne i}\left\{\frac{\Delta \left({X}_{i}\right)+\Delta \left({X}_{j}\right)}{\delta \left({X}_{i},{X}_{j}\right)}\right\}$$
(2)

The lower the Davies–Bouldin index, the better the clustering.

In the Silhouette approach [38], the score is calculated for each cluster independently as an average of the values derived for each of its points. For a single data point \(i\), the score is based on the average distance \(a(i)\) between it and all other points in its cluster, as well as the lowest average distance \(b(i)\) between it and the points of any other cluster. The formulas for one point \(i\) and a cluster \(n\) containing \(k\) points are presented below:

$$s\left(i\right)=\frac{b\left(i\right)-a(i)}{\mathrm{max}\{a\left(i\right), b\left(i\right)\}}$$
(3)
$$S\left(n\right)=\frac{\sum_{i=1}^{k}s(i)}{k}$$
(4)

The higher the Silhouette score, the better the clustering.

The Calinski–Harabasz score [39] is calculated for a partition of dataset \(E\) (of size \({n}_{E}\)) into \(k\) clusters and is based on the ratio of the inter-cluster dispersion mean to the intra-cluster dispersion:

$$s\left(k\right)=\frac{\mathrm{tr}\left({B}_{k}\right)}{\mathrm{tr}\left({W}_{k}\right)}\cdot \frac{{n}_{E}-k}{k-1}$$
(5)

where \(\mathrm{tr}\) is the trace function, and \({W}_{k}\) and \({B}_{k}\) are the intra-cluster (within-group) and inter-cluster (between-group) dispersion matrices, respectively. The higher the Calinski–Harabasz score, the better the clustering.

Technically, all of the above-described methods are used to rate the quality of a clustering. However, their use to determine the best cluster number is straightforward. Using a Python script written by us, we read the data from the prepared database, standardize them using the min–max scaler, and then run the clustering with the chosen method (for example, K-means) in a loop for numbers of clusters between 2 and 9. For each clustering, we calculate the score values according to formulas (2)-(5). Finally, we can plot the obtained values on a chart and determine the best cluster number.
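
This selection loop can be written, for example, with scikit-learn, which provides implementations of all three indices from Eqs. (2)-(5); the K-means settings below are illustrative.

```python
from sklearn.cluster import KMeans
from sklearn.metrics import (calinski_harabasz_score,
                             davies_bouldin_score,
                             silhouette_score)

def score_cluster_numbers(X, k_range=range(2, 10)):
    """Run K-means for each candidate k and collect the three validity indices."""
    results = {}
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        results[k] = {
            "davies_bouldin": davies_bouldin_score(X, labels),        # lower is better
            "silhouette": silhouette_score(X, labels),                # higher is better
            "calinski_harabasz": calinski_harabasz_score(X, labels),  # higher is better
        }
    return results

# scores = score_cluster_numbers(X_scaled)   # X_scaled: min-max scaled feature matrix
```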

After determining the best features to use and the preferred number of clusters, the next step is the clustering itself. There is a wide range of machine learning techniques that can be used for classification problems, as presented in the Introduction to this paper. In this research, we focus on three different unsupervised methods: K-means, fuzzy C-means, and spectral clustering.

K-means is a grouping method based on the iterative relocation of data points into clusters. The method was first proposed by MacQueen in 1967 [40], and since then a few modifications in terms of its implementation have been proposed, such as Lloyd's and Hartigan's methods. The K-means method is one of the fastest clustering algorithms, which contributes to its popularity. Also, the clustering will always be completed (the algorithm is guaranteed to converge). In the K-means method, the number of clusters must be determined a priori. The main limitation is the fact that K-means clustering tends to find a local minimum rather than the globally optimal partition. These issues are usually addressed by running the algorithm a dozen times with different initialization parameters or by choosing the initial centroids using some heuristics. Another issue is that the K-means algorithm has to make a hard decision about which cluster each data point belongs to. In reality, especially in problems like composite damage, many data points have features suggesting membership in multiple clusters. The solution to this issue can be fuzzy clustering, which is characterized by the fact that data points in the outer part of a cluster belong to it to a lesser degree. In this paper, a specific algorithm from this group is used, namely fuzzy C-means clustering (FCM), which was developed by Dunn and Bezdek [41, 42]. It can be implemented and used with Python scripting [43].
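
A minimal sketch of both partitioning methods on the scaled feature matrix is shown below. The fuzzy C-means call assumes the scikit-fuzzy package as one possible Python implementation (cf. [43]); the fuzzifier m = 2 and the stopping criteria are illustrative defaults, not the exact settings used in this study.

```python
import numpy as np
from sklearn.cluster import KMeans
import skfuzzy as fuzz   # scikit-fuzzy, one possible FCM implementation

def run_kmeans(X, n_clusters):
    """Hard partition: every AE event is assigned to exactly one cluster."""
    km = KMeans(n_clusters=n_clusters, n_init=20, random_state=0).fit(X)
    return km.labels_, km.cluster_centers_

def run_fuzzy_cmeans(X, n_clusters, m=2.0):
    """Soft partition: u[c, i] is the membership of event i in cluster c."""
    # scikit-fuzzy expects the data transposed: shape (n_features, n_events)
    cntr, u, _, _, _, _, fpc = fuzz.cluster.cmeans(
        X.T, c=n_clusters, m=m, error=1e-4, maxiter=1000, seed=0)
    hard_labels = np.argmax(u, axis=0)   # highest membership, used for coloring plots
    return hard_labels, u, cntr, fpc
```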

Spectral clustering is a method based on an affinity matrix. In the process, a low-dimensional embedding of this matrix is computed, and then a standard method such as K-means is used for the final clustering. This method is designed for highly non-convex cluster geometries, which can be the case for signals registered during experimental research on composites. Unfortunately, it works best when the clusters are of similar size, which may not be true in the investigated cases.
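
A corresponding sketch with scikit-learn's SpectralClustering is given below; the nearest-neighbours affinity and K-means label assignment are common defaults chosen for illustration, not necessarily the configuration used in the presented analysis.

```python
from sklearn.cluster import SpectralClustering

def run_spectral(X, n_clusters):
    """Embed the affinity graph in a low-dimensional space, then label with K-means."""
    sc = SpectralClustering(
        n_clusters=n_clusters,
        affinity="nearest_neighbors",   # affinity matrix built from a k-NN graph
        n_neighbors=10,
        assign_labels="kmeans",         # final step on the spectral embedding
        random_state=0)
    return sc.fit_predict(X)
```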

The summary of all three methods in the form of a block diagram is presented in Fig. 5.

Fig. 5
figure 5

Clustering algorithms used in the presented research: K-means, fuzzy C-means, and spectral clustering

To better summarize the methods used in this study, Table 5 compares the advantages and disadvantages of all three clustering techniques.

Table 5 Summary of advantages and disadvantages of using clustering techniques

3 Results and discussion

The experiments were registered in terms of force and displacement, but in this research we focus on the signals from acoustic emission and their meaning. As an example, in Fig. 6 we present acoustic emission events as a function of time (amplitude on the vertical axis and frequency on the color axis) along with the force as a function of time. A high correlation between the physical behavior of the specimens observed during a test and the registered acoustic emission can be seen. Because of that, it is possible to assume that the analysis of acoustic emission events can explain the fracture processes in the specimens.

Fig. 6
figure 6

AE events described by amplitude (vertical axis) and frequency (color axis), presented along with the force as a function of time

The dataset (database) obtained from the acoustic emission system was pre-processed using DB Browser for SQLite and SQL scripts written by the authors. After that, the resultant database was processed using Python scripts as described in the methods section. The data were cleaned, and the derived features were calculated; all features of the signals used in this work are described in Table 4 in the methods section. After that, for both datasets, from the DCB and ENF tests, the Laplacian score was calculated for each feature, for both non-normalized and normalized data. Normalization was realized using the min–max scaler (see Eq. 1). The results of the score calculation for both options are presented in Figs. 7 and 8. We chose the best features for both datasets, deciding to use the same set in both cases.

Fig. 7
figure 7

Calculated Laplacian score for different features of the DCB dataset. On the left without any scaling of the data, on the right after applying the min–max scaler

Fig. 8
figure 8

Calculated Laplacian score for different features of the ENF dataset. On the left without any scaling of the data, on the right after applying the min–max scaler

After the set of features used in the study was fixed, it was possible to determine the best number of clusters using the methods described above (as this analysis has to be done for a strictly defined set of features). The result of such an analysis is presented in Fig. 9 for the ENF dataset. We can conclude that in this case three clusters represent the data best, as that case clearly has the lowest values of the DB index as well as the highest values of the Calinski–Harabasz and Silhouette scores. Similarly, in Fig. 10, the result of such an investigation is presented for the DCB dataset. In this case, the results are more ambiguous; however, 5 clusters give the smallest value of the DB index and the highest Calinski–Harabasz and Silhouette scores apart from 2 and 3 clusters (which can be discarded by analyzing the trend over the first three cluster counts). The other candidate, 7 clusters, has little physical justification and worse values than 5 clusters. Therefore, we decided to use 5 clusters for the DCB dataset and 3 clusters for the ENF dataset. From the point of view of the fracture mechanics of the analyzed materials, they can refer to some subset of metal plasticity/damage, matrix cracking, fiber breaking, fiber kinking, delamination, and fiber–matrix de-bonding. We also observe the bridging effect during the DCB tests, so it is also a possible source of acoustic events.

Fig. 9
figure 9

Investigation of the best number of clusters for the ENF dataset using three approaches: Davies–Bouldin index, Calinski–Harabasz, and Silhouette

Fig. 10
figure 10

Investigation of the best number of clusters for the DCB dataset using three approaches: Davies–Bouldin index, Calinski–Harabasz, and Silhouette

To better show the difference between potential numbers of clusters in the ENF dataset, Fig. 11 presents silhouette charts for two cases (3 clusters and 5 clusters) along with the average silhouette score over all clusters. The silhouette score compares how similar each data point is to the other points in its cluster with how similar it is to the data points outside of that cluster. Therefore, the key information in Fig. 11 is the value of the average silhouette score (indicated by a red dashed line); for the case with 3 clusters, it is around 20% higher (0.46 versus 0.38). Additionally, Fig. 11 also includes information about the distribution of silhouette scores for every data point with the distinction of clusters. Based on that, we can investigate two additional factors: the presence of clusters that lie entirely below the average silhouette score (there is no such case for either number of clusters, which is a good sign) and the fluctuation of the score inside the clusters (the fluctuations are bigger in the 5-cluster case, which confirms that it is the worse of the two presented cases).
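
The per-point silhouette values underlying charts such as Fig. 11 can be obtained with scikit-learn as sketched below; a textual summary replaces the graphical bars for brevity.

```python
import numpy as np
from sklearn.metrics import silhouette_samples, silhouette_score

def silhouette_report(X, labels):
    """Average silhouette score plus per-cluster score statistics (cf. Fig. 11)."""
    average = silhouette_score(X, labels)          # the red dashed line in Fig. 11
    per_point = silhouette_samples(X, labels)      # one value per AE event
    for c in np.unique(labels):
        vals = per_point[labels == c]
        print(f"cluster {c}: n={vals.size}, mean={vals.mean():.2f}, "
              f"min={vals.min():.2f}, max={vals.max():.2f}")
    return average, per_point
```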

Fig. 11
figure 11

Comparison between silhouette scores for the case with 3 clusters (left) and with 5 clusters (right)

Finally, after determining the best number of clusters, it is possible to test the three proposed clustering methods: K-means, fuzzy C-means, and spectral clustering. The results for the dataset obtained from the DCB test of the composite material are presented in Figs. 12, 13, and 14. In these figures, the results are shown in a grid combining each of the 5 selected features (amplitude, frequency, number of counts, duration, and rise time) with each other. On the main diagonal, we can only see the distribution of centroids for each feature (only for fuzzy C-means and K-means, as spectral clustering does not define centroids). The grid is 'symmetrical' relative to this diagonal, but we decided to leave both the x versus y and y versus x charts on the grid as an aesthetic choice. This form of presentation may not seem very straightforward, but the problematic visualization of 5-dimensional data is one of the reasons for conducting such an analysis with machine learning, and also the reason why manual analysis is usually limited to two or three features only. Analyzing the differences between the three used methods, a few conclusions can be made. Regarding frequency–amplitude, K-means (Fig. 14) predicts only one cluster in the lower frequencies (around 200) regardless of the amplitude level, while both fuzzy C-means (Fig. 12) and spectral clustering (Fig. 13) suggest two clusters of data there, which is more in line with the general knowledge that high-amplitude events usually have different causes than low-amplitude events. Fuzzy C-means places this split at a constant amplitude, while spectral clustering (which is known for its ability to capture non-convex cluster geometry) suggests a more complex split. This is consistent with our assumption that this method would be able to produce more interesting results. While amplitude/frequency analysis is the most common one, the analysis of other feature pairs shows why the standard approach has to be enhanced. Three (fuzzy C-means, spectral) or four (K-means) other clusters that overlap in the frequency–amplitude domain (high frequency/lower amplitude) become distinct if we analyze another domain such as frequency–duration (e.g., the green and dark pink/blue clusters in the spectral plot). The adjoining dark pink and blue clusters (Fig. 13) become distinct if we analyze the pairs rise time–duration and rise time–counts. The conclusion from these considerations is that we should not limit the acoustic emission analysis to only two features, as other features also carry information. A further conclusion is that prior knowledge about the found clusters would be very useful, as the literature lacks deep analysis of different AE features, focusing mostly on amplitude and frequency.
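
Grids of this 'each versus each' type can be produced, for instance, with seaborn's pairplot, as sketched below; the feature names follow the figure captions, while the plotting details are illustrative and the diagonal here shows per-cluster feature distributions rather than the centroid positions described above.

```python
import pandas as pd
import seaborn as sns

def plot_cluster_grid(X_scaled, labels, feature_names, outfile="cluster_grid.png"):
    """Pairwise scatter grid of the selected features, colored by cluster label."""
    df = pd.DataFrame(X_scaled, columns=feature_names)
    df["cluster"] = labels
    grid = sns.pairplot(df, hue="cluster", plot_kws={"s": 10, "alpha": 0.6})
    grid.savefig(outfile, dpi=300)

# plot_cluster_grid(X, labels, ["A", "c-freq", "Counts/D", "log(D)", "log(RT)"])
```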

Fig. 12
figure 12

Effects of fuzzy C-means clustering on the DCB dataset (number of clusters set to 5) with the following features: A, c-freq, Counts/D, log(D), log(RT). For readability reasons, only ticks at the boundary are left, as the results are presented in an each-versus-each style in a grid

Fig. 13
figure 13

Effects of K-means clustering on the DCB dataset (number of clusters set to 5) with the following features: A, c-freq, Counts/D, log(D), log(RT). For readability reasons, only ticks at the boundary are left, as the results are presented in an each-versus-each style in a grid

Fig. 14
figure 14

Effects of spectral clustering on the DCB dataset (number of clusters set to 5) with the following features: A, c-freq, Counts/D, log(D), log(RT). For readability reasons, only ticks at the boundary are left, as the results are presented in an each-versus-each style in a grid

Similarly, the results for the dataset obtained from the ENF test of the composite material are presented in Figs. 15, 16, and 17. Due to the lower number of clusters and the character of the dataset itself, the results obtained from the different methods are more aligned here than in the DCB case. Thus, when the cluster geometry is potentially less complex, fast and straightforward methods such as K-means can give results similar to more sophisticated approaches. It is worth underlining that the fuzzy C-means results (Fig. 15) are presented in a way that assigns a color based on the highest membership score. However, the detailed information for each data point contains the degree of membership in each cluster, which can provide more nuanced information, as we can expect the confidence of the assignment to be lower in the overlapping areas. Nevertheless, we can observe a proposed split for amplitudes over 65 dB and below 50 dB.

Fig. 15
figure 15

Effects of fuzzy C-means clustering on the ENF dataset (number of clusters set to 3) with the following features: A, c-freq, Counts/D, log(D), log(RT). For readability reasons, only ticks at the boundary are left, as the results are presented in an each-versus-each style in a grid

Fig. 16
figure 16

Effects of spectral clustering on the ENF dataset (number of clusters set to 3) with the following features: A, c-freq, Counts/D, log(D), log(RT). For readability reasons, only ticks at the boundary are left, as the results are presented in an each-versus-each style in a grid

Fig. 17
figure 17

Effects of K-means clustering on the ENF dataset (number of clusters set to 3) with the following features: A, c-freq, Counts/D, log(D), log(RT). For readability reasons, only ticks at the boundary are left, as the results are presented in an each-versus-each style in a grid

In the case of the DCB results, differences in clustering are visible between all three methods. The K-means method distinguishes only one cluster in the lower frequencies, whereas the other two methods distinguish at least two clusters split roughly along the amplitude axis. The frequency and amplitude boundaries reported by researchers vary, as shown in the review paper on acoustic emission in composite laminates by Saeedifar and Zarouchas [44], but the analysis of these examples suggests that the K-means method is the least realistic for this dataset. Spectral clustering and fuzzy C-means place the clusters in generally similar locations, with some differences in the boundaries between them. It is hard to assess which clustering is better, so in the next stage of the research campaign the authors will include post-mortem examinations and non-destructive testing during the tests, which will enable the quality of the clustering to be checked more directly than by comparison with propositions from the literature. In contrast to the DCB analysis, in the case of the ENF dataset all of the proposed methods give similar results with small differences in cluster boundaries.

4 Conclusions

In this paper, the results of preliminary studies on the fracture of fiber metal laminate composites are presented. The research is focused on investigating acoustic emission signals obtained during experimental DCB and ENF tests using machine learning:

  • The acoustic emission signal dataset was pre-processed using SQL and Python scripts. In this process, features were assigned to each acoustic emission event, including basic ones like amplitude, frequency, duration, etc., as well as derived ones calculated from the basic features. The importance of each feature was assessed by running a Laplacian score analysis. This method is, in the opinion of the authors, a good tool for determining the most useful set of features for further analysis. The preferred number of clusters was determined using three different methods: the Davies–Bouldin index, the Silhouette method, and the Calinski–Harabasz score. All three methods suggested that the best number of clusters is 5 for the DCB tests and 3 for the ENF tests. However, the Silhouette method seems the least reliable for the analysis of acoustic events from composite tests, due to the complexity of the obtained datasets and the overlapping of clusters. The two other methods give very similar results.

  • For the researched material, fuzzy C-means and spectral clustering gave similar clustering results with small differences. However, the K-means method gave less realistic results for one of the datasets and should be treated with caution in similar research. Ideally, K-means should only be used if the initial analysis shows that the expected cluster geometry is not strongly non-convex.

  • All tested clustering methods have a bias toward creating clusters of a similar order of size, which is not desirable behavior for composite test datasets, as we expect, for example, fiber events to be rarer than matrix cracking or delamination. This suggests that other clustering methods, which can accommodate clusters of different sizes more naturally, could be checked in the future. It should be one of the important issues to consider during the future choice of methods.

The purpose of the paper was achieved, as we successfully realized data preparation, feature extraction, and determination of the number of clusters, and tested three proposed clustering methods. The presented approaches can be used in the next stages of our research campaign as well as by other researchers working on similar problems. Significant discrepancies in the literature regarding the feature boundaries of particular damage modes cause issues with assigning the found clusters to exact types of damage. Analysis of the available reviews suggests that these boundaries may depend on the particular type of hybrid composite. Because of that, as a conclusion from this paper, we propose in future research to conduct preliminary studies with well-defined causes of damage and to use this knowledge to better understand the resultant clusters. It will also enable some supervised machine learning techniques to be used in the analysis of acoustic emission event datasets.