1 Introduction

Damage accumulates in structures as a result of repeated loads, earthquake events, sustained static loads, environmental effects and other factors. For monitoring the condition of structures for safety, especially in real time, acoustic emission (AE) is considered one of the most effective nondestructive testing (NDT) methods [1, 2]. In the field of structural health monitoring (SHM), AE has been widely used to monitor the condition of construction materials and structures. Owing to its continuous monitoring capability, it has been applied to assess the durability and safety of critical infrastructure such as nuclear storage tanks, hydraulic structures and bridges [3,4,5,6]. In AE, the elastic waves produced by irreversible changes in a material are recorded as electrical signals by sensors and then analyzed to characterize damage in a variety of materials and structures. The AE method provides useful information for damage detection, such as damage location, damage mode and early failure prediction [7]. The primary (P-) waves are the first portion of the signal to arrive at a sensor and are used to determine the location of the damage. The electrical AE signals, which carry representative information about the elastic waves, can be used to extract many AE features that help in the detection and evaluation of damage in materials. Through these features, the type and severity of damage can be characterized. For instance, AE energy can be used to evaluate fracture growth [8]. Similarly, the amount of AE activity can be used to estimate the active crack density and crack modes [7, 9]. In addition, AE parameter analysis and AE-SiGMA analysis can be used to classify cracks [10, 11]. Using AE-SiGMA analysis, the crack kinematics of reinforced concrete (RC) elements, in terms of crack type, orientation and location, have been effectively identified [11,12,13,14].
The above-mentioned studies have proven effective in the detection and evaluation of damage; however, owing to the heterogeneous and anisotropic properties of concrete and composite materials, it is difficult to distinguish genuine AE sources from spurious events and noise. Aggregates, air voids and cracks cause attenuation and distortion in wave propagation, leading to variations in signal characteristics (such as amplitude loss, energy loss and frequency dispersion) that in turn affect the AE parameters. This leads to misinterpretation of the data and inaccurate damage detection, and it limits the development of automated, real-time monitoring of reinforced concrete (RC) structures. The motivation for this study is therefore the need to identify signals that deviate from the true damage characteristics because of reflections and similar effects. Filtering these noise sources out of the recorded data and emphasizing the actual AE sources is essential for accurate monitoring and source localization of damage.

Signals that are not relevant to AE sources are known as AE noise and are caused by friction, reflections and environmental effects such as rain, dust and wind. Before a structure is assessed, signal processing is usually carried out in the time domain, the frequency domain or the time–frequency domain, depending on the complexity of the filtering needed to eliminate or enhance the signals. Several studies applied a basic level of filtering by setting a threshold in the AE data acquisition system together with a band-pass frequency filter, peak definition time, hit definition time and hit lockout time, which eliminated noise-related signals [14,15,16,17]. Parameter-based filters are simple to apply in data acquisition software. Signals with extremely short durations can be discarded using duration filters, whereas electrical noise with very high signal strength can be removed using a signal strength filter [18]. Some researchers also define frequency filters to eliminate noise in the form of very low-frequency signals [19, 20]. Moreover, [21] proposed a filtering criterion based on the root mean square of AE signals to separate mechanical friction from concrete cracking.

In the time domain, traditional AE data filtration is usually performed with the Swansong II filter, which is also parameter based. Also known as the duration–amplitude (D–A) filter, it has been adopted in many experimental studies, with amplitude–duration ranges used as rejection limits [22, 23]. The Swansong II filter has produced promising results in a variety of studies, including the filtering of AE data from fatigue crack growth, localized corrosion and degradation in concrete, and concrete cracking during load tests [18, 24,25,26]. It was also observed that these fundamental filters considerably improved the source location results for concrete cracking and showed good agreement with visually observed cracks. However, interpreting all the AE data with these approaches is challenging because the data volume is large and difficult to handle even in carefully controlled laboratory environments. Interpreting the large amount of data collected during the monitoring of a structure, even in a simple bending test of an RC beam, is very time-consuming.

To address these problems, some previous research [27, 28] has focused on de-noising methods prior to concrete damage assessment using machine learning or deep learning, but deep learning has not yet been applied to AE data filtration at an early stage of data acquisition. In a recent study, a convolutional neural network (CNN) was trained on a dataset of AE signals generated by concrete specimens under various loading conditions [29]. The CNN effectively removed noise from the AE signals and improved the signal-to-noise ratio, resulting in more accurate strain measurements. Similarly, other studies have applied deep learning techniques such as recurrent neural networks (RNNs) and autoencoders to de-noise AE signals obtained from concrete specimens [30, 31]. These techniques have shown promising results both in improving the signal-to-noise ratio and in extracting meaningful features from AE data. [32] introduced a transfer learning approach that generates numerical AE signals for training, coupled with unsupervised domain adaptation, to enhance source localization accuracy when labeled realistic AE signals are scarce. In another study, [33] used a deep learning model to efficiently de-noise digital data by removing ambient noise in the detection and localization of cracks in steel rail subjected to loading. On the other hand, [9] used a heterogeneous ensemble learning framework based on the continuous wavelet transform to learn different features from AE signals for evaluating alkali–silica reaction damage in concrete; their findings indicate that the model performs well in assigning the AE signals to their designated phases.

Within this scope, there has been a notable increase in the use of signal processing techniques in the field of structural engineering. Numerous studies have investigated diverse approaches for examining and interpreting signals for SHM and damage assessment.

[34] provided a comprehensive review of signal processing techniques for vibration-based SHM in civil structures, highlighting approaches applied in real time and recognizing the difficulty of managing large, noisy datasets gathered from a variety of sensors. Similarly, [4] studied the application of the discrete-time Fourier transform and the wavelet and Wiener transforms to analyze differences between undamaged and damaged regions of a concrete gravity dam. The study found that the wavelet transform performs better in identifying structural deterioration and handles noise variance in the recorded observations well. Emphasizing signal processing, [35] assessed the structural condition of concrete beams subjected to bending by applying the wavelet transform to AE signals. By calculating the wavelet energy of the AE signals, the approach accurately determined the structural state, distinguishing between damaged conditions of the concrete beam.

All of these studies, however, were based on de-noising AE signals rather than filtering them at an early stage. What is needed is a deep learning model that automatically learns the essential features of AE data and distinguishes true AE signals from noise-induced signals, since such a model has the potential to be a powerful tool for AE data filtration in concrete specimens and other applications. The training data must nevertheless be carefully selected and preprocessed to ensure that the proposed model is accurate and reliable.

In contrast to the above-mentioned studies, a one-dimensional convolutional neural network (1D-CNN) deep learning approach was used here to improve the quality of AE signal filtering, reducing the amount of data without losing significant information and increasing the reliability of the filtering method. The model utilizes AE parameters to separate true AE signals from noise-induced signals in an RC member. The results of this study should lead to enhanced real-time monitoring of structures, with accurate and reliable identification through suitable signal processing and deep learning approaches.

2 Methods

2.1 Acoustic Emission (AE)

AE is an NDT method used to locate cracks and damage and to provide information about the performance of a material or structure. Fundamentally, AE is based on the detection of elastic waves that propagate as energy is released from damage in a material under stress. These waves are detected by piezoelectric sensors that convert them into electrical signals, which are processed by a pre-amplifier, filters and a power amplifier before the AE data are recorded and displayed on a computer. The AE data acquired from the data acquisition system are further processed and analyzed with two main types of analysis: parametric and waveform-based. In the parameter-based approach, signal parameters such as amplitude, duration, rise time, count and energy are used to evaluate the nature and classification of damage. For instance, amplitude and energy indicate the magnitude of the damage, while rise time and frequency are associated with the damage type. The location, time of origin and orientation of the damage, on the other hand, can be determined by waveform-based analysis [36, 37].

2.2 Neighborhood Component Analysis (NCA)

NCA is a powerful machine learning technique that has been applied to a wide range of applications, including image recognition, natural language processing and AE signal processing [38,39,40]. It is a supervised metric-learning method, related in spirit to linear discriminant analysis (LDA), that is designed to handle datasets with a large number of features, and it is particularly well suited to tasks involving complex, nonlinear relationships between the features and the target variables. NCA works by weighting the features so that the different classes in the dataset are maximally separated, which allows it to identify the most relevant features for a given task and to eliminate features that are redundant or irrelevant. This process of choosing the best features in a dataset is known as feature selection and is based on the evaluation results that the algorithm uses in the analysis. By decreasing the dimensionality of the feature space, NCA improves the precision of machine learning algorithms for classification and regression problems. Moreover, because it removes only the less significant features from the dataset, this dimension reduction enhances the quality of the data rather than causing data loss.

By calculating the weight of each feature and optimizing a classification accuracy objective, the NCA algorithm learns the feature weight vector [41]. The calculated feature weights are then used to identify the significant and insignificant features in the training dataset, expressed as

$$P = \left\{ {\left( {a_i ,b_i } \right), i = 1,2,3, \ldots s} \right\}$$
(1)

where ai is an m-dimensional feature vector, bi ∈ {1, 2, …, n} is the class label of ai and s is the number of input samples. In NCA, a point a is chosen randomly from the set P as the reference point for ai. To predict the expected label, a classifier must be trained to define a weight vector v. The distance between two samples ai and aj under the weighting vector v is calculated using the distance function dv indicated by [42], expressed as

$$d_v \left( {a_i ,a_j } \right) = \mathop \sum \limits_{t = 1}^m v_t^2 |a_{it} - a_{jt} |$$
(2)

where vt is the feature weight.

The leave-one-out (LOO) approach, a nonparametric method, is applied to the training set P to improve the classification accuracy in cross-validation [43]. It estimates the generalization error of the model by leaving out one sample at a time during the training phase and then using the left-out sample to evaluate the model. The probability Yij that point aj is chosen as the reference point for ai is approximated with a kernel function:

$$Y_{ij} = \frac{{k\left( {d_v \left( {a_i ,a_j } \right)} \right)}}{{\sum_{j^{\prime} = 1, j^{\prime} \ne i}^s k\left( {d_v \left( {a_i ,a_{j^{\prime}} } \right)} \right)}}$$
(3)

k is a kernel function given by:

$$k(c) = {\text{exp}}\left( { - \frac{c}{\sigma }} \right)$$
(4)

where σ is the kernel width, a tuning parameter that controls how quickly the influence of a sample decays with distance.

The average LOO probability that sample i is classified correctly by this randomized classifier is then expressed as

$$Y_i = \mathop \sum \limits_{j = 1, j \ne i}^s Y_{ij} z_{ij}$$
(5)

where

$$z_{ij} = \begin{cases} 1 & {\text{if}}\; b_i = b_j \\ 0 & {\text{otherwise}} \end{cases}$$
(6)

As a result, the LOO technique's approximate classification accuracy can be described as follows:

$$C(v) = \mathop \sum \limits_{i = 1}^s Y_i - \lambda \mathop \sum \limits_{t = 1}^m v_t^2 = \mathop \sum \limits_{i = 1}^s \mathop \sum \limits_{j = 1, j \ne i}^s Y_{ij} z_{ij} - \lambda \mathop \sum \limits_{t = 1}^m v_t^2$$
(7)

where λ > 0 is a regularization parameter that is tuned using cross-validation. For the chosen λ, the feature weight vector v is obtained by maximizing the objective C(v).

2.3 1-Dimensional Convolutional Neural Network (1D-CNN)

1D-CNNs have gained substantial popularity in a relatively brief period, attaining state-of-the-art performance in several signal processing applications such as fault detection in structures [44]. 1D-CNN models automatically extract relevant features from time series data and make predictions about the damage state of a structure. They capture the temporal dependencies and patterns in AE signals, allowing accurate classification and anomaly detection. In contrast to 2D-CNNs, this methodology processes 1D signals directly, yet retains the capability of autonomously learning complex features from the training samples.

Although 1D-CNNs have been used successfully in various signal processing applications, their application to filtering AE signals from concrete specimens has yet to be investigated. To date, there has been no systematic study of the performance of 1D-CNNs in this context, and the optimal network structure and hyper-parameters for this application have not been determined. This gap limits the understanding of the capabilities and limitations of 1D-CNNs for filtering AE signals recorded from concrete. Accordingly, the objective of this research is to examine the performance of 1D-CNNs in filtering AE signals and to determine the optimal network structure and hyper-parameters for this application.

1D-CNNs consist of three main components: the convolution layer, the pooling layer and the fully connected layer. The input layer of the 1D-CNN receives the relevant features, which are then subjected to convolution operations with the appropriate convolution kernels to generate the input feature maps. These maps are then passed through an activation function to produce the output feature maps from the convolution layer, as expressed by [45].

$$z_j^l = f\left( {a_j^l + \sum \limits_{i \in {\text{M}}_j } Conv1D(\omega_{ij}^{l - 1} ,y_i^{l - 1} ) } \right)$$
(8)

The output of the jth neuron at layer l represented as \(z_j^l\) is calculated by combining the scalar bias, \(a_j^l\) and the input maps, Mj, selected from the previous layer, through a nonlinear function, f(.). The input maps are determined by the output, \(y_i^{l - 1}\), of the ith neuron at layer l-1, and the kernel weight, \({{\omega }_{ij}}^{l-1}\), connecting the ith neuron at layer l-1 to the jth neuron at layer l.

The convolution layer is followed by a pooling layer, which serves the dual purpose of reducing computational costs through feature dimension reduction and enhancing feature translation invariance. The formula for the pooling layer is described in [45].

$$p_j^{l + 1} = f(\beta_j^l {\text{down}}\left( {z_j^l } \right) + a_j^{l + 1} )$$
(9)

In the pooling layer, the subsampling function "down(.)" is applied to \(z_j^l\), multiplied by a weighting coefficient \(\beta_j^l\) and offset by a bias \(a_j^{l + 1}\). The output of each pooling layer neuron serves as the input to the neurons of the fully connected layer, which typically acts as the classifier in the 1D-CNN.
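A minimal sketch of these two operations, a single convolution with ReLU activation (Eq. (8)) followed by non-overlapping average pooling (the "down(.)" function in Eq. (9)), is given below; the kernel weights, bias and input sequence are arbitrary toy values, not trained parameters.

```python
# Toy forward pass through one convolution and one pooling layer.

def relu(x):
    return max(0.0, x)

def conv1d(signal, kernel, bias):
    """Valid-mode 1D convolution with ReLU activation, as in Eq. (8)."""
    k = len(kernel)
    return [relu(bias + sum(kernel[t] * signal[i + t] for t in range(k)))
            for i in range(len(signal) - k + 1)]

def avg_pool(feature_map, size):
    """Non-overlapping average pooling, the 'down(.)' of Eq. (9)."""
    return [sum(feature_map[i:i + size]) / size
            for i in range(0, len(feature_map) - size + 1, size)]

x = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, -1.0]
fmap = conv1d(x, kernel=[0.5, -1.0, 0.5], bias=0.1)   # length 8 - 3 + 1 = 6
pooled = avg_pool(fmap, size=2)                        # length 3
print(len(fmap), len(pooled))                          # 6 3
```

The pooling step halves the feature dimension here, which is the computational saving the paragraph above refers to.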

3 Experimental Setup

In the laboratory, an RC beam measuring 2350 × 250 × 150 mm and designed for flexural failure was cast using the concrete mix design specified in Table 1 and the reinforcement details shown in Fig. 1. A three-point bending test was carried out in a universal testing machine with a monotonic loading rate of 30 N/s. To monitor the failure process of the beam, the AE measurement was synchronized with the load–deflection test. A MISTRAS AE data acquisition system consisting of a PAC-SAMOS 8 apparatus, piezoelectric sensors and pre-amplifiers was used to record AE data during the test. Eight R15α AE sensors with a resonance frequency of 150 kHz were mounted on the beam using silicone grease, as shown in Fig. 1. The sensor locations were designed taking into account the attenuation of the concrete, ensuring that the distance between the two farthest sensors did not exceed 1 m; under these conditions, all sensors were able to receive all potential AE activity. Furthermore, 3D-localized events were required to be detected by at least four AE sensors so that four distance equations could be constructed and the four unknowns (the source coordinates and the time of origin) solved. To verify that the AE sensors were well coupled, the Hsu–Nielsen technique, in accordance with ASTM E976, was used to generate artificial sound waves by breaking pencil leads at a 45° angle to the beam surface near each sensor during the pre-calibration of the AE acquisition. This calibration test was also used to calculate the wave velocity along all paths, which was then used in the source localization algorithm to solve the inverse problem.
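For illustration, the localization inverse problem can be sketched as follows. The sensor coordinates, wave velocity and source position are invented for this example, and a coarse grid search stands in for the solver used in practice; with noise-free synthetic arrival times from at least four sensors, the grid point nearest the true source minimizes the arrival-time residual.

```python
import math

# Toy 3D source localization from P-wave arrival times (synthetic data).
SENSORS = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.25, 0.15),
           (1.0, 0.25, 0.0), (0.5, 0.0, 0.15)]          # sensor positions, m
V = 4000.0                                              # assumed velocity, m/s

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def locate(arrivals, step=0.05):
    """Grid-search the source position; for a fixed position, the origin
    time t0 minimizing the residual is the mean of (t_i - d_i / V)."""
    best, best_err = None, float("inf")
    xs = [i * step for i in range(int(1.0 / step) + 1)]
    ys = [i * step for i in range(int(0.25 / step) + 1)]
    zs = [i * step for i in range(int(0.15 / step) + 1)]
    for x in xs:
        for y in ys:
            for z in zs:
                d = [dist((x, y, z), s) for s in SENSORS]
                t0 = sum(t - di / V for t, di in zip(arrivals, d)) / len(d)
                err = sum((t - t0 - di / V) ** 2
                          for t, di in zip(arrivals, d))
                if err < best_err:
                    best, best_err = (x, y, z), err
    return best

true_src, true_t0 = (0.45, 0.10, 0.05), 1.0e-3
times = [true_t0 + dist(true_src, s) / V for s in SENSORS]
print(locate(times))   # recovers the true source on this grid
```

In practice, the arrival times come from threshold crossings of the P-wave onsets, and the measured path-dependent wave velocities replace the single assumed value V.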

Table 1 Mix design of the concrete used for production of the RC beam (kg/m3)
Fig. 1
figure 1

Loading and AE measurement setup of RC beam

4 Proposed Filtering Methodology

4.1 Signal Preprocessing and Feature Extraction

In this study, a novel filtering framework is presented to separate clean AE signals from noise-induced signals and thereby improve the accuracy and reliability of structural damage assessment using suitable signal processing and deep learning approaches. The steps followed, from signal preprocessing to the development and testing of the proposed filtering model, are shown in Fig. 2.

Fig. 2
figure 2

Schematic flow of proposed filtering methodology for the classification of AE signals

Initially, the AE data acquired from the test were pre-processed through the threshold and the Swansong II filter. This step is essential for signal processing, as it enhances the quality and relevance of the data by removing noise and irrelevant information. The pre-amplifier gain and the acquisition threshold were both set to 40 dB to amplify the AE signals and eliminate the effect of ambient noise. The traditional parameter-based Swansong II filter was applied with the duration–amplitude (D–A) rejection limits shown in Table 2.

Table 2 Swansong II filter, duration–amplitude (D–A) rejection limits [46]

To create a dataset of two distinct AE signal classes for the deep learning model, these limits were used to identify clean and noise-induced signals. The application of the Swansong filter to the AE dataset is shown in Fig. 3. The filter acts as a discriminator, efficiently splitting the AE signals into two clusters based on their inherent amplitude and duration characteristics. The resulting clustered representation of the signals reveals distinctive patterns within the signal data.
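The mechanics of such a duration–amplitude rejection filter can be sketched as follows; the amplitude bands and maximum durations below are placeholders, not the actual limits of Table 2. The principle is that hits with unusually long durations for their amplitude band are treated as noise-induced.

```python
# Sketch of a Swansong II-type duration-amplitude rejection filter.
# The limits are illustrative placeholders, NOT the values of Table 2.
REJECTION_LIMITS = [     # (min dB, max dB, max allowed duration in us)
    (40, 45, 400),
    (45, 50, 600),
    (50, 55, 1000),
]

def is_clean(amplitude_db, duration_us):
    """Return True if the hit passes the D-A limits (class 'clean')."""
    for lo, hi, max_dur in REJECTION_LIMITS:
        if lo <= amplitude_db < hi and duration_us > max_dur:
            return False     # long duration at low amplitude -> noise
    return True

hits = [(42, 120), (42, 900), (48, 500), (60, 2000)]   # (dB, us) pairs
print([is_clean(a, d) for a, d in hits])   # [True, False, True, True]
```

Applying such a rule to every recorded hit yields the two clusters used as class labels for the deep learning model.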

Fig. 3
figure 3

Utilizing the Swansong filter for finding distinct patterns in AE signals

Following the signal clustering stage, the next step is feature extraction, which condenses the relevant information embedded in the identified signal clusters. AE signals acquired from the acquisition system contain representative data about the elastic waves in the form of AE features, which carry valuable information for the characterization and localization of damage [47, 48]. In this study, besides common AE features such as amplitude, rise time, duration and energy, statistical time domain and frequency domain features of the AE signals were also included in the feature set listed in Table 3. By capturing intricate patterns and relationships within the signals, these features not only support the careful selection and computation of relevant inputs but also allow the deep learning model to operate more efficiently.

Table 3 AE features used for the dataset

4.2 Selection of Significant Features for Classification of AE Signals

The selection of significant features for the classification of AE signals is an important step in the analysis and interpretation of AE data. In this study, a total of 21,768 AE signals, together with their class labels, were used for feature selection. The regularization parameter λ is a hyper-parameter used in dimensionality reduction, feature selection and classification; it is applied as a regularization term in the objective function to avoid overfitting. The regularization term encourages the model to select fewer features, which can improve its generalization performance: as λ increases, the model becomes less sensitive to small fluctuations in the data and tends to retain only the most important features. The optimal value of λ depends on the specific dataset and can be selected using techniques such as cross-validation. Here, λ was tuned on the dataset split in an 80/20 ratio (80% of the data in the training set and the remaining 20% in the validation set) with fivefold cross-validation. The mean squared error (MSE) of the loss values indicates how well the model predicts the true values, giving a measure of its accuracy and reliability:

$${\text{MSE}} = \frac{1}{s}\mathop \sum \limits_{i = 1}^s \left( {y_i - \hat{y}_i } \right)^2$$
(10)

The results of the NCA analysis were recorded, and a minimum loss of 0.059 was achieved at λ = 0.00024, as shown in Fig. 4. The NCA feature selection was then executed with this best λ and minimum loss value to identify the significant AE features.
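The tuning loop has the generic shape sketched below: fivefold cross-validation over a grid of λ values, keeping the λ with the smallest validation MSE. A trivial shrinkage estimator stands in for the NCA model here, so the data and the selected value are purely illustrative of the mechanics, not of the study's result.

```python
import random

def kfold_mse(values, lam, k=5):
    """Mean validation MSE across k folds for a stand-in shrinkage
    estimator (NOT NCA); it only demonstrates the selection loop."""
    vals = list(values)
    random.Random(0).shuffle(vals)
    folds = [vals[i::k] for i in range(k)]
    total = 0.0
    for i in range(k):
        train = [v for j, f in enumerate(folds) if j != i for v in f]
        pred = sum(train) / len(train) / (1.0 + lam)   # shrunken mean
        total += sum((v - pred) ** 2 for v in folds[i]) / len(folds[i])
    return total / k

rng = random.Random(1)
data = [rng.gauss(5.0, 1.0) for _ in range(100)]       # synthetic samples
grid = [0.0, 1e-4, 1e-3, 1e-2, 1e-1]                   # candidate lambdas
best_lam = min(grid, key=lambda lam: kfold_mse(data, lam))
print(best_lam, round(kfold_mse(data, best_lam), 3))
```

The same loop applies unchanged when the stand-in estimator is replaced by the NCA fit and its loss.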

Fig. 4
figure 4

MSE loss values based on regularization parameter (λ)

For the selection of significant features, feature weights for all the AE parameters were estimated based on the best λ and MSE values, as shown in Fig. 5. To avoid biased weight estimates, the weights were normalized and a threshold of 0.01 was set, as shown in Fig. 6.

Fig. 5
figure 5

Feature weighting based on best λ and MSE values

Fig. 6
figure 6

Normalized feature weights based on the best λ and MSE values (the feature index and corresponding AE parameters are given in Table 3)

In this study, the performance of the AE signal classification model was evaluated by analyzing the normalized weights of 26 AE parameters. Normalized weights measure the relative importance of each AE parameter in explaining the variation in the dataset. Unit normalization was adopted to preserve the relative magnitudes of the weights, making them easier to compare across the wide range of values in the dataset; the weights were calculated by dividing each absolute weight by the sum of all absolute weights, so that they sum to 1. The larger the normalized weight, the more important the parameter. Twelve of the 26 AE parameters had normalized weights below the pre-defined threshold. These parameters were deemed insignificant, as they contributed very little to the variation in the data and provided no useful information for the classification of AE signals; they were therefore removed, resulting in a simpler and more accurate model. The remaining 14 parameters, with normalized weights above the threshold, were deemed significant, indicating that they contribute substantially to the variation in the data and provide useful information for classification. These AE features, listed in Table 4, were used for AE signal classification with the 1D-CNN model in the next section.
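The normalize-and-threshold step can be sketched in a few lines; the raw weights and feature names below are made up and do not correspond to the values in Fig. 6.

```python
# Toy illustration of feature selection by normalized-weight thresholding.
# Raw weights and names are illustrative, not the study's actual values.
raw_weights = {"amplitude": 0.8, "rise_time": 0.02, "duration": 0.5,
               "energy": 1.6, "counts": 0.01, "peak_freq": 0.3}

total = sum(abs(w) for w in raw_weights.values())
normalized = {f: abs(w) / total for f, w in raw_weights.items()}

THRESHOLD = 0.01
selected = sorted(f for f, w in normalized.items() if w >= THRESHOLD)
print(selected)   # ['amplitude', 'duration', 'energy', 'peak_freq']
```

Features whose normalized weight falls below the cut-off ("rise_time" and "counts" in this toy case) are dropped before training the classifier.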

Table 4 Selected significant AE features for the dataset

From Fig. 6, it was observed that absolute energy (feature index no. 23) has the highest normalized weight of 1 among all the considered AE parameters. This parameter measures the overall intensity of an AE event and can be used to differentiate AE signals generated by different sources: a higher absolute energy generally indicates a stronger, more intense AE event, which is useful for identifying the presence and severity of structural damage. The high normalized weight assigned to absolute energy suggests that it is a strong predictor of the variation in the data and can improve the accuracy of AE signal classification. More broadly, the analysis shows the value of using normalized weights to identify which parameters are most significant in explaining the variation in the data: by identifying and removing insignificant features, the model can be simplified and its performance improved.

4.3 AE Signal Classification by 1D-CNN

In this study, the 1D-CNN model was employed to classify a dataset of 21,768 AE signals obtained from the bending test of the RC beam. The general architecture of the 1D-CNN is shown in Fig. 7. Two input classes, clean AE signals and noise-induced signals, were fed into the model, each described by the 14 most significant features obtained from the NCA feature selection algorithm. The two classes were labeled 1 and 2, denoting clean AE signals and noise-induced signals, respectively. The 1D-CNN was built with two convolutional layers, each with a ReLU (rectified linear unit) activation function and a normalization layer, using a kernel size of 3 and 32 filters. A global average pooling layer was placed after the convolutional layers to reduce the computational cost and the feature dimension. The final layer was a fully connected layer acting as the classifier, using the softmax function to assign the AE signals to the clean and noise categories in the output layer.
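As a sanity check on this architecture, the layer output shapes and trainable parameter counts can be tallied by hand. 'Same' padding is assumed here so the 14-feature input length is preserved through the convolutions, and the normalization-layer parameters are omitted for brevity; both are assumptions, not details stated by the setup above.

```python
# Shape and parameter bookkeeping for a 1D-CNN with two Conv1D layers
# (32 filters, kernel size 3), global average pooling and a 2-class
# softmax output. 'Same' padding assumed; normalization params omitted.

def conv1d_layer(length, in_channels, filters, kernel):
    params = (kernel * in_channels + 1) * filters   # weights + biases
    return length, filters, params                  # 'same' padding

def dense_layer(in_units, out_units):
    return (in_units + 1) * out_units               # weights + biases

length, channels = 14, 1                # 14 NCA-selected features, 1 channel
length, channels, p1 = conv1d_layer(length, channels, 32, 3)
length, channels, p2 = conv1d_layer(length, channels, 32, 3)
pooled = channels                       # global average pooling -> 32 values
p3 = dense_layer(pooled, 2)             # clean vs noise-induced classes
print(p1, p2, p3, p1 + p2 + p3)         # 128 3104 66 3298
```

The tally shows why the network stays small: global average pooling collapses the 14 x 32 feature map to 32 values, so the classifier head needs only 66 parameters.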

Fig. 7
figure 7

The general structure of a 1D-CNN

While the depth of the CNN model, kernel size and number of filters impact the spatial resolution of feature maps, the decision on the batch size, number of epochs and learning rate further contribute to the efficient convergence and fine-tuning of the model. These network parameters play a crucial role in optimizing and fine-tuning the network through an empirical method. The batch size refers to the number of training samples used in each iteration, and the number of epochs determines the times the learning algorithm goes through the entire training dataset. The learning rate, on the other hand, is a tuning parameter in an optimization algorithm that determines the step size at each iteration as it moves toward the minimum of a loss function.

In this study, these parameters were carefully tuned to identify the most effective combination for AE signal classification. The batch size influenced the training speed and generalization; the maximum number of epochs regulated the training duration to avoid the risks of overfitting and underfitting; and the learning rate governed the convergence and stability of the model. The ranges used for fine-tuning the network are listed in Table 5.

Table 5 Different network parameters and their values

For the development and evaluation of the model, the dataset of 21,768 AE signals was randomly divided into training, validation and testing sets. The size of the dataset has a significant impact on a CNN's performance and its ability to generalize, and with around 22,000 AE signals it was assumed that the 1D-CNN model would perform satisfactorily. Precisely 70% of the AE dataset, 15,238 signals, was allocated to training; 15%, 3265 signals, to validation; and the remaining 15%, 3265 signals, to testing. The model was trained on the training set and fine-tuned via the validation set to optimize the network in a supervised manner and minimize the error in predicting the AE signal classes. All training, validation and testing results are reported in terms of model accuracy and discussed in the 1D-CNN classification optimization section.
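The split can be reproduced with simple index shuffling (the seed below is arbitrary); rounding 70% and 15% of 21,768 yields exactly the 15,238/3265/3265 partition reported above.

```python
import random

# 70/15/15 random split of dataset indices; the seed is arbitrary.
def split_indices(n, train=0.70, val=0.15, seed=0):
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train)
    n_val = round(n * val)
    return (idx[:n_train],                       # training indices
            idx[n_train:n_train + n_val],        # validation indices
            idx[n_train + n_val:])               # testing indices

train_idx, val_idx, test_idx = split_indices(21768)
print(len(train_idx), len(val_idx), len(test_idx))   # 15238 3265 3265
```

The three index sets are disjoint and cover the full dataset, so every signal is used exactly once.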

4.4 1D-CNN Classification Optimization

4.4.1 Minimum Batch Size

The effect of different batch sizes on the classification performance of the 1D-CNN model was analyzed by evaluating the trend in the classification accuracy of the training and validation samples, as shown in Fig. 8. Following the values listed in Table 5, the analysis showed that at a batch size of 25 the training and validation accuracies were 96.11% and 95.71%, respectively. When the batch size was increased to 50, the trend remained consistent, but at a batch size of 75 the network behaved anomalously, with a validation accuracy of 96.45% exceeding the training accuracy of 95.90%. To rectify this, the network was trained and tested with larger batch sizes. The optimal batch size was found to be 125, which achieved a training accuracy of 96.36% and a validation accuracy of 96.26%; further increases in batch size degraded both accuracies.

Fig. 8
figure 8

Model’s accuracy trend based on min. batch size

4.4.2 Maximum Epochs

The classification performance of the 1D-CNN model was analyzed for different maximum-epoch values, as indicated in Table 5. The trend of the classification accuracies of both the training and validation samples as a function of maximum epochs is illustrated in Fig. 9. Initially, with maximum epochs set as low as 75, the validation accuracy exceeded the training accuracy, suggesting that the network had not yet fully converged. When the maximum epochs were increased to 125, the network achieved its highest training and validation accuracies of 97.15% and 97.05%, respectively. Hence, a maximum of 125 epochs was considered optimal for the network. Increasing the maximum epochs beyond 125 decreased both the training and validation accuracies.

Fig. 9
figure 9

Model’s accuracy trend based on max. epochs

4.4.3 Learning Rate

The trend in the classification accuracies of both the training and validation samples as a function of the learning rate is shown in Fig. 10. As the learning rate increased from small values, both training and validation accuracies improved. The highest classification accuracy was achieved at a learning rate of 0.001, with training and validation accuracies reaching 97.34% and 97.05%, respectively. By contrast, a marked decrease in both accuracies was observed when the learning rate exceeded 0.001, indicating an optimal learning rate of 0.001 for the network.
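The tuning procedure of Sects. 4.4.1–4.4.3 amounts to a sweep over batch size, maximum epochs and learning rate, retaining the setting with the best validation accuracy. A schematic sketch of such a sweep (the `train_and_validate` callback is a hypothetical stand-in for actually training the 1D-CNN; the candidate values are illustrative):

```python
from itertools import product

# Candidate values mirroring the ranges explored in the tuning study.
BATCH_SIZES = [25, 50, 75, 100, 125, 150]
MAX_EPOCHS = [50, 75, 100, 125, 150]
LEARNING_RATES = [1e-4, 1e-3, 1e-2]

def tune(train_and_validate):
    """Grid search over the three hyper-parameters. `train_and_validate`
    trains the network for one (batch, epochs, lr) setting and returns
    its validation accuracy; the best setting and accuracy are kept."""
    best_cfg, best_acc = None, -1.0
    for cfg in product(BATCH_SIZES, MAX_EPOCHS, LEARNING_RATES):
        acc = train_and_validate(*cfg)
        if acc > best_acc:
            best_cfg, best_acc = cfg, acc
    return best_cfg, best_acc
```

With an objective peaking at (125, 125, 0.001), as reported in Table 6, the sweep recovers exactly that combination.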

Fig. 10
figure 10

Model’s accuracy trend based on learning rate

5 Results and Discussion

5.1 AE Analysis and Experimental Load Test

After the failure of the test beam, the damage process was evaluated by relating the AE parameters to time and the corresponding load values. This relationship was developed for both the raw and clean AE signals to highlight the importance of filtering. To investigate the mechanical behavior of the beam and its AE activity, AE parameters such as the number of AE hits and the AE energy were synchronized with the load.

To associate the mechanical and AE findings, Figs. 11 and 12 show the AE parameter analysis results for both raw and clean AE signals. The load is plotted in blue against the AE hits and AE energy, shown as red lines and red dots. In both cases, the AE activity aligns with the load while the beam is still within its proportional limit. When the beam reaches its yield point, there is a prominent rise in AE hits and AE energy, indicating crack initiation. At around 500 s, the beam started losing its load-bearing capacity, indicating the expansion of cracks. At the ultimate load, a large number of AE hits and high AE energies were recorded due to the sudden collapse of the beam. At every stage of the behavior, there are more AE hits in the raw distribution than in the clean one. This is highlighted by incorporating a 200-point moving-average trend of AE energy (shown in black) into the load-time graph. In Fig. 12a, around 500 and 1000 s, there is a sudden change in the AE-energy trend caused by abruptly higher AE-energy values, even though only a very small decrease in load was observed. This was due to noise-related AE signals with AE energies exceeding 500 aJ. In Fig. 12b, with the clean AE data, this abrupt change in trend is reduced after filtering.
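The smoothed energy trend referred to above is a simple moving average; a minimal sketch, assuming the per-hit AE energies are held in a NumPy array (names are ours, not from the study):

```python
import numpy as np

def moving_average(values, window=200):
    """Moving average used to smooth the AE-energy trend: each output
    point is the mean of `window` consecutive input values, so the
    output has len(values) - window + 1 points."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")
```

A 200-point window, as used in Fig. 12, suppresses isolated high-energy hits while preserving sustained changes in the trend.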

Fig. 11
figure 11

AE parameter analysis results: a AE hits vs load on raw signals. b AE hits vs load on clean AE signals

Fig. 12
figure 12

AE parameter analysis results: a AE energy vs load on raw signals. b AE energy vs load on clean AE signals

In Fig. 13, the significance of the AE parameters amplitude and duration is presented for both raw and clean AE data. As shown in Fig. 13a, the raw data contained low-amplitude, long-duration signals scattered away from the cluster of clean signals; such outlying data points can result in incorrect identification of damage.

Fig. 13
figure 13

AE parameter analysis results: a Amplitude vs duration on raw signals. b Amplitude vs duration on clean AE signals

5.2 1D-CNN Model Results

Based on the investigation of the effect of various hyper-parameters on the classification performance of the 1D-CNN model, the optimal combination of network parameters was determined and is presented in Table 6. Using these optimal hyper-parameters, the model was trained, validated and then tested on the 15% testing samples. The model's performance was evaluated using a confusion matrix, with the results shown in Fig. 14.

Table 6 Optimal network parameters for the proposed model
Fig. 14
figure 14

Performance accuracy of the model based on confusion matrix

The performance of a classification model can be assessed with a confusion matrix, which tabulates the model's observations by true class and predicted class. As illustrated in Fig. 14, the confusion matrices for the training and testing data display the numbers of clean AE signals and noise-induced signals. The results show that the prediction accuracies with the optimal hyper-parameters during the training and testing phases were 97% and 96.29%, respectively, closely aligned with the individually calculated classification accuracies of the optimal parameters obtained during network tuning.
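For a binary problem such as this one, the confusion matrix and the accuracy derived from it can be computed as follows (a generic sketch, not the authors' code; labels are assumed to be 1 = clean AE signal, 0 = noise-induced signal):

```python
import numpy as np

def confusion_matrix(y_true, y_pred):
    """2x2 confusion matrix for binary labels (0 or 1).
    Rows index the true class, columns the predicted class."""
    cm = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def accuracy(cm):
    """Fraction of observations on the matrix diagonal, i.e. the
    proportion of correctly classified signals."""
    return np.trace(cm) / cm.sum()
```

The diagonal holds the correctly classified clean and noise-induced signals; the off-diagonal cells are the two kinds of misclassification discussed for the unseen dataset below.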

5.3 Verification of Proposed Filtration Model with Unseen Data

To evaluate the efficiency and applicability of the AE filtration model developed in this study, it was applied to a real-world scenario. For this purpose, AE data from concrete cracking events collected under four-point bending of a concrete beam (Fig. 15), described in [49], was used as a dataset unseen by the model. The model was verified in terms of the source location of the cracks that developed in the concrete beam during testing. AE signals generated during the loading of concrete structures can contain varying proportions of clean AE signals and noise-induced signals, and accurately determining the source location of cracks requires effectively distinguishing between the two. This can be achieved with appropriate signal processing techniques supported by machine learning or deep learning, such as the AE filtration model evaluated in this study. The model's performance was assessed in terms of its ability to accurately predict the source location of cracks from the clean AE signals.

Fig. 15
figure 15

Four-point bending test setup & AE sensor locations

The experiment was conducted on a shear-deficient, 3200-mm-long RC beam with a cross section of 150 × 250 mm. The beam was tested under four-point bending using a hydraulic jack with a 300 kN compressive and 150 kN tensile load capacity. The surface of the test specimen was equipped with eight 150 kHz AE sensors, as shown in Fig. 15. The behavior of the beam was observed and monitored during the test by recording the load, vertical displacement and AE activity. The AE data was recorded using an 8-channel DiSP AE system by Mistras Holdings. The AE signals were amplified by 40 dB using pre-amplifiers, and the amplitude threshold was set to 40 dB.

For verification purposes, instead of using traditional parameter-based filters, the raw AE data acquired from the AE data acquisition system was directly tested on the filtration model to predict clean AE signals and noise-induced signals. Furthermore, these predicted results were used for the source location analysis of cracks in the specimen.

A considerable amount of research on damage identification and source localization using the AE technique has been published [50,51,52,53,54]. Here, the source locations of the AE signals were determined from arrival times picked with the Akaike information criterion (AIC) picker algorithm [55]. An AE event was recorded only when a minimum of four sensors were triggered by the same source. Only the significant AE parameters determined by the NCA method were considered for the classification of AE signals, so that the filtration model's results could be evaluated against the cracks visually observed on the test specimen. A new dataset of 44,799 AE signals, referred to as the "unseen dataset", was prepared and tested with the filtration model. The model's performance accuracy in terms of the confusion matrix is shown in Fig. 16. The model's predictions on the unseen dataset were compared with the true labels produced by the Swansong II filter. Moreover, the binary classification was evaluated with several performance metrics, including accuracy, true positive rate (also referred to as sensitivity or recall), true negative rate (specificity), precision (positive predictive value) and F1 Score. The results of these evaluations are presented in Table 7.
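The AIC picker estimates the P-wave onset as the sample index that minimizes the Akaike information criterion over the record. A minimal sketch following the standard formulation AIC(k) = k·log(var(x[0:k])) + (N−k−1)·log(var(x[k:N])), not the authors' exact implementation:

```python
import numpy as np

def aic_picker(x):
    """AIC onset picker: the arrival index is where the record splits
    into two segments (pre-onset noise, post-onset signal) whose
    combined log-variance penalty is smallest."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    aic = np.full(n, np.inf)
    for k in range(2, n - 2):  # keep both segments non-trivial
        v1, v2 = np.var(x[:k]), np.var(x[k:])
        if v1 > 0 and v2 > 0:
            aic[k] = k * np.log(v1) + (n - k - 1) * np.log(v2)
    return int(np.argmin(aic))
```

On a synthetic record of low-level noise followed by a high-amplitude waveform, the minimum of the AIC curve lands at the transition, which is why the picked arrival times can feed directly into source localization.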

Fig. 16
figure 16

Model performance accuracy based on confusion matrix on unseen data

Table 7 Performance metrics results of AE filtration Model

The results of this study indicate that the overall accuracy of the model was high during both the training and testing phases, at 97% and 96.26%, respectively, in classifying clean AE signals and noise-induced signals. For the unseen dataset the accuracy was 92.8%, lower than during training and testing because of inherent variability in the unseen data that the model had not encountered. Nevertheless, 92.8% is still a high accuracy for predicting clean AE signals and noise-induced signals. The proposed filtering model, developed from around 22,000 AE signals, yielded promising results across all datasets listed in Table 7, with accuracies above 90%, demonstrating that a dataset of this size is sufficient for effective filtering of noise from AE signals. The confusion matrix for the unseen data (Fig. 16) shows that, of 40,742 clean AE signals, 2518 were misclassified as noise-induced signals; additionally, 696 of the 843 noise-induced signals were misclassified as clean signals.

The model's true positive rate (TPR) was high in all phases, ranging from 0.94 to 0.99. The TPR measures the model's ability to correctly classify clean AE signals, and these values emphasize the robustness of the model in accurately classifying clean AE signals in the presence of noise. The true negative rate (TNR), however, was lower: 0.80 during training, 0.73 during testing and 0.55 on the unseen dataset. A TNR of 1 would indicate that the model identified all noise-induced AE signals, so these lower values indicate that the model was less successful in identifying noise-induced signals. This is attributable to the imbalance between clean AE and noise-induced signals in the dataset. These results highlight the need for further research on balanced datasets, although in practice it is difficult to acquire balanced AE data during acquisition; hence, the study was performed under this realistic constraint. Notably, the low TNR on the unseen dataset means that almost half of the noise-induced signals were retained as clean AE signals, which at least ensures that useful information is not lost during filtration. The impact of the lower TNR is further examined in the source location analysis for accurate crack identification in the specimen.

Another performance metric is precision, which measures the proportion of the model's positive predictions that are actually correct. The precision of the model was high during both the training and testing phases, at 0.975 and 0.965, respectively, and reached 0.98 on the unseen dataset. This higher precision on the unseen dataset indicates that only about 2% of the model's positive predictions on new data were false positives.

On the other hand, the F1 Score measures the balance between precision and TPR; a high F1 Score indicates that the model makes accurate positive predictions while also maintaining a high TPR. The F1 Score of the model was high during both the training and testing phases, at 0.98 and 0.977, respectively, but dropped to 0.93 on the unseen dataset. These results highlight the importance of evaluating the F1 Score on unseen data when working with an imbalanced dataset, as it provides a more comprehensive measure of the filtration model's accuracy.
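All of the metrics in Table 7 follow from the four confusion-matrix counts, with clean AE signals taken as the positive class. A minimal sketch (the function name is ours):

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, TPR (sensitivity/recall), TNR (specificity), precision
    and F1 Score from confusion-matrix counts; clean AE signals are the
    positive class, noise-induced signals the negative class."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    tpr = tp / (tp + fn)        # true positive rate
    tnr = tn / (tn + fp)        # true negative rate
    precision = tp / (tp + fp)  # positive predictive value
    f1 = 2 * precision * tpr / (precision + tpr)
    return accuracy, tpr, tnr, precision, f1
```

Because F1 is the harmonic mean of precision and TPR, a large drop in TNR on an imbalanced dataset shows up in F1 even when accuracy and precision stay high, which is the behavior observed on the unseen dataset.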

To locate the clean and noise-induced signals in the RC beam, source localization was performed. The localization algorithm produced 900 events from the clean AE signals and 21 events from the noise-induced signals. To visualize the cracks on the test specimen, the located events of the clean and noise-induced signals of the unseen dataset were superimposed on the specimen, as shown in Fig. 17. The beam collapsed in shear failure, and the majority of the sources calculated from the clean signals fell accurately inside the specimen dimensions. By contrast, only two of the noise-signal events were located inside the specimen, with the majority falling outside its dimensions. These out-of-specimen events reflect the influence of the lower TNR, whereby almost half of the noise-induced signals were classified as clean. Since these are in fact noise-induced signals, they were not located inside the specimen, which suggests that the proposed AE filtration model can accurately classify AE signals and provide valuable insight for source location determination.
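In one dimension, the arrival-time difference between a sensor pair fixes the source position along the line joining them, given the wave speed. A minimal sketch of this relation (the study's actual algorithm requires at least four triggered sensors and is more elaborate; the values below are illustrative, with positions in mm and a wave speed of 4 × 10⁶ mm/s assumed for the example):

```python
def locate_1d(x1, t1, x2, t2, v):
    """1-D AE source position between sensors at x1 and x2 from
    P-wave arrival times t1, t2 and wave speed v: the arrival-time
    difference shifts the estimate away from the sensor midpoint."""
    return (x1 + x2) / 2 + v * (t1 - t2) / 2

# Source 1000 mm from the left sensor of a 3200-mm-long beam.
print(locate_1d(0.0, 2.5e-4, 3200.0, 5.5e-4, 4.0e6))  # 1000.0
```

Events whose computed positions fall outside the specimen dimensions, as with most of the noise-induced signals in Fig. 17, are a natural by-product of arrival times that do not originate from a physical source inside the beam.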

Fig. 17
figure 17

Visual cracks superimposed with the events of clean & noise-induced AE signals

Applying an AE filtration model originally developed from a three-point bending load test to another specimen subjected to a four-point bending test is a good test of the model on unseen data. Despite the differences in loading conditions, material, waveform velocity and class balance, the model successfully classified the AE signals and supported the subsequent AE source location analysis. This result highlights the model's potential for application in real-world scenarios. Further research could develop improved strategies, such as using a balanced dataset when building an early-stage AE filtration model.

6 Conclusions

A one-dimensional convolutional neural network (1D-CNN) deep learning approach was used in this study to improve the quality of acoustic emission (AE) filtering, reduce the amount of data without losing significant information and increase the reliability of the filtering method. The model utilizes the significant AE parameters identified by neighborhood component analysis (NCA) to separate true AE signals from noise-induced signals in an RC specimen. The results of the filtration model were also verified against the source locations of AE activity. Based on the results, the following conclusions are drawn.

  • Depending on the material properties of the specimen and the type of loading, 14 of the 26 AE parameters, namely mean, standard deviation, skewness, peak-to-peak value, peak amplitude, arrival time, rise time, energy, duration, amplitude, root mean square, absolute energy, frequency centroid and RA value, were determined by the NCA results to be significant, indicating an important contribution to the variation in the data for the classification of AE signals. Among these, absolute energy was the most significant parameter; it measures the overall intensity of an AE event, which is useful for identifying the presence and severity of structural damage.

  • The classification performance of the 1D-CNN filtration model is influenced by several hyper-parameters, including batch size, number of epochs and learning rate. After optimization, the best combination of network parameters was found to be a minimum batch size of 125, a maximum of 125 epochs and a learning rate of 0.001. With this combination, high classification accuracies of 97% and 96.29% were observed during the training and testing phases, respectively. These findings suggest that the selected network parameters are effective in accurately classifying the AE data.

  • The efficiency and applicability of the early-developed AE filtration model was also verified against the source-located AE activities collected during the four-point bending test. The overall accuracy of the model on the unseen dataset was 92.8%, lower than during the training and testing phases. It was concluded that, despite the change in loading type and the imbalanced dataset, 92.8% remains a high accuracy for predicting clean AE signals and noise-induced signals on unseen data.

  • The outcomes of this study demonstrate the potential of the AE filtration model to accurately classify AE signals obtained from RC beams in both three-point and four-point bending tests, and show that the proposed model has the potential to accurately locate cracks or sources.

  • The proposed filtering method could be integrated with AE devices using software enabling real-time filtering to prevent the collection of unnecessary data. This integration would enhance the efficiency of the monitoring system by focusing on relevant information, particularly in dynamic environments.