1 Introduction

Concrete and steel are the most widely used construction materials. A well-designed concrete structure is durable, and the concrete cover protects the steel from adverse environmental conditions. However, material degradation occurs over time [1], especially in coastal areas, where the high chloride content of sea water promotes corrosion [2]. The chloride ions percolate through the concrete cover, thereby depleting the passivating layer and initiating rebar corrosion [3]. The corrosion products are expansive and create surface-breaking cracks, thereby increasing the risk of spalling and further accelerating the penetration of chloride ions [4]. Even though corrosion may take a long time to become evident, the process adversely affects strength and durability. Therefore, detecting rebar corrosion in the early stages and understanding its severity level are necessary for efficient infrastructural asset management.

The inspection of reinforced concrete (RC) structures involves visual assessment and other non-destructive testing (NDT) techniques for decisions pertaining to repair and rehabilitation. However, detection of internal defects is a challenging task. There is ongoing research on the development of more effective and reliable non-destructive evaluation (NDE) methodologies for accurate damage detection and systematic management of infrastructure. The most familiar and widely used in-situ electrochemical technique for assessing corrosion vulnerability is the half-cell potential technique. Its application and implementation procedure are described in the ASTM C876 code of practice [5]. Work associated with this technique for evaluating the probability of rebar corrosion is reported in [6, 7].

Other NDT methods that use elastic waves, i.e., impact echo (IE) and acoustic emission (AE), are applied for flaw and void detection in concrete structures. The authors in [8] examined the influence of the impact duration in the IE technique on ascertaining the depth and location of cracks, through analysis of the phase spectrum of received signals. IE-based detection of voids in pre-stressed concrete girders using frequency domain methods is reported in [9]. The authors concluded that a shift in frequency to the lower regime is associated with the presence of defects. However, the efficiency of the IE method is influenced by wavelength limitations, and the technique fails to detect smaller sized defects in concrete structures. The AE method involves real-time in-situ monitoring through reception of elastic waves by special surface-mounted sensors in RC slabs to track the progressive development of cracks due to corrosion [10]. The effect of seasonal variations in temperature on the AE signals has been investigated in [11]. Further investigations related to the distribution of corrosion damage along the rebar in concrete specimens using AE have been presented in [12]. The relationship between the energy acquired through AE sensors and the crack growth rate in pre-stressed tendons in bridges is examined in [13, 14].

Ground penetrating radar (GPR) based assessment, an electromagnetic NDT technique, has been implemented to detect the location of steel reinforcement [15, 16] and to estimate the concrete cover depth and the size of voids [17, 18]. GPR can be used for subsurface imaging, and the effect of corrosion on GPR images is investigated in [16]. The authors report attenuation of the reflected signal amplitude originating from corroded rebars. Analysis of GPR signals in the time and frequency domains for detection of rebar corrosion in concrete slabs is presented in [19]. The authors observe a reduction in the amplitude of signals where there is a loss in the cross-sectional area due to pitting corrosion.

Ultrasonic NDT techniques are widely used, primarily for evaluating the quality of the concrete and for detection of subsurface flaws and voids. The ultrasonic guided waves (UGW) approach has been adopted for detection and location of rebar debonding in [20]. Detection of cracks generated due to rebar corrosion in concrete beams using UGW is explored in [21]. Reduction in peak-to-peak amplitudes of UGW signals due to rebar corrosion in concrete beams has been reported in [22]. Subsurface imaging with ultrasound using the synthetic aperture focusing technique (SAFT) has been pioneered by Schickert et al. [23] and subsequently applied for detection of artificial debonding around rebars in [24]. The authors in [25] investigate changes in the SAFT images of rebars subjected to various levels of accelerated corrosion. A comparative study on contact and non-contact based ultrasonic methods for imaging of rebars subjected to various stages of corrosion is presented in [26]. A gradual disappearance of the rebar signature in the SAFT images has been established as an indicator of progressive corrosion. Recent developments on advanced sensing and imaging systems have been presented in [27]. The authors in [28] implemented compressional sampling theory as a tool to improve classical ultrasonic tomography for damage localisation in structures. The SAFT imaging technique has been improved using spatial apodization filters in [29].

The literature review shows that the primary focus of NDT of concrete structures is on detection and localization of structural defects such as voids and cracks, estimation of concrete cover, and localization of rebars and prestressed tendons by utilizing time and frequency domain analysis and image processing algorithms. A few researchers have investigated statistical data driven approaches (Gaussian and Weibull mixture modelling) and machine learning to identify and characterize damage in materials like ceramic composites using AE parameters [30]. Similar AE parameter-based classification of subsurface damage in wind turbine blades using Gaussian mixture modelling (GMM) is reported in [31]. The classification of the extent of rebar corrosion in concrete structures using linear discriminant analysis is discussed in [32]. Identification of tensile and shear crack damage in a shear wall using AE sensors and GMM is presented in [33]. The authors in [34] propose a K-NN and a random forest-based method for quantitative evaluation of chloride induced corrosion in concrete beams. Similar work through application of random forest and artificial neural networks (ANN) for prediction of chloride concentration in marine concrete is reported in [35]. Evaluation of integrity and detection of defects in concrete pile foundations is demonstrated through deep learning in [36]. The application of linear regression and ANN for estimation of the initiation time of rebar corrosion is presented in [37], and prediction of the mechanical properties of rubberised concrete is investigated in [38]. Photographs of cracked and uncracked concrete bridge decks and pavements were used to determine the crack length and width using convolutional neural networks (CNN) [39].

The above summary demonstrates that there is a growing interest in the implementation of statistical and machine learning driven approaches for classification of damage severity in concrete structures. In this paper, the authors present a novel data driven approach using features extracted from SAFT images to classify various levels of corrosion. To induce corrosion, a laboratory based accelerated corrosion setup is developed. Ultrasonic reflection data is acquired by scanning over the rebar locations in a \(4\times 4\) grid and SAFT images are generated. The features extracted from these images are input into a GMM classifier, which identifies corrosion severity. To the best of the authors' knowledge, image-based classification of corrosion and cracking severity in concrete structures has not been reported previously in the literature.

The paper is organised as follows. The details of the mix design, sample dimensions and rebar arrangement are presented in Sect. 2. The laboratory-based accelerated corrosion setup is described in Sect. 3. The ultrasonic scanning methodology and the SAFT imaging approach are discussed in Sects. 4 and 5. The imaging results are presented in Sect. 6. The classifier is discussed in detail in Sect. 7. The paper ends with concluding remarks.

2 Experimental Details

The geometrical arrangement of the rebars within the concrete slab samples is shown in Fig. 1. Three prismatic specimens of dimensions \(450 \mathrm{mm}\times 450 \mathrm{mm}\times 120 \mathrm{mm}\) are fabricated with a grid of four embedded rebars of 16 mm diameter in the x-direction and four 12 mm diameter rebars in the y-direction. The length of the rebars is 350 mm inside the specimen. Ordinary Portland cement (53 MPa grade) and 20 mm nominal sized coarse aggregates were used to cast the specimens. In the concrete slabs, the side cover is 40 mm, and a clear cover of 40 mm from the top is provided.

Fig. 1

Schematic diagram of a cross section of slabs 1 and 2

The rebar nomenclature along the \(X\) and \(Y\) axes is defined as \({X}_{m}^{n}\) and \({Y}_{m}^{n}\), where \(m\) refers to the rebar number and \(n\) refers to the slab specimen number.

The mix proportions (by weight) of the concrete ingredients are shown in Table 1. The RC slabs were cured for 28 days in a curing tank before the experimental inspection. The applied ultrasonic scanning technique is explained in Sect. 4.

Table 1 Concrete mix design

3 Accelerated Corrosion Setup

The natural corrosion process may take a long time; to expedite it, an accelerated corrosion technique is implemented. In the present setup, a constant potential of 30 V is applied using a DC source, following the developments in [8, 25, 26]. A tank is filled with 5% NaCl solution (electrolyte) and the sample is partially immersed up to the level of the 16 mm rebar, i.e., the depth of the saline water is 40 mm.

The rebar is connected to the positive terminal and a copper plate is connected to the negative terminal of the DC source. Each corrosion cycle is of 24 h duration during which the current is recorded at intervals of 15 s, using a data acquisition system, as shown in Fig. 2.

Fig. 2

Setup of Accelerated Corrosion for slabs

4 Ultrasonic Scanning Methodology

Ultrasonic scanning is performed above each rebar on the slab surface at various stages of corrosion. Compressional wave transducers with centre frequency of 250 kHz were used for transmission and reception in a tied-together mode along the rebar axis. The excitation signal was triggered with a pulser-receiver setup as shown in Fig. 3. The acquired A-scans were digitized in an oscilloscope (5 MHz sampling frequency) in a time window of 500 μs. Petroleum jelly is used as the couplant between the transducers and the concrete surface.

Fig. 3

Ultrasonic scanning setup

The slab surface is marked with grid lines at an interval of 10 mm in both directions. A tied-together method of scanning is followed, where the source and receiver transducers are placed next to each other, and waveforms are acquired at various grid points along a linear aperture. The experimental inspection is carried out over the rebar and each aperture consists of 30 inspection points for every corrosion stage. A total of \(30\times 8 \times 18=4320\) waveforms were acquired to generate SAFT images for all samples. The acquired signals are processed by the SAFT algorithm, and vertical cross-sectional images through the rebar axes are generated corresponding to various levels of corrosion.

5 Imaging Approach

Vertical cross-sectional images of the rebars at different corrosion stages are generated using the SAFT program developed in the MATLAB platform, following the explanations provided in [23,24,25,26]. The waveforms, acquired at various locations as shown in Fig. 4, are first normalised to the initial surface wave arrival to circumvent coupling issues.

Fig. 4

The Ultrasonic scanning approach

The Time of Flight (TOF) of the traveling ultrasonic compressional wave, originating from the source to a subsurface grid point and back to the receiver, is calculated through Eq. 1 as [25, 26]:

$$TOF=\left|\frac{{d}_{r}^{S}+{d}_{r}^{R}}{{V}_{c}}\right|$$
(1)

where \({d}_{r}^{S}\) = distance between the source transducer \({S}_{i}\) and the grid point \(r(m,n)\); \({d}_{r}^{R}\) = distance between the grid point \(r(m,n)\) and the receiver transducer \({R}_{i}\); \(TOF\) = time of flight from \({S}_{i}\) to \({R}_{i}\) via the grid point \(r(m,n)\); and \({V}_{c}\) = velocity of the compressional wave.

The image value \({I}_{m,n}\) associated with the grid point \(r(m,n)\) is obtained as a sum of the contributions from the A-scans received from K source-receiver combinations, as shown in Eq. 2 [25, 26]:

$${I}_{m,n}={\sum }_{i=1}^{K}{A}_{i}(TOF)$$
(2)

where \({A}_{i}(t)\)= amplitude of the A-scan received at the receiver of the \({i}^{th}\) source-receiver pair.
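As an illustration of Eqs. 1 and 2, the delay-and-sum operation can be sketched in Python with NumPy. This is a minimal sketch and not the authors' MATLAB program: the function name `saft_image`, the nearest-sample lookup of \({A}_{i}(TOF)\), the wave velocity of 4000 m/s, the 20 mm transducer offset and the synthetic point reflector are all assumptions made for the example.

```python
import numpy as np

def saft_image(ascans, src_x, rcv_x, grid_x, grid_z, v_c, fs):
    """Delay-and-sum image on a vertical (x, z) grid, following Eqs. 1 and 2.

    ascans       : (K, T) array, one A-scan per source-receiver pair
    src_x, rcv_x : (K,) surface positions of source and receiver (m)
    grid_x       : 1-D array of lateral grid positions (m)
    grid_z       : 1-D array of grid depths (m)
    v_c          : compressional wave velocity (m/s)
    fs           : sampling frequency (Hz)
    """
    ascans = np.asarray(ascans, dtype=float)
    n_pairs, n_samples = ascans.shape
    X, Z = np.meshgrid(grid_x, grid_z)           # rows = depth, columns = lateral
    image = np.zeros(X.shape, dtype=float)
    for i in range(n_pairs):
        d_src = np.hypot(X - src_x[i], Z)        # source to grid point
        d_rcv = np.hypot(X - rcv_x[i], Z)        # grid point to receiver
        tof = (d_src + d_rcv) / v_c              # Eq. 1
        idx = np.rint(tof * fs).astype(int)      # nearest sample (assumed lookup)
        valid = idx < n_samples                  # ignore delays outside the window
        image[valid] += ascans[i, idx[valid]]    # Eq. 2
    return image

# Synthetic check with illustrative (assumed) values: a single point reflector.
v_c, fs = 4000.0, 5.0e6                  # assumed velocity (m/s); 5 MHz sampling
src_x = np.arange(30) * 0.01             # 30 positions at 10 mm spacing
rcv_x = src_x + 0.02                     # assumed source-receiver offset
grid_x = np.arange(30) * 0.01
grid_z = np.arange(1, 10) * 0.01
x0, z0 = grid_x[15], grid_z[3]           # reflector at x = 0.15 m, z = 0.04 m
ascans = np.zeros((30, 2500))            # 500 microsecond window at 5 MHz
for i in range(30):
    t = (np.hypot(x0 - src_x[i], z0) + np.hypot(x0 - rcv_x[i], z0)) / v_c
    ascans[i, int(np.rint(t * fs))] = 1.0
image = saft_image(ascans, src_x, rcv_x, grid_x, grid_z, v_c, fs)
peak = np.unravel_index(np.argmax(image), image.shape)   # (depth, lateral) index
```

In this synthetic case the contributions of all 30 A-scans add up only at the reflector position, which is the focusing effect that the summation in Eq. 2 relies on.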

The current technique involves the acquisition of signals reflected by rebars; therefore, any change to the rebar or the rebar-concrete interface is bound to affect the reflected signal. However, the inhomogeneous subsurface of concrete strongly scatters the incident field, which creates structural noise that might mask the objects to be identified. The coherent summation of the A-scan amplitudes, corresponding to various source-receiver locations and grid points in the SAFT image, suppresses the noise in the data, which can be significant in single A-scan measurements [23]. Further, SAFT imaging is an intuitive visualisation tool for inspecting features or changes in the concrete subsurface.

6 Results and Discussions

The observations during accelerated corrosion of concrete specimens such as mass loss of the extracted rebars and their images are presented in the next sub-section. The ultrasonic scanning is performed and acquired data is fed into the SAFT algorithm developed in the MATLAB platform. This generates images of the rebar at various levels of corrosion and classification of corrosion severity using these images is discussed in the following subsections.

6.1 Corrosion Process

The concrete specimens are subjected to an accelerated corrosion process, explained in Sect. 3. The current values are monitored at intervals of 15 s during each 24-h corrosion cycle and graphs of the current in amperes as a function of time are shown in Fig. 5.

Fig. 5

Current plot for a slab 1; b slab 2

After every 24 h, the current value is observed to drop, which can be due to the reduction in the moisture content during the process of ultrasonic scanning between the corrosion cycles. The theoretical mass loss is estimated by integrating the current using Faraday's law as follows [32]:

$$\Delta m=\frac{M}{ZF}\int_{{t}_{1}}^{{t}_{2}}I\,dt$$
(3)

where \(\Delta m\) = mass loss in grams, \(M\) = molar mass of Fe, \(Z\) = valency of Fe, \(F\) = Faraday’s constant, and \(I\) = current in amperes.
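As a worked illustration of Eq. 3, the integral can be approximated from the current samples recorded every 15 s using the trapezoidal rule. The sketch below is an assumption-laden example rather than the authors' procedure: the valency Z = 2 (Fe to Fe2+), the rounded constants, the function name `faraday_mass_loss` and the constant 0.5 A test current are illustrative choices that are not taken from the paper.

```python
import numpy as np

M_FE = 55.845     # g/mol, molar mass of iron
Z_FE = 2          # assumed valency (Fe -> Fe2+ + 2e-); not stated in the text
F_CONST = 96485.0 # C/mol, Faraday's constant (rounded)

def faraday_mass_loss(current_a, dt_s=15.0):
    """Theoretical mass loss (g) from current samples (A) taken every dt_s seconds."""
    current_a = np.asarray(current_a, dtype=float)
    # Trapezoidal approximation of the integral of I dt (charge in coulombs)
    charge_c = np.sum(0.5 * (current_a[1:] + current_a[:-1])) * dt_s
    return M_FE * charge_c / (Z_FE * F_CONST)

# Hypothetical example: a constant 0.5 A over one 24 h cycle sampled every 15 s
samples = np.full(5761, 0.5)              # 5760 intervals of 15 s = 24 h
mass_loss_g = faraday_mass_loss(samples)  # about 12.5 g under these assumptions
```

With a measured current record, the same function would be applied to each 24 h cycle and the results summed to obtain the cumulative theoretical mass loss.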

Slab 1 is corroded in six levels, three of which are pre-cracking stages and the remaining three are cracked stages. Accelerated corrosion is performed at eight levels for slab 2, and ultrasonic data is collected at all stages. In slab 3, the corrosion experiment is continued for 11 days, but no cracks were observed owing to a faulty electrical connection.

The theoretical mass loss of the rebar arrangement after the entire accelerated corrosion process is shown in Table 2. After 144 h and 168 h of accelerated corrosion for slabs 1 and 2 respectively, the actual mass loss (weighed) of the rebar arrangement is compared with the theoretical value calculated from Eq. 3.

Table 2 Mass loss of the rebar system

The theoretical mass losses for the slabs 1 and 2 are 101.5 g and 131.5 g respectively. The actual mass losses obtained after the extraction of rebars are 96.2 g and 137.7 g, showing variations of 5.2% and 4.5% respectively, which are very reasonable. For slab 3, no change in the measured current was observed and there was a suspected loss of connection between the supply wire and the rebar grid. There was no loss of mass of the rebars.

Physical examination of the 16 mm rebars from slabs 1, 2 and 3 was performed post extraction after the final stage of corrosion, and the corresponding photographs are shown in Fig. 6. Rebars from slabs 1 and 2 show pitting corrosion along with a reduction in the cross-sectional diameter along the rebar. As shown in Figs. 6a, b, the side rebars (\({X}_{1}^{1}, {X}_{4}^{1}, {X}_{1}^{2}\) and \({X}_{4}^{2}\)) are corroded more than the middle rebars, because the side rebars are more exposed to the NaCl solution.

Fig. 6

Photographs of extracted 16 mm rebars from a slab 1; b slab 2; c slab 3

The middle rebars are affected by the corrosion only at the ends. The corrosion process did not occur in slab 3 due to a faulty electrical connection and, therefore, the rebars do not show any visible corrosion, as shown in Fig. 6c.

The 12 mm rebars extracted from slabs 1, 2 and 3 are shown in Fig. 7. As explained in Sect. 3, the concrete specimens are immersed up to 40 mm, due to which the 12 mm rebars have less exposure and do not undergo substantial corrosion. Only mild corrosion is observed at the ends, as shown in Fig. 7. A more detailed discussion is presented in Sect. 7.3.2.

Fig. 7

Photographs of extracted 12 mm rebars from a slab 1; b slab 2; c slab 3

Figures 8 and 9 show images of the cracked slabs 1 and 2. A fine surface-breaking crack at the level of the rebar is observed on the sides (A and B) after the 4th corrosion cycle in both slabs (Figs. 8a and 9a). With the progress in corrosion, the widening of cracks and oozing of corrosion product are observed (Figs. 8c and 9d).

Fig. 8

Cracked images of slab 1 after a 4 days of corrosion (Cracked stage I); b 5 days of corrosion (Cracked stage II); c 6 days of corrosion (Cracked stage III)

Fig. 9

Cracked images of slab 2 after a 4 days of corrosion (Cracked stage I); b 5 days of corrosion (Cracked stage II); c 6 days of corrosion (Cracked stage III); d 7 days of corrosion (Cracked stage IV)

6.2 Vertical SAFT Images with Ultrasonic Setup (slab 1–16 mm Rebars)

The vertical cross-sectional SAFT images of 16 mm rebars from slab 1 are discussed in this subsection.

The pre-cracking stage I and pre-cracking stage II images of rebars \({X}_{1}^{1}\) and \({X}_{4}^{1}\) (Fig. 10) depict a bright continuous patch of blue colour (negative amplitude) and red colour (positive amplitude) of high intensity at the level of the rebar. In pre-cracking stage III (3 days of corrosion), the red coloured (+ve amplitudes) patch starts to disappear from the SAFT image. After cracked stage I (4 days of corrosion), a further weakening of the rebar intensity (+ve amplitudes) is observed, and a fine crack starts to appear on the side surface (Fig. 8a). At the advanced level of corrosion, i.e., cracked stages II and III, the rebar intensities (+ve and −ve amplitudes) are diminished considerably. At this level, the extracted rebars also show significant corrosion (Fig. 6a) and the horizontal cracks on the side surface at the level of the rebar also widen (Figs. 8b, c).

Fig. 10

SAFT images of rebar \({X}_{1}^{1},{X}_{4}^{1}\) of slab 1 at a pre-cracking stage I; b pre-cracking stage II; c pre-cracking stage III; d cracked stage I; e cracked stage II; f cracked stage III. The rectangular box has been added to emphasise the rebar location. Color bars are unitless

The SAFT images related to rebars \({X}_{2}^{1}\) and \({X}_{3}^{1}\) of slab 1 are shown in Fig. 11. The diminishing of rebar image amplitudes is not observed with progress in corrosion. These extracted rebars also do not show any sign of significant corrosion (Fig. 6a). The middle rebars are not affected by the accelerated corrosion process due to the larger concrete cover on the sides, which explains the observation. The SAFT images from slabs 2 and 3 follow a similar trend and, to limit the length of the manuscript, these figures have not been included.

Fig. 11

SAFT images of rebar \({X}_{2}^{1},{X}_{3}^{1}\) of slab 1 at a pre-cracking stage I; b pre-cracking stage II; c pre-cracking stage III; d cracked stage I; e cracked stage II; f cracked stage III. The rectangular box has been added to emphasise the rebar location. Color bars are unitless

The intensity of the wave amplitude is affected by corrosion of the rebar, and the changes occur at the rebar-concrete interface. The ultrasonic wave undergoes scattering in multiple directions and significantly lower amplitudes reach the receiver. In situations of severe corrosion, the changes are owing to the accumulation of rust product and micro-cracking of the surrounding concrete, which finally leads to surface-breaking cracks, as shown in Fig. 12.

Fig. 12

Schematic representation of ultrasonic wave attenuation and dispersion due to corrosion

In summary, with the progress in corrosion, the rebar image intensities for the slabs diminish (Fig. 10), which is consistent with the authors' observations in previously published research [25, 26]. The reason may be attributed to attenuation and scattering of the incident compressional wave due to the formation of corrosion product around the rebars. Therefore, the rebar disappearance phenomenon in the SAFT images can be a diagnostic indicator for corrosion activity in concrete structures.

7 Statistical Classification

The classification of data into various categories originates from pattern recognition studies associated with machine vision. Different statistical learning algorithms have been developed to ensure better classification. These methods are categorised into two types: unsupervised and supervised learning methods. In unsupervised learning, labelled information of the various classes is not available in the training data set, whereas in supervised learning, the pre-defined class information is available for the classification process. Among various unsupervised learning methods, Gaussian mixture modelling (GMM) is a widely used technique. In this article, we propose GMM for classification of the rebar corrosion severity, using features extracted from SAFT images. We provide a brief description of the identification of the number of classes, the flow chart of the classification process, and the model development in the subsequent subsections.

7.1 Gaussian Mixture Modelling

Gaussian mixture modelling is an unsupervised classification method, in which the data is assumed to follow a multivariate normal distribution. This technique has been previously used in different areas such as civil engineering [33], medicine [40], and in the food industry [41]. The GMM algorithm calculates Gaussian functions that optimally create the boundaries separating the training data set into multiple clusters. The test data is classified according to their localization relative to the boundaries.

The flowchart in Fig. 13 is a brief description of the implementation of the GMM technique adopted in this work. The SAFT images at various levels of corrosion from slab 1 (refer to Fig. 10 for details) are used as training data. Referring to Fig. 13, the white line emphasised by the red dotted ellipse represents the spatial data from which a 2 × 1 feature vector is extracted, consisting of the maximum and minimum values; the same operation is performed on every A-scan in the SAFT image.
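The feature extraction described above can be sketched as follows. This is only an illustrative reading of the text: it assumes that each column of the SAFT image array corresponds to one A-scan position and each row to a depth, and the function name `extract_features` and the optional `row_window` argument (to restrict the search to depths around the rebar) are hypothetical.

```python
import numpy as np

def extract_features(saft_image, row_window=None):
    """Return one [maximum, minimum] feature vector per image column.

    saft_image : 2-D array, rows = depth, columns = A-scan positions (assumed layout)
    row_window : optional (start, stop) row indices bounding the rebar depth
    """
    img = np.asarray(saft_image, dtype=float)
    if row_window is not None:
        img = img[row_window[0]:row_window[1], :]
    # Stack the column-wise maxima and minima into an (n_columns, 2) array
    return np.column_stack((img.max(axis=0), img.min(axis=0)))

# Tiny hypothetical image with three depths and two A-scan positions
demo_image = np.array([[1.0, -2.0],
                       [3.0,  0.0],
                       [-1.0, 5.0]])
features = extract_features(demo_image)   # rows: [3, -1] and [5, -2]
```

Each row of the returned array is one feature vector \({x}_{j}\) that would be passed to the GMM.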

Fig. 13

A flowchart of Gaussian mixture model algorithm

After extraction, the optimal GMM parameters (mean, covariance, and mixture coefficient) that best reflect the distribution of the training feature vectors are estimated through an iterative process. The technique is summarised in the following steps and the mathematical and algorithmic development follows the procedures using Eqs. 4, 5, 6, 7, 8 [33, 42, 43]. The first author developed the algorithm of GMM in MATLAB using Eqs. 4, 5, 6, 7, 8.

  1. The mean (\({\mu }_{k}\)), covariance (\({\Sigma }_{k}\)) and mixture coefficient (\({\pi }_{k}\)) are initialized, where k denotes a specific Gaussian cluster and k = 1 to N, with N as the assumed number of clusters.

  2. The probability of each data point or feature vector \({x}_{j}\) (j = 1 to M) belonging to cluster k is calculated as follows:

    $$P\left({x}_{j}|{\pi }_{k},{\mu }_{k},{\Sigma }_{k}\right)= \frac{{\pi }_{k}\times \mathcal{N}\left({x}_{j}|{\mu }_{k},{\Sigma }_{k}\right)}{\sum_{i=1}^{N}{\pi }_{i}\times \mathcal{N}\left({x}_{j}|{\mu }_{i},{\Sigma }_{i}\right)}$$
    (4)

    where

    $$\mathcal{N}\left({x}_{j}|{\mu }_{k},{\Sigma }_{k}\right)=\frac{1}{\sqrt{{\left(2\pi \right)}^{d}\left|{\Sigma }_{k}\right|}}\exp\left\{-\frac{1}{2}{\left({x}_{j}-{\mu }_{k}\right)}^{T}{\Sigma }_{k}^{-1}\left({x}_{j}-{\mu }_{k}\right)\right\}$$

    \({\mu }_{k}\) is the mean vector, \({\Sigma }_{k}\) is the covariance matrix, and d is the dimension of the feature vector (d = 2 in this work).

  3. The Log Likelihood (LLH) is calculated for the chosen system of Gaussian functions using the following formula:

    $$LLH=\sum_{j=1}^{M}\mathrm{ln}\left\{\sum_{i=1}^{N}{\pi }_{i}\times \mathcal{N}\left({x}_{j}|{\mu }_{i},{\Sigma }_{i}\right)\right\}$$
    (5)

  4. The Gaussian parameters are updated using the Expectation Maximization algorithm as follows:

    $${\pi }_{k}^{(z+1)}= \frac{1}{M}\sum_{j=1}^{M}P\left({x}_{j}|{\pi }_{k}^{(z)}, {\mu }_{k}^{(z)}, {\Sigma }_{k}^{(z)}\right)$$
    (6)
    $${\mu }_{k}^{(z+1)}= \frac{\sum_{j=1}^{M}\left\{P\left({x}_{j}|{\pi }_{k}^{(z)}, {\mu }_{k}^{(z)}, {\Sigma }_{k}^{(z)}\right)\times {x}_{j}\right\}}{\sum_{j=1}^{M}P\left({x}_{j}|{\pi }_{k}^{(z)}, {\mu }_{k}^{(z)}, {\Sigma }_{k}^{(z)}\right)}$$
    (7)
    $${\Sigma }_{k}^{(z+1)}= \frac{\sum_{j=1}^{M}\left\{P\left({x}_{j}|{\pi }_{k}^{(z)}, {\mu }_{k}^{(z)}, {\Sigma }_{k}^{(z)}\right)\times \left({x}_{j}-{\mu }_{k}^{(z+1)}\right){\left({x}_{j}-{\mu }_{k}^{(z+1)}\right)}^{T}\right\}}{\sum_{j=1}^{M}P\left({x}_{j}|{\pi }_{k}^{(z)}, {\mu }_{k}^{(z)}, {\Sigma }_{k}^{(z)}\right)}$$
    (8)

    where z = iteration number.

  5. Steps 2–4 are repeated until the log-likelihood values converge [43, 44].
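The five steps above can be sketched in Python with NumPy as shown below. This is a minimal illustration and not the authors' MATLAB implementation: the initialisation (cluster means taken from chunks of the data sorted along the first feature), the small term added to the covariance diagonals for numerical stability, the convergence tolerance and the synthetic demonstration data are all assumptions, and the names `gaussian_pdf` and `fit_gmm` are hypothetical. The covariance update uses the newly updated mean, as in the standard EM algorithm.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Multivariate normal density evaluated at each row of x."""
    d = x.shape[1]
    diff = x - mean
    maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm

def fit_gmm(x, n_clusters, max_iter=200, tol=1e-6, reg=1e-6):
    """Fit a Gaussian mixture by EM; returns weights, means, covariances, LLH history."""
    x = np.asarray(x, dtype=float)
    m, d = x.shape
    # Step 1: initialisation (assumed scheme, not specified in the text)
    chunks = np.array_split(np.argsort(x[:, 0]), n_clusters)
    means = np.array([x[c].mean(axis=0) for c in chunks])
    cov0 = np.atleast_2d(np.cov(x.T)) + reg * np.eye(d)
    covs = np.array([cov0] * n_clusters)
    weights = np.full(n_clusters, 1.0 / n_clusters)
    llh_history = []
    for _ in range(max_iter):
        # Step 2: probability of each point belonging to each cluster (Eq. 4)
        dens = np.column_stack([weights[k] * gaussian_pdf(x, means[k], covs[k])
                                for k in range(n_clusters)])
        total = dens.sum(axis=1)
        resp = dens / total[:, None]
        # Step 3: log-likelihood (Eq. 5)
        llh_history.append(np.log(total).sum())
        # Step 5: stop when the log-likelihood has converged
        if len(llh_history) > 1 and abs(llh_history[-1] - llh_history[-2]) < tol:
            break
        # Step 4: parameter updates (Eqs. 6, 7 and 8)
        nk = resp.sum(axis=0)
        weights = nk / m
        means = (resp.T @ x) / nk[:, None]
        for k in range(n_clusters):
            diff = x - means[k]
            covs[k] = (resp[:, k, None] * diff).T @ diff / nk[k] + reg * np.eye(d)
    return weights, means, covs, np.array(llh_history)

# Synthetic demonstration with two well-separated clusters (illustrative data only)
rng = np.random.default_rng(1)
x_demo = np.vstack([rng.normal([0.0, 0.0], 0.5, (200, 2)),
                    rng.normal([5.0, 5.0], 0.5, (200, 2))])
weights, means, covs, llh_history = fit_gmm(x_demo, n_clusters=2)
```

Tracking `llh_history` gives a convergence curve of the kind used to decide when the iterations can stop.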

The training data set is used to establish the optimal number of classes for clustering the unlabelled data. After determining the number of Gaussians, the test data is utilised to estimate the severity of corrosion. The classification accuracy is represented by a confusion matrix. The determination of the number of classes is covered in the next section.

7.2 Determination of the Optimal Number of Gaussian Clusters

In this work, the initial training data set consists of features extracted from rebar \({X}_{1}^{1}\) images across all corrosion stages. The training data is classified into 5, 4, 3 and 2 clusters. Corresponding to each case, the Bayesian information criterion (BIC) is calculated according to Eq. (9) [33]:

$$BIC= -2 LLH+p\times \mathrm{ln}(T)$$
(9)

where \(LLH\) is the converged value of the log-likelihood, \(T\) is the total number of observations, and \(p\) is the total number of parameters estimated, taken as r × k for k clusters with r = 3 parameters per Gaussian cluster (e.g., for 5 clusters, p = 15). The BIC values are plotted as a function of the number of clusters and the ‘elbow’ point in the curve gives the optimal number of classes or clusters. For a chosen number of clusters, the Gaussian parameters are updated according to Step 4 described in the previous subsection and the LLH value is tracked until it converges. As a demonstration, the variation of LLH as a function of the number of iterations for k = 3 is shown in Fig. 14e, in which the 3-cluster GMM is found to converge after 22 iterations. As observed in Fig. 14f, the reduction in the BIC value is not significant beyond three clusters, which indicates that 3 clusters are adequate for classification of the training data set [33].
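Equation 9 can be evaluated with a few lines of Python. The sketch below counts three parameters per cluster, in line with the worked example in the text (p = 15 for 5 clusters); the function name `bic` and the log-likelihood values and observation count used in the example are hypothetical placeholders, not results from this study.

```python
import math

def bic(llh, n_clusters, n_obs, params_per_cluster=3):
    """Bayesian information criterion of Eq. 9 with p = params_per_cluster * n_clusters."""
    p = params_per_cluster * n_clusters
    return -2.0 * llh + p * math.log(n_obs)

# Hypothetical converged log-likelihoods for k = 2..5 clusters and 180 observations
example_llh = {2: -520.0, 3: -430.0, 4: -425.0, 5: -422.0}
bic_values = {k: bic(llh, k, n_obs=180) for k, llh in example_llh.items()}
```

Plotting `bic_values` against the number of clusters would give the kind of curve from which the elbow point is read.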

Fig. 14

Gaussian cluster functions for the training data set from rebar \({X}_{1}^{1}\) with a k = 2; b k = 3; c k = 4; d k = 5; e variation of LLH for k = 3; f BIC plot

A similar approach is adopted for determining the number of classes for the training data set extracted from rebar \({X}_{4}^{1}\) and the Gaussian parameters are compared with those corresponding to \({X}_{1}^{1}\). Three optimal clusters are identified in this case as well and a comparison of the optimal Gaussian parameters (\(\mu ,\Sigma \)) for the two data sets is shown in Table 3.

Table 3 Comparison of Gaussian parameters

Since the parameters are similar for both the cases, the data sets from \({X}_{1}^{1}\) and \({X}_{4}^{1}\) are combined to generate a new training data set and the number of optimal Gaussian clusters and corresponding parameters are estimated for corrosion severity classification.

Figure 15e shows the LLH variation for k = 3, and the model is observed to converge after 50 iterations. With the combined training data set, it is observed in Fig. 15f that the BIC value does not reduce significantly beyond three clusters. Therefore, the adequacy of three clusters for classification is established.

Fig. 15

Gaussian cluster functions for the combined training data set from rebars \({X}_{1}^{1}\) and \({X}_{4}^{1}\) for a k = 2; b k = 3; c k = 4; d k = 5; e variation of LLH for k = 3; f BIC plot

7.3 Corrosion Severity Classification Based on SAFT Images

Based on the developments in the previous subsection, the classifier is tested with different sets of test data. The results are shown in this section.

7.3.1 Classification of 16 mm Rebars

The GMM classification model is developed, and the three classes are labelled as “low level corrosion”, “medium level corrosion”, and “high level corrosion”, as shown in Fig. 16, based on the developments in Sects. 7.1 and 7.2. Figure 16 shows the confusion chart of the GMM model, with the features combined from images of rebars \({X}_{1}^{1}\) and \({X}_{4}^{1}\) belonging to slab 1. The confusion chart shows that 91.7% and 99.5% of the feature vectors corresponding to pre-cracking stages I and II are associated with “low level corrosion”. With progress in corrosion, 78.9% and 85.4% belong to “medium level corrosion”, and in the advanced level of corrosion, almost 100% of the data belongs to “high level corrosion”, corresponding to cracked stages I and II. The test data extracted from the SAFT images of the middle rebars at various levels of corrosion are investigated next.
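The percentages reported in a confusion chart of this kind can be obtained by assigning each test feature vector to its most probable cluster and counting the assignments per class. The following is a hedged sketch only: the functions `classify` and `class_percentages`, the ordering of the clusters as low, medium and high, and the numerical mixture parameters and test points are hypothetical and are not taken from the fitted model of this study.

```python
import numpy as np

def classify(x, weights, means, covs):
    """Assign each row of x to the cluster with the largest weighted Gaussian density."""
    x = np.asarray(x, dtype=float)
    d = x.shape[1]
    scores = []
    for w, mu, cov in zip(weights, means, covs):
        diff = x - np.asarray(mu, dtype=float)
        maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        log_norm = -0.5 * (d * np.log(2.0 * np.pi) + np.log(np.linalg.det(cov)))
        scores.append(np.log(w) + log_norm - 0.5 * maha)   # log of pi_k * N(x | mu_k, Sigma_k)
    return np.argmax(np.column_stack(scores), axis=1)

def class_percentages(labels, n_classes):
    """Percentage of feature vectors assigned to each class."""
    counts = np.bincount(labels, minlength=n_classes)
    return 100.0 * counts / labels.size

# Hypothetical three-cluster model; index 0 = low, 1 = medium, 2 = high (assumed order)
demo_weights = [1 / 3, 1 / 3, 1 / 3]
demo_means = [[0.0, 0.0], [5.0, 5.0], [10.0, 10.0]]
demo_covs = [np.eye(2), np.eye(2), np.eye(2)]
demo_points = [[0.1, 0.0], [4.9, 5.2], [10.0, 9.8], [0.0, 0.3]]
demo_labels = classify(demo_points, demo_weights, demo_means, demo_covs)
demo_percent = class_percentages(demo_labels, n_classes=3)
```

Applying `class_percentages` to the features of one corrosion stage at a time gives one row of a chart like Fig. 16.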

Fig. 16

Confusion chart of GMM model

Figures 17a–g show the distribution of the features corresponding to rebar \({X}_{2}^{1}\) at various levels of corrosion. It is observed that the GMM correctly classifies the features as “low level corrosion”, which is consistent with the actual photographs presented in Fig. 6a that show very little corrosion damage to the rebar.

Fig. 17

Corrosion classification of rebar \({X}_{2}^{1}\) of slab 1 in a pre-cracking stage I (PC-I); b pre-cracking stage II (PC-II); c pre-cracking stage III (PC-III); d cracked stage I (CS-I); e cracked stage II (CS-II); f cracked stage III (CS-III); g confusion chart

Similarly, for rebar \({X}_{3}^{1}\), which did not undergo significant mass loss (Fig. 6a), the features are classified as “low level corrosion” in the confusion matrix shown in Fig. 18g.

Fig. 18

Corrosion classification of rebar \({X}_{3}^{1}\) of slab 1 in a pre-cracking stage I (PC-I); b pre-cracking stage II (PC-II); c pre-cracking stage III (PC-III); d cracked stage I (CS-I); e cracked stage II (CS-II); f cracked stage III (CS-III); g confusion chart

Figures 19 and 20 present analysis of the data from the side rebars \({X}_{1}^{2}\) and \({X}_{4}^{2}\), which underwent significant corrosion. The data had been acquired at seven stages of corrosion. For pre-cracking stages I, II and III, the data is classified as “low level corrosion” (Figs. 19 and 20a–c).

Fig. 19 Corrosion classification of rebar \({X}_{1}^{2}\) of slab 2 in a pre-cracking stage I (PC-I); b pre-cracking stage II (PC-II); c pre-cracking stage III (PC-III); d pre-cracking stage IV (PC-IV); e cracked stage I (CS-I); f cracked stage II (CS-II); g cracked stage III (CS-III); h cracked stage IV (CS-IV); i confusion chart

Fig. 20 Corrosion classification of rebar \({X}_{4}^{2}\) of slab 2 in a pre-cracking stage I (PC-I); b pre-cracking stage II (PC-II); c pre-cracking stage III (PC-III); d pre-cracking stage IV (PC-IV); e cracked stage I (CS-I); f cracked stage II (CS-II); g cracked stage III (CS-III); h cracked stage IV (CS-IV); i confusion chart

At pre-cracking stage IV (Figs. 19d and 20d), however, only 40% and 21% of the data points, respectively, remain associated with "low level corrosion"; the majority of the data (57.5% and 79%) have migrated to the “medium level corrosion” cluster, indicating progress in corrosion relative to the earlier stages. This is also the last stage before cracking of the slab, and the association of the features with "medium level corrosion" is therefore correct.

In cracked stage I (Figs. 19e and 20e), 84% and 73.5% of the data are associated with “medium level corrosion”, a higher proportion than in the previous stage; there is also a visible shift of the data towards the “high level corrosion” zone. From cracked stage II onwards, the rebar data are classified as “high level corrosion”.

The classification of the features of rebars \({X}_{2}^{2}\) and \({X}_{3}^{2}\), shown in Figs. 21a–g, follows a trend similar to that of the other rebars with little corrosion damage: the features cluster in the “low level corrosion” zone (92% or above) across the different stages. This is consistent with the low level of mass loss observed in Fig. 6b.

Fig. 21 Corrosion classification of rebar \({X}_{2}^{2}\) of slab 2 in a pre-cracking stage I; b pre-cracking stage II; c pre-cracking stage III; d pre-cracking stage IV; e cracked stage I; f cracked stage II; g cracked stage III; h cracked stage IV; i confusion chart for rebar \({X}_{2}^{2}\); j confusion chart for rebar \({X}_{3}^{2}\)

The confusion charts of rebars from slab 3 are shown in Fig. 22. The rebar signatures in the SAFT images do not change significantly across various levels of corrosion and the extracted rebars were found not to have undergone mass loss (Fig. 6c), possibly due to a faulty electrical connection. The GMM algorithm associated the majority of the data with “low level corrosion”, which is expected.

Fig. 22 Confusion charts for rebar \({X}_{1}^{3}\), \({X}_{2}^{3}\), \({X}_{3}^{3}\), and \({X}_{4}^{3}\) of slab 3

To summarize, the implemented GMM-based classifier successfully identifies the level of corrosion severity in three different slabs. The approach therefore has good potential to provide useful insight into the prevailing condition of rebar corrosion inside a concrete structure, even when the corrosion is not apparent. Unsupervised statistical learning algorithms such as GMM can thus serve as a tool for directing repair and rehabilitation works.
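The unsupervised three-class assignment summarized above can be illustrated with a small expectation-maximization (EM) fit. This is a sketch under stated assumptions and not the authors' implementation: it assumes a two-dimensional feature vector (maximum and minimum amplitude of the rebar signature), uses diagonal covariances for brevity, runs on synthetic points, and maps components to severity classes by assuming that a brighter signature means less corrosion. The names `gauss_diag`, `fit_gmm`, and `classify` are hypothetical.

```python
import math
import random


def gauss_diag(x, mean, var):
    """Density of a Gaussian with diagonal covariance, evaluated at x."""
    p = 1.0
    for xi, m, v in zip(x, mean, var):
        p *= math.exp(-0.5 * (xi - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
    return p


def fit_gmm(points, k=3, iters=50, floor=1e-6):
    """Fit a k-component diagonal GMM by EM; returns (weights, means, variances)."""
    data = sorted(points)  # deterministic start: split along the first feature
    n, d = len(data), len(data[0])
    chunks = [data[i * n // k:(i + 1) * n // k] for i in range(k)]
    means = [[sum(p[j] for p in c) / len(c) for j in range(d)] for c in chunks]
    varis = [[max(sum((p[j] - m[j]) ** 2 for p in c) / len(c), floor)
              for j in range(d)] for c, m in zip(chunks, means)]
    weights = [len(c) / n for c in chunks]
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp = []
        for x in data:
            r = [w * gauss_diag(x, m, v) for w, m, v in zip(weights, means, varis)]
            s = sum(r) or 1e-300  # guard against numerical underflow
            resp.append([ri / s for ri in r])
        # M-step: re-estimate weights, means and variances
        for c in range(k):
            nc = sum(r[c] for r in resp) or 1e-300
            weights[c] = nc / n
            means[c] = [sum(r[c] * x[j] for r, x in zip(resp, data)) / nc
                        for j in range(d)]
            varis[c] = [max(sum(r[c] * (x[j] - means[c][j]) ** 2
                                for r, x in zip(resp, data)) / nc, floor)
                        for j in range(d)]
    return weights, means, varis


def classify(x, model, names=("low", "medium", "high")):
    """Label x with the severity class of its most probable component."""
    weights, means, varis = model
    # Assumption: a larger first feature (brighter signature) means less corrosion.
    order = sorted(range(len(means)), key=lambda c: -means[c][0])
    post = [weights[c] * gauss_diag(x, means[c], varis[c]) for c in order]
    return names[post.index(max(post))]


# Synthetic feature vectors (not measured data): three well-separated clusters.
random.seed(0)
centres = [(0.9, -0.8), (0.5, -0.45), (0.2, -0.15)]
points = [(random.gauss(cx, 0.03), random.gauss(cy, 0.03))
          for cx, cy in centres for _ in range(40)]
model = fit_gmm(points)
print(classify((0.88, -0.79), model))
```

Because the fit is unsupervised, the components carry no names of their own; attaching "low", "medium", and "high" requires an ordering rule such as the one assumed in `classify`.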

7.5 Classification of 12 mm Rebar Data

This section discusses the feature vectors extracted from SAFT images of the 12 mm rebars. The range of the feature values is different from that of the 16 mm rebars; the classifier model developed for the 16 mm rebars would therefore not work in this case, and a dedicated GMM classifier would have to be developed. Figures 23 and 24 show the feature data extracted from the images of the side rebars \({Y}_{1}^{1}\) and \({Y}_{1}^{2}\) at different corrosion stages.

Fig. 23 The rebar signature extracted from SAFT images of rebar \({Y}_{1}^{1}\) of slab 1 at a pre-cracking stage I; b pre-cracking stage II; c pre-cracking stage III; d cracked stage I; e cracked stage II; f cracked stage III

Fig. 24 The rebar signature extracted from SAFT images of rebar \({Y}_{1}^{2}\) of slab 2 at a pre-cracking stage I; b pre-cracking stage III; c cracked stage I; d cracked stage IV

The 12 mm rebars did not undergo substantial corrosion (except at the ends), as observed in Figs. 7a–c, due to a lower level of exposure to the NaCl solution. The centroids of the data points in Figs. 23 and 24 do not migrate significantly across the various levels of corrosion. Therefore, the classification of corrosion severity of the 12 mm rebars into different clusters is not performed.
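Whether the centroids migrate can be checked numerically by measuring how far each stage centroid lies from that of the first stage. The sketch below is illustrative only: the function names, the two-component feature layout, and the toy values are assumptions, and the paper does not state a numerical threshold for "significant" migration.

```python
import math


def centroid(points):
    """Mean of a list of (max_amplitude, min_amplitude) feature vectors."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def centroid_shifts(stage_features):
    """Euclidean distance of each stage centroid from the first stage's centroid.

    stage_features -- dict mapping stage name to its feature vectors, in
    acquisition order (dicts preserve insertion order in Python 3.7+).
    """
    names = list(stage_features)
    ref = centroid(stage_features[names[0]])
    return {name: math.dist(ref, centroid(stage_features[name])) for name in names}


# Toy values (not measured data)
features = {
    "PC-I": [(1.0, -1.0), (0.8, -0.8)],
    "PC-II": [(0.9, -0.9), (0.9, -0.9)],
    "CS-I": [(0.6, -0.5), (0.6, -0.5)],
}
shifts = centroid_shifts(features)
for name, shift in shifts.items():
    print(name, round(shift, 3))
```

A shift that stays small relative to the scatter of the points within a stage suggests that separate clusters would not be meaningful.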

To summarize, the experimental waveforms are normalised with respect to the initial surface wave arrival to eliminate variation in amplitudes arising from coupling issues. Best practices have been followed, based on the experience of the investigators, since the effect of coupling on the waveforms is a complex phenomenon that cannot be fully eliminated in contact-based measurements. Regarding other sources of noise in the signals, the acquisition involves averaging and the SAFT algorithm incorporates a summation process, both of which would lead to lower noise levels. Future research will involve application of the proposed methodology to data acquired through non-contact sensing platforms (e.g., GPR) and to real structural systems.
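The normalisation step can be sketched as a division by the peak magnitude of the surface wave arrival. This is one possible reading of the procedure rather than the authors' code: the function name `normalise_to_surface_wave` and the sample window bracketing the surface wave are hypothetical, and the waveforms are toy values.

```python
def normalise_to_surface_wave(waveform, window):
    """Scale a waveform so that the surface-wave peak has unit magnitude.

    waveform -- sequence of amplitude samples
    window   -- (start, stop) sample indices assumed to bracket the initial
                surface wave arrival; hypothetical, chosen by the user
    """
    start, stop = window
    peak = max(abs(s) for s in waveform[start:stop])
    if peak == 0:
        raise ValueError("surface-wave window contains no signal")
    return [s / peak for s in waveform]


# Two toy records of the same echo, differing only by a coupling gain.
good_coupling = [0.0, 2.0, -1.0, 0.0, 0.5, -0.25]
poor_coupling = [0.5 * s for s in good_coupling]
a = normalise_to_surface_wave(good_coupling, (0, 3))
b = normalise_to_surface_wave(poor_coupling, (0, 3))
print(a == b)
```

A pure gain difference between two records cancels after this scaling; coupling effects that alter the waveform shape would not, which is in line with the caveat above that coupling cannot be fully eliminated.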

8 Conclusions

The article presents the classification of the level of rebar corrosion and the associated corrosion severity through an unsupervised GMM-based algorithm that uses simple features extracted from SAFT images of rebars. As observed in the rebar images, the pixel intensities diminish with progress in corrosion, which is consistent with research previously published by the authors. A novel pattern recognition approach is implemented by inputting the feature vectors (maximum and minimum amplitudes of rebar signatures) into a GMM-based classifier, which identifies cluster boundaries for three levels of corrosion severity. The results from the test data are consistent with physical observations of corrosion damage to the extracted rebars. The proposed technique therefore has strong potential to provide useful inputs for pre-emptive maintenance and efficient management of concrete structural assets. Future research will involve investigations with data obtained from real-life structures.

The proposed technique requires prior information on rebar locations, which may be obtained from existing drawings and/or from a rebar locator or GPR-based survey. In fact, the proposed technique can be used along with a GPR survey and can therefore provide complementary information regarding the status of the rebar. Moreover, the proposed technique is contact-based; it therefore requires more time and is suitable for localized NDT operations.