1 Introduction

Fiber-reinforced polymers (FRPs) are composite materials composed of high-strength fibers, such as glass (GFRP), carbon (CFRP), aramid (AFRP), or basalt (BFRP), embedded within a polymer matrix (the abbreviations used are listed in Table 1). These materials possess exceptional mechanical properties, including high strength, stiffness, and corrosion resistance, which make them valuable in various civil engineering applications. In reinforced concrete (RC) structures, FRPs are widely used to strengthen and retrofit existing members. They can be externally bonded to the surface of RC members, such as beams, columns, joints, and slabs, to enhance their load-bearing capacity and structural performance. This method is often preferred because FRPs are lightweight, easy to install, and cause minimal disruption to the structure during retrofitting [1].

Table 1 List of abbreviations used in this research

Various types of FRP materials are now used in RC structures to enhance the mechanical performance of buildings and infrastructure. In recent years, retrofitting schemes have expanded to include numerous materials and procedures for improving the performance of buildings under external actions such as wind and seismic loads. FRPs are also used in new construction as internal reinforcement in concrete elements, replacing or complementing traditional steel reinforcing bars, and their high strength-to-weight ratio contributes to lighter and more durable structures. FRP materials have proven effective in strengthening buildings against seismic forces, since applying FRP sheets or wraps to critical structural elements can significantly improve seismic performance [2]. Unlike steel, FRPs do not corrode when exposed to harsh environmental conditions or chemicals, making them a preferred choice in corrosive environments [3]. Despite these advantages, open questions about long-term durability, the limited number of case studies and applications, and cost require continued research and development before FRPs can be adopted more widely in structural engineering. Researchers continue to explore and refine the application of FRPs, focusing on optimizing their performance, developing standardized design guidelines, and investigating innovative methods to maximize their efficiency and longevity in various structural applications.

Figure 1 presents the application of FRPs to the bottom of beams to retrofit an RC building. FRPs are mainly used on RC structural members to improve their bending and axial capacities without increasing the total weight of the building, which is a key benefit of the material. The way FRP is applied depends on the structural element; it can be applied as a full wrap, a U-wrap, face plies, strips, or circular wraps.

Fig. 1 Implementing FRPs on the bottom of the beams to retrofit an RC building

Figure 2 illustrates different implementations of FRPs for different retrofitting purposes in RC structures. The primary load-transfer mechanism has a significant influence on the selection of the FRP wrapping type. The unit weight of FRP ranges from 200 to 915 g/m², and its thickness ranges from 0.111 to 1.4 mm, which reflects the high strength of FRP materials relative to their weight. This makes FRP very useful for retrofitting, since it adds negligible gravity load to the RC structure. Table 2 presents the mechanical properties of FRPs reported in the literature [4, 5]. Since the application of FRP varies across RC structural members depending on the required increase in load-carrying capacity, this study provides a comprehensive review of machine learning (ML) algorithms and their ability to estimate the mechanical properties of FRP. Such models can help engineers improve the design process and reduce computational time. Fig. 3 illustrates the architecture of the current study and the various types of FRPs applied to RC structural elements. Each section explains and discusses the methods that have been used for prediction models.

Fig. 2 Implementation of FRPs for different purposes to retrofit RC structure

Table 2 Mechanical properties of FRPs that have been used for research studies [4]
Fig. 3 Architecture of FRP and related keywords discussed in this study

This study presents an extensive examination of the ML methods used to predict the mechanical properties of structural RC members reinforced with FRPs, acknowledging their diverse applications. The primary objective is to identify gaps in these methods and to propose recommendations for enhancing their predictive accuracy and applicability in assessing the mechanical characteristics of FRP elements. The study explores in depth the novel ML techniques applicable to predicting the mechanical properties of FRPs applied in different RC structural members. It not only evaluates existing ML methodologies but also introduces and discusses emerging approaches that hold promise for accurately forecasting the mechanical behavior of FRP elements. To this end, the Web of Science database was searched for relevant literature, focusing exclusively on high-ranking papers published in reputable journals from 2008 to 2024. Google Scholar was also used to source additional papers not indexed in the Web of Science database. Only those articles closely aligned with the research topic were then selected for detailed examination. Finally, this investigation offers a set of recommendations to guide future research in this domain, with the aim of improving the precision and applicability of predictive models for FRP mechanical properties. These recommendations are intended as a roadmap for researchers, outlining potential avenues for refining and expanding the scope of ML applications in predicting FRP characteristics.

2 ML Algorithms

Artificial intelligence (AI) refers to the simulation of human intelligence processes by machines, encompassing tasks such as problem-solving, learning, reasoning, and decision-making. ML is a subset of AI that focuses on developing algorithms that enable systems to learn and improve from experience automatically. ML techniques enable machines to identify patterns in data, learn from examples, and make predictions or decisions without explicit programming. Deep learning, in turn, is a specialized subset of ML that involves neural networks with multiple layers capable of learning representations of data through abstraction. Figure 4 illustrates the architecture of ML algorithms that can be used in ML models for different purposes. As presented, the three main groups (supervised, semi-supervised, and unsupervised learning) are distinguished by the type of dataset used for training and testing. In supervised learning, algorithms are trained on labeled data and learn the mapping between input and output variables, with the aim of predicting or classifying future instances based on previous examples. The input features are derived from a prepared dataset, and their significance in the ML model is determined through various techniques. A comprehensive list of the input features investigated within the ML models is presented in Table 1. These parameters are used in the summary tables to indicate the inputs used by each ML model. Unsupervised learning, on the other hand, trains models on unlabeled data, where the algorithm identifies patterns or structures within the data. It includes clustering and association techniques and is used for tasks such as grouping similar data points or discovering hidden patterns. Semi-supervised learning lies between the two, combining labeled and unlabeled data. This research mainly investigates supervised ML algorithms for regression-based datasets.

Fig. 4 Comprehensive architecture of ML algorithms

Since ML algorithms may be applied to different types of engineering problems, preparing an ML model for a given problem requires a clear sequence of steps. Figure 5 illustrates the ten main steps typically followed to build ML models with high prediction accuracy. These steps constitute a generalized process; the specific implementation may vary depending on the problem and the nature of the dataset [6].

Fig. 5 Formal steps of preparing an ML-based prediction model

2.1 Model Metrics

Researchers have used a range of measures, termed “model metrics” or “statistical indicators”, to assess the performance of ML methods. Most of these measures are computed from the errors between predicted and observed values and therefore quantify how well a model fits the data. They indicate the precision of predictions across diverse scenarios and serve as benchmarks for comparing the effectiveness of different ML techniques. Table 3 enumerates the established statistical indicators along with their respective formulas [7, 8].

Table 3 Statistical indicators used to evaluate the capability of ML algorithms
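The contents of Table 3 are not reproduced here. As a minimal sketch, the three indicators quoted later in this review (R2, RMSE, and MAE) can be computed as follows, assuming their standard textbook definitions; the function name and the sample values are hypothetical and serve only as an illustration.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R2, RMSE and MAE under their standard definitions (assumed here)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = np.sum(residuals ** 2)                  # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {
        "R2": float(1.0 - ss_res / ss_tot),
        "RMSE": float(np.sqrt(np.mean(residuals ** 2))),
        "MAE": float(np.mean(np.abs(residuals))),
    }

# Hypothetical measured vs. predicted values (illustrative numbers only)
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(measured, predicted)
print(metrics)
```

R2 is dimensionless, whereas RMSE and MAE carry the units of the target variable, so the latter two are only comparable between models evaluated on the same target.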

2.2 Resampling of Dataset

Resampling methods create new datasets from existing samples in order to pre-evaluate the capability of ML models, and they can be grouped into parametric and nonparametric methods. Nonparametric techniques such as bootstrapping and permutation testing do not depend on assumptions about the underlying data distribution, making them suitable when the population distribution is unknown. Bootstrapping involves repeated random sampling with replacement from the original sample, creating new samples to estimate statistical variability or form confidence intervals. Permutation testing, by contrast, shuffles data labels and recomputes the statistic of interest, generating a null distribution for hypothesis testing and population inference [9].
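The bootstrapping procedure described above can be sketched as follows. This is a minimal illustration of a percentile confidence interval for the sample mean; the function name, the number of resamples, and the sample values are assumptions made for the example and are not taken from the cited studies.

```python
import numpy as np

def bootstrap_ci(sample, statistic=np.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    stats = np.empty(n_resamples)
    for i in range(n_resamples):
        # Draw a new sample of the same size, with replacement
        resample = rng.choice(sample, size=sample.size, replace=True)
        stats[i] = statistic(resample)
    lower, upper = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lower), float(upper)

# Hypothetical strength measurements (illustrative numbers only)
strengths = np.array([31.2, 28.7, 35.1, 30.4, 33.8, 29.9, 32.5, 34.0])
low, high = bootstrap_ci(strengths)
print(f"95% CI for the mean: [{low:.2f}, {high:.2f}]")
```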

In contrast, parametric resampling methods such as jackknifing and cross-validation (CV) assume a known or assumable data distribution. Jackknifing excludes one observation at a time to form subsamples, from which the bias and variance of a statistic are estimated. CV partitions data into training and testing sets, fitting a model to the training set and evaluating its performance on the test set iteratively [10]. CV encompasses various techniques such as k-fold, repeated k-fold, leave-one-out, random permutation, and group k-fold CV. The k-fold technique splits the data into k equal folds, trains the model on k-1 folds, and evaluates it on the remaining fold, repeating the procedure until each fold has served once as the test set. Figure 6 presents a fivefold CV method that can be used for resampling the dataset. Repeated k-fold CV repeats the process r times for a more reliable performance estimate. The leave-one-out method uses each sample in turn for validation and all others for training, providing a nearly unbiased performance estimate. Random permutation CV shuffles and splits the data repeatedly to obtain multiple performance scores. Group k-fold keeps clusters or correlated groups together, so that the model is never evaluated on data from a group seen during training [11].

Fig. 6 Illustration of the fivefold CV method
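A fivefold CV of the kind described above can be sketched with scikit-learn as follows. The synthetic dataset and the choice of a linear regression model are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic regression data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Five folds: each fold is used once for testing and four times for training
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2")
fold_sizes = [len(test_idx) for _, test_idx in cv.split(X)]

print("test-fold sizes:", fold_sizes)
print("R2 per fold:", np.round(scores, 3))
```

Reporting the mean and spread of the five scores, rather than a single train/test split, gives a more stable picture of how the model is likely to perform on unseen data.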

2.3 Feature Selection Methodology

Feature selection methods, which are crucial for identifying pertinent features in a dataset, fall into three categories: wrapper, embedded, and filter techniques (see Fig. 7) [12]. Embedded methods, such as least absolute shrinkage and selection operator (LASSO) regression and ridge regression (RR), add a penalty term that shrinks the coefficients of less relevant features; the LASSO penalty can drive them exactly to zero, whereas the ridge penalty only reduces their magnitude. Wrapper methods, such as recursive feature elimination and forward/backward selection, train models on feature subsets and use the validated model performance as the selection criterion, a computationally intensive but precise approach. Filter methods, such as correlation-based and mutual information-based selection, use statistical measures to score each feature and retain the most relevant ones. Used appropriately, these three families of methods can improve the predictive ability of ML models and help reduce the variability of their predictions.

Fig. 7 Illustration of feature selection methods
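The three families of feature selection methods can be contrasted on a small example. The sketch below uses scikit-learn on synthetic data in which only the first and third of five features are informative; the dataset, the penalty strength, and the variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.feature_selection import RFE, SelectKBest, f_regression
from sklearn.linear_model import Lasso, LinearRegression

# Synthetic data: only features 0 and 2 influence the target (hypothetical)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 4.0 * X[:, 0] - 3.0 * X[:, 2] + rng.normal(scale=0.1, size=200)

# Filter: score each feature with a univariate statistic and keep the top two
filter_idx = SelectKBest(f_regression, k=2).fit(X, y).get_support(indices=True).tolist()

# Wrapper: recursive feature elimination driven by a fitted model
wrapper_idx = RFE(LinearRegression(), n_features_to_select=2).fit(X, y).get_support(indices=True).tolist()

# Embedded: the LASSO penalty sets coefficients of irrelevant features to zero
embedded_idx = np.flatnonzero(Lasso(alpha=0.1).fit(X, y).coef_).tolist()

print(filter_idx, wrapper_idx, embedded_idx)
```

On well-behaved data such as this, the three approaches agree; on real datasets with correlated inputs they can disagree, which is one reason several methods are often compared.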

2.4 Hyperparameter Tuning Methods

Hyperparameter tuning in ML involves techniques such as grid search, random search, halving search, and fine-tuning (FT). Grid search trains the model on every combination from predefined sets of hyperparameter values and selects the best one, which is effective but computationally intensive. Random search trains the model on randomly sampled hyperparameter sets; it is well suited to high-dimensional search spaces and is often more efficient than grid search. Halving search trains numerous candidate models with various hyperparameters on subsets of the data, iteratively discarding the worst candidates and refining the better ones, which is well suited to large datasets and models [13]. Hyperparameter FT optimizes settings such as learning rates, the number of network layers, regularization strength, and kernel types to enhance performance on specific tasks or datasets. Figure 8 illustrates the hyperparameter optimization methodology. Effective FT requires extensive experimentation with different hyperparameter combinations and careful selection of both the hyperparameters to tune and the search algorithm [14].

Fig. 8 Illustration of hyperparameter optimization methodology
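Grid search and random search can be compared on a small example. The sketch below tunes the kernel type and regularization strength of a support vector regressor with scikit-learn; the synthetic data, the candidate values, and the search ranges are assumptions chosen only to keep the example short.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVR

# Synthetic regression data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(80, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.05, size=80)

# Grid search: every combination of the predefined values (2 x 3 = 6 candidates)
grid = GridSearchCV(
    SVR(),
    param_grid={"kernel": ["linear", "rbf"], "C": [0.1, 1.0, 10.0]},
    cv=5,
    scoring="neg_mean_squared_error",
)
grid.fit(X, y)

# Random search: a fixed budget of candidates sampled from distributions
rand = RandomizedSearchCV(
    SVR(),
    param_distributions={"kernel": ["linear", "rbf"], "C": loguniform(1e-1, 1e2)},
    n_iter=8,
    cv=5,
    scoring="neg_mean_squared_error",
    random_state=0,
)
rand.fit(X, y)

print("grid search best:", grid.best_params_)
print("random search best:", rand.best_params_)
```

The cost of grid search grows multiplicatively with each added hyperparameter, whereas the budget of random search is set directly through `n_iter`.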

Sequential model-based optimization (SMBO) is a formalization of Bayesian optimization that can identify good hyperparameters for ML models more efficiently than some other optimization techniques. Using Bayesian reasoning and probabilistic models, SMBO selects each new hyperparameter configuration to evaluate based on past observations, focusing on promising regions of the hyperparameter space. By iteratively exploiting information from previous evaluations while still exploring the search space, SMBO aims to converge on effective hyperparameters at minimal computational expense. SMBO can be implemented with different variants, including Gaussian process regression (GPR), the tree-structured Parzen estimator (TPE), and its adaptive extension, Adaptive TPE (ATPE). GPR uses a probabilistic model to approximate the objective function and guide the search, whereas TPE and ATPE use tree-structured probabilistic modeling to explore and exploit hyperparameter configurations efficiently. Each variant therefore offers a distinct strategy for navigating the hyperparameter landscape and finding good configurations for ML models.
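The SMBO loop with a GPR model can be sketched in a few lines. The example below minimizes a hypothetical one-dimensional objective that stands in for an expensive validation loss, and it uses expected improvement to pick the next point. The objective, the candidate grid, the kernel, and the choice of expected improvement are all assumptions made for illustration; practical tools differ in these details.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    """Stand-in for an expensive validation loss (hypothetical, minimum at x = 2)."""
    return (x - 2.0) ** 2

candidates = np.linspace(0.0, 5.0, 101).reshape(-1, 1)   # search space
X_obs = np.array([[0.0], [5.0]])                          # initial evaluations
y_obs = objective(X_obs).ravel()

for _ in range(10):
    # 1) Fit the probabilistic model to all observations so far
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6,
                                  normalize_y=True, random_state=0)
    gp.fit(X_obs, y_obs)
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-12)                      # avoid division by zero
    # 2) Expected improvement over the best value found so far (minimization)
    improvement = y_obs.min() - mu
    z = improvement / sigma
    ei = improvement * norm.cdf(z) + sigma * norm.pdf(z)
    # 3) Evaluate the most promising candidate and add it to the observations
    x_next = candidates[np.argmax(ei)].reshape(1, -1)
    X_obs = np.vstack([X_obs, x_next])
    y_obs = np.append(y_obs, objective(x_next).ravel())

best_x = float(X_obs[np.argmin(y_obs), 0])
best_y = float(y_obs.min())
print(f"best x = {best_x:.2f}, best loss = {best_y:.4f}")
```

Each iteration spends one evaluation of the objective, so the total budget here is twelve evaluations, compared with 101 for an exhaustive sweep of the same grid.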

2.5 ML Algorithms

This section briefly examines some widely used ML algorithms that are commonly applied in engineering and recognized as conventional ML methods. The decision tree (DT) algorithm, notably extended to the random forest (RF) variant depicted in Fig. 9, serves as a foundational model for numerous advanced techniques. Its evolution over recent years has led to broad application across various engineering disciplines [15].

Fig. 9 Random forest base estimator of ML algorithms

Another important ML algorithm built on the DT method is the gradient boosting machine (GBM). GBM is an ensemble learning method used for both regression and classification tasks that combines multiple weak predictive models into a strong one. Figure 10 presents a schematic view of the GBM method. Boosting makes GBM a competitive method, since its predictive capability is substantially improved by combining many weak learners. Each new model is fitted to correct the errors made by the existing ones, gradually reducing the overall error. Therefore, unlike RF, which builds trees independently, GBM builds trees sequentially: each new tree learns from the mistakes, or residuals, of the preceding trees. GBM uses gradient descent to minimize the loss function, moving in the direction of steepest descent toward the minimum loss. Because it minimizes errors sequentially, GBM can be prone to overfitting, that is, fitting too closely to the training data [16].

Fig. 10 Weak learners and added trees in the GBM algorithm
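The sequential fitting of trees to residuals can be sketched as follows. This is a simplified illustration of gradient boosting with a squared-error loss, not the implementation of any particular GBM library; the synthetic data, tree depth, learning rate, and number of stages are assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic one-dimensional regression data (hypothetical)
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
prediction = np.full_like(y, y.mean())        # stage 0: a constant model
trees = []
mse_history = [float(np.mean((y - prediction) ** 2))]

for _ in range(50):
    # For squared-error loss, the negative gradient equals the residual
    residuals = y - prediction
    tree = DecisionTreeRegressor(max_depth=2, random_state=0)   # weak learner
    tree.fit(X, residuals)
    # Add a damped correction from the new tree to the ensemble prediction
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)
    mse_history.append(float(np.mean((y - prediction) ** 2)))

print(f"training MSE: {mse_history[0]:.4f} -> {mse_history[-1]:.4f}")
```

The training error falls with every added tree, which is also why the number of stages and the learning rate are usually tuned on held-out data to limit overfitting.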

Extreme gradient boosting (XGBoost) is an optimized and efficient implementation of the GBM method that integrates additional regularization to control overfitting more effectively than traditional GBM. Specifically, it incorporates L1 (LASSO) and L2 (ridge) regularization terms within the objective function, thereby enhancing the generalization of the model. In addition, XGBoost is designed for speed and efficiency: it parallelizes tree building and handles sparse data efficiently, which reduces computational time. XGBoost also has a built-in capability to handle missing values, automatically learning during training the branch direction to which samples with missing values are assigned. Figure 11 presents a schematic view of the XGBoost method. XGBoost supports built-in cross-validation, allowing model evaluation during training, which helps in hyperparameter tuning and model selection [17].

Fig. 11 The schematic view of tree building in the XGBoost algorithm

Figure 12 illustrates schematically the differences between XGBoost, LightGBM (LGBM), and CatBoost. CatBoost stands out for its ability to handle categorical variables without extensive pre-processing, as it encodes categorical features internally, reducing the risk of data leakage. It includes regularization techniques that mitigate overfitting and enhance generalization to unseen data, as well as a built-in feature importance mechanism that helps users understand the significance of individual features in the model's predictions. CatBoost builds symmetric decision trees and can also handle missing data. LGBM is known for its training speed and low memory usage. It uses a histogram-based approach to build decision trees, which reduces computational resources and makes it suitable for large datasets. Furthermore, it employs a leaf-wise rather than level-wise growth strategy, expanding only the leaf that offers the greatest reduction in loss; this makes the trees deeper and generally leads to higher accuracy. The main difference between LGBM and CatBoost lies in how they handle variables: CatBoost handles categorical variables more effectively, while LGBM is generally more efficient in terms of speed and memory usage, especially for large datasets. Performance varies with the dataset, and CatBoost may provide better results when categorical features are prominent [16, 18].

Fig. 12 The schematic view of the difference between the three algorithms of XGBoost, LGBM, and CatBoost

Performing both linear and non-linear classification, the support vector machine (SVM) is able to find the optimal hyperplane or decision boundary that maximizes the margin between classes in the input space. Figure 13 presents the linear two-dimensional space of SVM. This method is efficient even in high-dimensional spaces and is effective when the number of features is larger than the number of samples. In addition, using kernel functions (e.g., linear, polynomial, radial basis function) to transform the input space into a higher-dimensional space allows for handling complex relationships between data points [19].

Fig. 13 Linear two-dimensional space of SVM

ML algorithms are applied to engineering problems by formulating the problem so that a model can predict a target quantity. This can help structural designers obtain essential estimates that would otherwise require complicated modeling procedures. The most commonly used algorithms are artificial neural networks (ANNs), a class of computational models inspired by the neural structure of the human brain [20, 21]. Figure 14 illustrates feed-forward and backward propagation in ANNs. Among ANNs, the multilayer perceptron (MLP) is a fundamental and widely used architecture. Comprising interconnected nodes organized into layers, an MLP has an input layer, one or more hidden layers, and an output layer. Figure 15 illustrates the MLP algorithm. Each node, or neuron, in an MLP computes a weighted sum of its input signals and passes the result through an activation function, often a non-linear function such as the sigmoid or the rectified linear unit (ReLU). Together with the non-linear activations, these weighted computations allow ANNs to capture intricate relationships within complex datasets and to model non-linear problems. MLPs are trained via backpropagation, a process that adjusts the internal parameters of the network, such as weights and biases, to reduce the discrepancy between predicted and actual outputs. This optimization is typically carried out with algorithms such as gradient descent. The architecture of MLPs facilitates the learning of data representations that become progressively more abstract in successive layers of the network. This property enables MLPs to perform diverse tasks, including classification, regression, and pattern recognition. However, MLPs are sensitive to hyperparameters such as the number of layers, the number of neurons, and the choice of activation functions, which require careful tuning for optimal performance.
The flexibility of MLPs and ANNs, their capacity to model complex relationships, and their ability to approximate intricate functions make them well suited to various domains, especially to estimating the mechanical properties of FRP in structural members. Despite their effectiveness, they can suffer from overfitting, which requires strategies such as regularization and cross-validation to ensure that they generalize well to unseen data [22,23,24].

Fig. 14 Illustration of feed-forward and feed-backward propagation in ANNs

Fig. 15 Illustration of the Multilayer perceptron algorithm
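The feed-forward pass and backpropagation of an MLP can be written out explicitly for a small network. The sketch below trains one hidden layer of eight ReLU neurons with full-batch gradient descent on a mean-squared-error loss; the synthetic target, layer size, learning rate, and number of iterations are assumptions chosen for illustration.

```python
import numpy as np

# Synthetic data with a non-linear target (hypothetical)
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 2))
y = (X[:, 0] * X[:, 1]).reshape(-1, 1)

# One hidden layer with 8 ReLU neurons and a single linear output neuron
W1, b1 = rng.normal(scale=0.5, size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(scale=0.5, size=(8, 1)), np.zeros(1)
learning_rate = 0.01
losses = []

for _ in range(1000):
    # Feed-forward pass: weighted sums followed by activation functions
    z1 = X @ W1 + b1
    a1 = np.maximum(z1, 0.0)                  # ReLU activation
    y_hat = a1 @ W2 + b2
    losses.append(float(np.mean((y_hat - y) ** 2)))

    # Backpropagation: chain rule from the output back to the input layer
    d_out = 2.0 * (y_hat - y) / len(X)        # gradient of the MSE w.r.t. y_hat
    dW2, db2 = a1.T @ d_out, d_out.sum(axis=0)
    d_z1 = (d_out @ W2.T) * (z1 > 0)          # ReLU derivative is 0 or 1
    dW1, db1 = X.T @ d_z1, d_z1.sum(axis=0)

    # Gradient descent update of weights and biases
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2

print(f"training loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```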

Recurrent neural networks (RNNs) are a type of ANN designed to handle sequential data by preserving information through cycles within the network. They possess internal memory to process sequences of variable lengths, making them suitable for tasks involving time-series data, natural language processing, and speech recognition. The architecture of the network facilitates feedback loops, which allow information to persist and subsequently influence future predictions [25, 26]. Convolutional neural networks (CNNs) are specialized ANNs designed for processing grid-like data, such as images or videos. They utilize convolutional layers to extract relevant features from input data by applying filters or kernels. These layers learn hierarchies of patterns, enabling the network to recognize complex patterns and structures within the input, making CNNs highly effective in computer vision tasks like image classification and object detection [27].

2.5.1 Other Data-Driven Techniques

Although some well-known ML models have been used widely for decades, other ML algorithms have since matured and can be applied to a range of engineering problems. Some of them are discussed in this section. The k-nearest neighbor (KNN) method is versatile and applicable to both classification and regression problems. It uses “feature similarity” to predict the value of a new data point, based on how closely the new point resembles those in the training set. The underlying assumption is that observations positioned close together in the feature space also have similar output values. The method then applies a predetermined function, such as an average, to the response values of the nearest neighbors to predict the output [8].
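A minimal sketch of KNN regression is given below, assuming Euclidean distance and a plain average of the neighbors' responses as the predetermined function; the function name and the training values are hypothetical.

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    """Predict by averaging the responses of the k nearest training points."""
    distances = np.linalg.norm(X_train - x_query, axis=1)   # Euclidean distance
    nearest = np.argsort(distances)[:k]                     # indices of k closest
    return float(np.mean(y_train[nearest]))

# Hypothetical training data: one input feature, one response
X_train = np.array([[0.0], [1.0], [2.0], [3.0], [10.0]])
y_train = np.array([0.0, 1.0, 2.0, 3.0, 10.0])

estimate = knn_predict(X_train, y_train, np.array([1.1]), k=3)
print(estimate)  # neighbors at x = 1, 2 and 0 give (1 + 2 + 0) / 3 = 1.0
```

Because the prediction depends directly on distances, input features are usually scaled to comparable ranges before KNN is applied.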

The voting regressor (VR) is an ensemble learning technique that enhances predictive performance by aggregating the predictions of multiple individual regression models. It computes the average, weighted average, or median of these predictions, depending on the weights assigned to each model. Alternatively, the stacking regressor (SR) uses a meta-regressor to combine the forecasts of the individual models, resulting in improved predictive accuracy [10]. Linear regression (LR) is a fundamental and widely applied algorithm that establishes a linear relationship between input and output variables, aiming to minimize the disparity between predicted and actual values. The gamma regressor (GR) assumes that the response follows a gamma distribution and estimates its parameters by maximum likelihood. Generalized linear regression (GLR) extends LR by allowing the response variable to follow a non-normal distribution, using a link function to relate the linear predictor to the mean of the response. GPR, a nonparametric algorithm, treats the target function as a Gaussian process and predicts new data points by computing their posterior distribution. It therefore offers probabilistic predictions, which are valuable for determining confidence intervals and assessing prediction uncertainty. The partial least squares (PLS) regression technique conducts multivariate data analysis by deriving linear combinations of the input variables that have maximum covariance with the output variable [14].

Bootstrap aggregating, implemented as the bagging regressor (BR), constructs multiple models from distinct bootstrap samples of the training data and combines them, so that several weak learners act together as a single strong learner. The multitask LASSO, a regularization approach for LR, applies a LASSO-type penalty jointly to the coefficients across all output variables, identifying the input parameters that are most influential for predicting all output variables simultaneously. Stochastic gradient descent (SGD) optimizes ML models iteratively, updating the model parameters based on the gradient of the loss function with respect to those parameters. Because each iteration uses only a subset of the training data, the algorithm is particularly beneficial for large datasets [16]. Kernel ridge (KR) regression, a nonparametric method, employs kernel functions to map input variables to a high-dimensional feature space while preventing overfitting through regularization. Bayesian ridge regression (BRR) places a Bayesian prior on the regression coefficients, which is particularly useful when the number of input variables is large relative to the number of training samples. Data classification in supervised ML relies on a training dataset obtained from varied experiments, comprising descriptive characteristics and a target feature, where the descriptive characteristics significantly influence the target outcome. The Naïve Bayes technique, a straightforward yet potent probabilistic classification method, performs well on large volumes of data [19].

K-means clustering is an unsupervised ML algorithm used for clustering data points into groups or clusters based on their similarities. It partitions the data into k clusters by iteratively assigning data points to the nearest cluster centroid and updating the centroid position until convergence. It aims to minimize the intra-cluster distance and maximize the inter-cluster distance, assigning each data point to the cluster whose centroid is closest to it [28]. Principal component analysis (PCA) is a dimensionality reduction technique that transforms high-dimensional data into a lower-dimensional space while preserving as much variance as possible. It achieves this by identifying and projecting data onto orthogonal axes, known as principal components, ordered by the variance they capture. These components represent a new set of uncorrelated variables that condense the information from the original variables, making it easier to visualize and analyze the data. Some novel investigations used this approach to predict different types of engineering demands, such as damage assessment in GFRP [29,30,31]. Logistic regression is a statistical model used for binary classification tasks. It estimates the probability that a given input belongs to a particular category by fitting a logistic function to the input features. It calculates the likelihood of the observed outcomes using the logistic function and a linear combination of input features, determining the decision boundary between the classes [32].
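The dimensionality reduction by PCA and the clustering by k-means described above are often combined. The sketch below projects synthetic five-dimensional data onto two principal components and then groups the projected points into two clusters with scikit-learn; the data, the number of components, and the number of clusters are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Two well-separated groups of 50 points in five dimensions (hypothetical)
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.5, size=(50, 5))
group_b = rng.normal(loc=5.0, scale=0.5, size=(50, 5))
X = np.vstack([group_a, group_b])

# PCA: project onto the two orthogonal directions of largest variance
pca = PCA(n_components=2)
Z = pca.fit_transform(X)
explained = float(pca.explained_variance_ratio_.sum())

# K-means: assign each projected point to the nearest of k = 2 centroids
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(Z)
cluster_sizes = sorted(np.bincount(labels).tolist())

print(f"variance retained by 2 components: {explained:.2%}")
print("cluster sizes:", cluster_sizes)
```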

3 ML Predictions for Various Properties of FRP

3.1 FRP Used in Beam

Numerous studies have focused on the mechanical properties of retrofitted concrete beams. Deifalla and Salem [33] introduced eleven ML models for computing the ultimate torsion strength of concrete beams strengthened with externally bonded FRP. Among the models assessed, the vast neural network emerged as the most promising, with R2 = 0.93, RMSE = 1.66, MAE = 0.98, an average safety factor of 1.11, and a coefficient of variation of 45%, affirming its robustness and precision in this predictive context. The accurate analysis and design of structural members requires precise models for determining the overall shear strength of elements reinforced with FRP sheets. Anvari et al. [34] demonstrated the successful application of gene expression programming (GEP) in constructing predictive models for the shear strength of RC beams. Using a dataset of 785 RC beams reinforced with externally bonded FRP sheets, they developed two data-driven models (GEP I and GEP II) with R2 values of 0.883 and 0.940, respectively, attesting to their efficacy in estimating shear strength in FRP-strengthened beams. Gasser et al. [35] aimed to establish a dependable ML model specifically for RC beams strengthened in shear through the external application of FRP sheets, and their results confirmed the ability of ANNs to estimate the shear strength. Given the intricate and nonlinear relationship between plate end (PE) debonding and the associated parameters, Hu and Li [36] employed various ML algorithms, such as LR, RR, DT, RF, and ANNs enhanced through the sparrow search algorithm, to predict PE debonding in RC beams reinforced with FRP.
XGBoost, an ensemble learning algorithm that combines the predictions of multiple decision trees to produce a more accurate prediction, was the best-performing ML method in the study by Zhang et al. [37]. The authors found that the XGBoost model outperformed the other ensemble learning-based models and the single ML-based models on all evaluation metrics, and that it accurately estimated the flexural capacity of FRP-strengthened beams over a wide range of input parameters.

Beljkaš and Baša [38] developed a database of experimental results for deflections of GFRP-reinforced continuous beams and used it to train and evaluate different ANN models. The best-performing ANN model predicted deflections with an MAE of 9.0%, which is significantly more accurate than the predictions obtained using current standards. Their work demonstrated the potential of ANNs to achieve high accuracy and robustness in predicting deflections of these beams, which makes ANNs a valuable tool for engineers who design and build GFRP-reinforced continuous beams. Perera et al. [39] developed an approach that integrates the electro-mechanical impedance (EMI) technique with multilevel hierarchical machine learning (MHL) and fiber Bragg grating (FBG) temperature measurements. They used this approach to evaluate the mechanical characteristics of beams strengthened with near-surface-mounted (NSM) FRP under varying load and temperature. The authors validated the approach by measuring the temperature and strain of an RC beam over 1.5 years under different load and temperature conditions. Guo et al. [40] proposed an integrated model based on ML techniques to assess the moment capacity and ductility of beams with T-sections. They demonstrated that the integrated model, based on the genetic algorithm (GA), ANN, and SVM, could produce highly accurate predictions, and they used two numerical examples to demonstrate its robustness in the optimization of rectangular and T-section beams. Yaseen [41] examined the ability of three ML models, namely M5-Tree, extreme learning machine (ELM), and RF, to predict the shear capacity measured in 112 shear tests of FRP-RC beams with transverse reinforcement. Correlation analysis was used to identify the most suitable input parameters for shear strength prediction, and the performance of the models was evaluated using statistical and graphical approaches.

Hu et al. [42] examined PE and intermediate crack (IC) debonding in FRP-strengthened RC beams. They established an indicator system based on the literature and analyzed data from 188 beams to predict these failure modes. Using six ML algorithms, they constructed prediction models and found that the DT demonstrated robustness, while the RF achieved the highest precision in predicting the failure modes. Hu et al. [43] gathered 229 instances of debonding failure, comprising 128 cases of PE debonding and 101 cases of IC debonding. They used correlation and grey correlation analysis to create predictive indicator systems for both types of debonding and to identify crucial indicators. Five ML models were employed to construct prediction models, with optimization performed by the dung beetle optimizer (DBO) algorithm, known for its competitive performance. The BP-ANNs proved the most effective of the ML methods in predicting both PE and IC debonding, and the DBO-optimized model had notably higher accuracy. In contrast, code-based models exhibited a substantial coefficient of variation, tending toward either conservative estimates or overestimates when forecasting the state of strengthened beams at failure.
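
The comparison with code-based models above rests on the scatter of predictions around the test results. A minimal sketch of one common way to quantify this is given below, assuming that each specimen is summarized by its experimental-to-predicted ratio: a mean ratio above 1 indicates conservative predictions on average, and the coefficient of variation (CoV) measures the scatter. The function name and the sample values are hypothetical.

```python
import statistics

def ratio_statistics(experimental, predicted):
    """Mean and coefficient of variation (%) of experimental/predicted ratios."""
    ratios = [e / p for e, p in zip(experimental, predicted)]
    mean_ratio = statistics.mean(ratios)
    # Sample standard deviation (n - 1 in the denominator) is assumed here.
    cov_percent = 100.0 * statistics.stdev(ratios) / mean_ratio
    return mean_ratio, cov_percent

# Hypothetical failure loads in kN (illustrative values only).
experimental = [100.0, 120.0, 90.0, 110.0]
predicted = [80.0, 100.0, 100.0, 100.0]
mean_ratio, cov_percent = ratio_statistics(experimental, predicted)
```

A model can have a mean ratio close to 1 and still be unreliable for individual members if its CoV is large, which is why both quantities are usually reported together.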

Wakjira et al. [44] proposed the use of ML techniques for predicting the shear capacity of FRP-RC beams, using a dataset of 427 shear-critical FRP-RC beams. Eleven ML methods, ranging from simple white-box models to advanced black-box models, were constructed on this database. According to their results, the XGBoost model performed best, with the lowest error and highest R2, and outperformed existing code and guideline equations in predicting shear capacity. In a further study [45], they developed a super-learner ML model to predict the flexural capacity of FRP-RC beams, training it on a database of 132 flexural tests and evaluating it against a range of other ML models. Abuodeh et al. [46] investigated the use of ML to analyze the behavior of RC beams strengthened with FRP, using a database of 120 specimens and fifteen parameters. RBP-ANN was employed as a regression tool, and the RFE algorithm and NID were used within RBP-ANN to identify the most significant parameters for predicting the FRP shear capacity. The results demonstrated that the RBP-ANN built on the selected parameters outperformed both the original RBP-ANN with all parameters and established predictive standards, achieving R2 and RMSE values of 0.885 and 8.1 kN, respectively. Le et al. [47] proposed an approach to predict the shear strength of FRP-RC beams with and without stirrups using an XGBoost model. The model was trained on a dataset of 453 experimental examples and evaluated with standard statistical metrics. They demonstrated that the developed ML model is reliable and accurate in predicting the shear strength of FRP-RC beams with and without stirrups, and they highlighted the potential of ML models for engineering problems. Kumar et al. [48] used ANN, artificial bee colony-ANN (ABC-ANN), and GPR techniques to investigate the FRP-concrete bond strength, where the ABC-ANN hybrid method optimizes the ANN model for improved bond strength predictions. They collected a dataset of 744 experimental data points from the literature, covering various parameters. The prediction performance of ANN, ABC-ANN, and GPR was compared against existing code-based methods and four other models from previous studies. The ABC-ANN and GPR models achieved R-values of 0.9514 and 0.9618, respectively.

Yang et al. [49] experimented with three metaheuristic optimization algorithms: the ant lion optimizer (ALO), moth flame optimizer (MFO), and salp swarm algorithm (SSA). These algorithms were applied to optimize the hyperparameters of an RF model for predicting the punching shear strength of FRP-RC beams. The ALO-RF model, particularly with a population size of 100, demonstrated superior prediction accuracy over the other models in both the training and testing phases. In the testing phase, it achieved R2, MAE, MAPE, and RMSE values of 0.941, 52.56, 15.50, and 101.64, respectively. Moreover, the hybrid ML models optimized by these metaheuristic algorithms demonstrated superior predictive accuracy and error control compared to traditional models. Su et al. [50] used BP-ANNs to predict the bond strength of NSM CFRP-to-concrete joints. They used nine input parameters and a single output value, and offered interpretability through the neural interpretation diagram (NID) technique. On a database of 163 pull-out test samples, the BP-ANNs showed strong agreement with the experimental data (R2 of 0.915 for the testing phase). While removing a non-significant feature increased efficiency, the ANN-based approach had higher accuracy, suggesting its potential as a reliable method for predicting bond strength in NSM CFRP-to-concrete joints. Rahman et al. [51] developed ten ML models to estimate the shear capacity of FRP-strengthened RC beams, using a database of 584 experimental data sets covering concrete design and FRP variations in rectangular and T-shaped RC beams. They compared different design guidelines and demonstrated that RF, CatBoost, and XGBoost offer the most precise estimations. CatBoost exhibited R2 and MAE values of 0.871 and 0.214 kN for rectangular beams and 0.899 and 0.127 kN for T-beams, respectively. Yang and Liu [52] examined four established shear design methods alongside four ML models (LR, DT, RF, and XGBoost) to predict the shear strength of FRP-RC beams without stirrups. To create the data-driven models, they used 219 concrete beams without FRP stirrups, all with rectangular cross-sections, sourced from 47 investigations conducted between 1994 and 2018. By comparing codified methods and ML models on this database, they demonstrated that ML models, particularly XGBoost, offer significantly improved accuracy over established design predictions. They enhanced model performance by incorporating specific features reflecting the mechanics of FRP beams and by optimizing hyperparameters through Bayesian techniques.

Ikram et al. [53] introduced the extreme learning machine network based on the chaos red fox optimization algorithm (ELM-CRFOA) to predict shear strength. The accuracy of the model was validated against existing equations, demonstrating its precision in computing the shear strength of concrete beams reinforced with FRP rebar. Additionally, a sensitivity analysis evaluated the impact of the input parameters on shear strength. Naser et al. [54] applied explainable artificial intelligence (XAI) and interpretable machine learning (IML) to predict beam moment capacity and failure tendencies. Their study explored methods such as feature importance, SHAP, and surrogates to demystify ML model predictions, using a case study on RC beams strengthened with FRP. To provide a summary, Table 4 lists the ML methods and their best performances for the estimation of mechanical properties of beams retrofitted by FRP.

Table 4 Summary of ML methods used for estimation of mechanical properties of beam retrofitted by FRP

3.2 FRP Used in Column

This section reviews the ML methods proposed for estimating the mechanical properties of FRP used for columns. Arora et al. [55] implemented 14 analytical methods and one ML algorithm, namely ANNs, to estimate the axial load-carrying capacity of RC columns confined with FRP. They collected 242 experimental datasets and showed that the ANN method is more accurate than the analytical models, as well as faster and easier to use for estimating the axial load-carrying capacity of RC columns confined with FRPs. In addition, Cakiroglu et al. [56] predicted the axial load-carrying capacity of FRP-RC columns by applying eight ML algorithms, including KR regression, LASSO, SVM, GBM, AdaBoost, RF, Categorical GBM, and XGBoost, to 117 data points collected from the literature. Xue et al. [57] presented an ML approach for predicting the lateral confinement coefficient (i.e., KS) of CFRP-wrapped RC columns using six input parameters. They trained and evaluated three different ML models, namely GP, MPMR, and DNN. The GP model outperformed the other two models, with an R2 value of 0.89, RMSE of 0.056, and NMBE of 0.001, and they concluded that it could be used to predict the lateral confinement coefficient of CFRP-wrapped RC columns accurately. Shin and Park [58] proposed a rapid ML tool for designing and optimizing FRP retrofit schemes for RC building structures. The tool uses a hybrid approach that combines an ANN and a GA to generate structural responses and optimize retrofit details quickly. Zhang et al. [59] proposed an ML method for estimating the load capacity of steel-RC columns clad in CFRP. The model, called stacking-CRRL, is a fusion of four ML algorithms: CatBoost, RF, RR, and LASSO. The authors first extended the sparse initial data using the synthetic minority oversampling technique (SMOTE) and eliminated redundant features using Spearman correlation coefficients. They then compared the prediction performance of five boosting models, two bagging models, and three traditional ML models. The CatBoost, RF, and RR models were selected as the base learners, and LASSO was chosen as the meta-learner. The stacking-CRRL fusion model outperformed all of the other models tested, as well as a published prediction equation and an ABAQUS simulation.
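
The elimination of redundant features by Spearman correlation mentioned above can be illustrated with a minimal sketch. It assumes a simple greedy rule in which a feature is dropped when the absolute value of its rank correlation with an already retained feature exceeds a threshold. The threshold of 0.9, the feature names, and the sample values are hypothetical, and the procedure is not claimed to be the one used in [59].

```python
import math

def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def drop_redundant(features, threshold=0.9):
    """Greedily keep features whose |rho| with every kept feature <= threshold."""
    kept = []
    for name in features:
        if all(abs(spearman(features[name], features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical column data (illustrative values only).
features = {
    "column_diameter": [150, 200, 250, 300, 350],
    "column_area": [17671, 31416, 49087, 70686, 96211],  # monotone in diameter
    "concrete_strength": [30, 45, 25, 40, 35],
}
kept = drop_redundant(features)
```

Because the rank correlation only detects monotonic dependence, the cross-sectional area is flagged as redundant with the diameter even though the relationship between them is nonlinear.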

Sayed et al. [60] evaluated the performance of ML models against existing design-oriented models. They generated ML models for predicting the axial compressive load of rectangular FRP-RC columns. Their study delved into the influential parameters and their impact on the strength, ductility, and failure mode of FRP-strengthened columns. Subsequently, it identified critical design parameters and utilized them as input features in ML modeling. They demonstrated that the ML models had a remarkable consistency with the test results included in the datasets, and the GBM exhibited more accuracy with the lowest deviation values. Li et al. [61] examined a data-driven ML model to estimate the compressive strength of GFRP-confined RC columns. They used 114 datasets and ML models, including linear and non-linear methods. Linear models employed linear and RR, while non-linear models utilized DT, RF, and BP-ANNs. Their results concluded that the BP-ANN model outperforms existing models in terms of accuracy and stability. Ma et al. [62] proposed ML algorithms to predict the axial compressive capacity of CFRP-confined concrete-filled steel tubular short columns. The dataset used for training and testing consists of 379 data points, including 271 from literature sources and 108 from experiments conducted by the authors. They performed ML using material strengths, cross-sectional areas, and cross-sectional shapes as features and axial compression bearing capacity as labels. Eight algorithms, including LR, KNN, SVM, RF, AdaBoost, GBM, XGBoost, and LGBM, were evaluated using random search CV and fivefold cross-validation. XGBoost was found to have the best prediction performance, with an R2 of 0.972. Further hyperparameter tuning using learning curves and grid search improved the performance of the XGBoost model to an R2 of 0.985. Bakouregui et al. 
[63] proposed a new ML model based on XGBoost to predict the load-carrying capacity of RC columns reinforced with FRP bars and used the SHapley Additive exPlanations (SHAP) framework to explain the output of the model. They compiled an experimental database comprising 283 FRP-RC columns from various literature sources and assessed the performance of the model against design codes and equations found in these sources. Their results show that the XGBoost model outperforms other methods, achieving high accuracy and efficiency with the value of R2 = 0.987. To provide a summary, Table 5 illustrates the ML methods and their best performances for the estimation of mechanical properties of columns retrofitted by FRP.
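
Several of the studies above tune hyperparameters with k-fold cross-validation combined with a random or grid search. The following minimal sketch illustrates the mechanics of fivefold cross-validation with a grid search. To stay self-contained, it swaps in a single-feature ridge regression without intercept in place of the ensemble models used in the cited studies; the fold assignment rule, the function names, and the data are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Assign sample i to fold i % k and yield (train, test) index lists."""
    for fold in range(k):
        test = [i for i in range(n_samples) if i % k == fold]
        train = [i for i in range(n_samples) if i % k != fold]
        yield train, test

def fit_ridge_slope(x, y, lam):
    """Closed-form slope of a no-intercept ridge regression on one feature."""
    return sum(a * b for a, b in zip(x, y)) / (sum(a * a for a in x) + lam)

def cv_mse(x, y, lam, k=5):
    """Mean squared error averaged over the k held-out folds."""
    fold_errors = []
    for train, test in k_fold_indices(len(x), k):
        slope = fit_ridge_slope([x[i] for i in train], [y[i] for i in train], lam)
        errors = [(y[i] - slope * x[i]) ** 2 for i in test]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k

def grid_search(x, y, grid, k=5):
    """Return the grid value with the lowest cross-validated error, plus all scores."""
    scores = {lam: cv_mse(x, y, lam, k) for lam in grid}
    best = min(scores, key=scores.get)
    return best, scores

# Noise-free synthetic data (y = 2x), so no regularization should be preferred.
x = [float(i) for i in range(1, 11)]
y = [2.0 * v for v in x]
best_lam, scores = grid_search(x, y, grid=[0.0, 1.0, 10.0, 100.0])
```

Each candidate value is scored only on data that was held out during fitting, which is what makes the selected hyperparameter less prone to overfitting than one chosen on the training error.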

Table 5 Summary of ML methods used for estimation of mechanical properties of column retrofitted by FRP

3.3 FRP Used in Slab

Vu and Hoang [64] introduced a hybrid ML model to predict the ultimate punching shear capacity of FRP-RC slabs. The least squares SVM (LS-SVM) and the firefly algorithm (FA) were employed, respectively, to capture the intricate relationship between the influential factors and the slab punching capacity and to facilitate the LS-SVM training process. They concluded that their proposed model had the lowest RMSE compared with conventional formula-based and ANN methods. Liang et al. [65] proposed a new hybrid model that combines symbolic regression with modified compression field theory (SR-MCFT) to enhance predictive performance. The model was trained on 154 experimental data points, with GP used to optimize a correction equation that refines the basic MCFT model. The resulting SR-MCFT model outperformed other empirical models, demonstrating its effectiveness in predicting the structural behavior of concrete slabs. Shen et al. [66] used ML algorithms to estimate the punching shear strength of slabs. They collected a dataset of 121 specimens to train and evaluate several ML models, including ANN, SVM, DT, and AdaBoost, and compared these ML models with empirical models. According to their results, AdaBoost was the most accurate model, with an RMSE of 29.83, MAE of 23.00, and R2 of 0.99. Pan et al. [67] developed a new ML approach to estimate the punching shear strength of RC flat plates reinforced with steel bars and FRPs. Their approach involved four ML regression models (i.e., RF, AdaBoost, GBM, and XGBoost) and a transformation of the input variables to improve predictive power. The models were trained on a collection of 505 interior flat plate examples gathered from various published sources. The XGBoost model outperformed other numerical equations, achieving a high mean R2 of 0.93 and a low MAPE of 0.20 for testing. 
Their findings suggest that ML models offer a promising alternative to currently used mechanics-based models for design practice. Almustafa and Nehdi [68] investigated the development of an ML model for predicting the maximum displacement of FRP-reinforced RC slabs subjected to blast loading. Their proposed model, which was trained on a combination of natural and synthetic data, utilized a GPR algorithm for accurate predictions. Due to limited real-world data, they employed a TGAN to generate an additional 200 synthetic data points for model training. They proved that the proposed ML approach could be a viable method for predicting structural responses under blast loading, offering designers accurate FRP retrofitting results for RC slabs in a simplified and cost-effective manner. Truong et al. [69] investigated the use of ML algorithms to predict the punching shear strength of FRP-RC slabs without shear reinforcement. A dataset of 104 experimental specimens was compiled and used to train three ML models of SVR, RF, and XGBoost. According to their results, the XGBoost model outperformed the other models and the current design codes, with R2, RMSE, MAE, and MAPE equal to 0.962, 0.061, 0.035, and 8.931%, respectively. They concluded that the XGBoost model can be used reliably and precisely to design and evaluate FRP-RC slabs. Other investigations on the effectiveness of various ML techniques in predicting the punching strength of RC slabs have been conducted by Doğan and Arslan [70]. In their study, a comprehensive dataset encompassing 141 slabs reinforced with GFRP, CFRP, and traditional steel bars was analyzed, and five ML algorithms of MLR, bagging DT, RF, SVM, and XGBoost were employed to develop prediction models. They concluded that the SVM algorithm demonstrated the best performance, achieving a prediction accuracy of R2, RMSE, and MAE equal to 96.23%, 0.19, and 0.16, respectively, for GFRP-reinforced slabs. 
In addition, their study revealed that building codes tend to estimate the punching strength of slabs more conservatively compared to experimental results. To provide a summary, Table 6 illustrates the ML methods and their best performances for the estimation of mechanical properties of slab retrofitted by FRP.

Table 6 Summary of ML methods used for estimation of mechanical properties of slab retrofitted by FRP

3.4 Bond Strength of FRP

Su et al. [71] proposed an ANN model that identifies interfacial shear strength and slip simultaneously, with MAPEs of 2.941% and 2.078%, respectively. The model reconstructs the debonding process between CFRP and concrete with high precision in both the load–displacement response and the evolution of interfacial shear stress. It demonstrated robustness by predicting cases beyond the training dataset, indicating its capability to interpolate in a high-dimensional prediction space. This approach predicts crucial interface characteristics once the macroscale load–displacement curve of the specimen is known, offering a promising solution for determining interfacial cohesive parameters between FRPs and concrete. In a follow-up paper, Su et al. [72] explored three ML approaches, namely MLP, SVM, and ANNs, to predict the interfacial bond strength between concrete and FRPs. Using datasets from single-lap shear tests on FRP laminates bonded to concrete prisms, they found SVM to be the most accurate and efficient method, with equal or superior prediction accuracy on both datasets. Additionally, partial dependence plots revealed the effects of the variables on interfacial bond strength, and a stacking strategy further enhanced prediction accuracy. Overall, the SVM approach proved feasible and effective for estimating the interfacial bond strength in RC structures reinforced with FRPs. Kong et al. [73] used 947 data points to build eight ML models for estimating the FRP interfacial bond strength. By automating hyperparameter optimization, their approach overcomes the limitations of manual feature extraction in traditional ML models, reducing errors and saving time in parameter selection. Among the eight ML models, CatBoost emerged as the primary model after hyperparameter optimization, demonstrating superior performance. 
Moreover, the hyperparameter-optimized model surpassed the initial CatBoost model, indicating the reliability of hyperparameter optimization in enhancing model accuracy. Kaveh and Khavaninzadeh [74] aimed to optimize the parameters of feed-forward backpropagation and radial basis function networks used in predicting FRP strength by integrating metaheuristic algorithms. Particle swarm optimization (PSO), GA, colliding bodies optimization (CBO), and enhanced colliding bodies optimization (ECBO) algorithms were combined with ANNs. Estimations using 223 CFRP test data showed that ECBO yielded superior accuracy, especially when combined with feed-forward backpropagation ANNs, with lower error percentages than the other models and algorithms.
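
To illustrate how a metaheuristic such as PSO searches a parameter space, a minimal sketch of a standard global-best PSO is given below. It is not the implementation used in the cited studies. The objective is a hypothetical smooth function standing in for a validation error, and the parameter names (learning_rate, hidden_units), the bounds, and the inertia and acceleration settings are assumptions chosen for illustration.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iter=100, seed=0):
    """Global-best particle swarm optimization over box-bounded parameters."""
    rng = random.Random(seed)
    w, c1, c2 = 0.729, 1.49445, 1.49445  # assumed inertia and acceleration weights
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # personal best positions
    best_val = [objective(p) for p in pos]       # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]   # global best so far
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val

def toy_validation_error(params):
    """Hypothetical stand-in for a model's validation error (minimum at 0.1, 32)."""
    learning_rate, hidden_units = params
    return (learning_rate - 0.1) ** 2 + ((hidden_units - 32.0) / 32.0) ** 2

best_params, best_error = pso_minimize(
    toy_validation_error, bounds=[(0.001, 1.0), (4.0, 128.0)]
)
```

In a real application the objective would train and validate a network for each candidate parameter set, so the number of particles and iterations directly controls the computational cost.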

Chen et al. [75] used a database of 520 samples and employed a GBM ensemble algorithm to develop a robust prediction model for FRP-concrete interfacial bond strength. Compared to existing models and standard ML algorithms, this model demonstrates superior accuracy, confirmed through feature importance analysis, making it a reliable tool for practical bond strength prediction. Zhang et al. [76] worked on six ML methods to provide the best estimation model for estimating the interfacial bond strength of FRP and RC structure using the 1375 test dataset. ML models outperformed 16 existing equations in predicting bond strength, with XGBoost displaying the highest accuracy and 54% lower variability than the best-performing existing equation. An ANN-based parametric study identified vital influencing parameters, resulting in a new equation enabling the interpretation of ML predictions. This novel approach combines ML and traditional physical models, offering an interpretable model for interfacial bond strength. Wang et al. [77] introduced a framework for estimating bond capacity between FRP and concrete interfaces, which is crucial in infrastructure repair and rehabilitation design. The framework utilized an equilibrium optimizer (EO) to develop a hyperparameter-free ensemble model named MNVIM. This model adaptively integrates radial basis function artificial neural networks (RBF-ANNs), and least squares support vector machines (LS-SVM) facilitated by the involvement of EO in the learning phases and in adjusting the values of combination weights. A t-test confirms the significantly enhanced prediction accuracy of MNVIM, highlighting its potential as a dependable tool for estimating bonding strength and improving safety in the design of concrete element repair and rehabilitation using FRP. Alabdullh et al. 
[78] aimed to estimate the bond strength between FRPL and concrete using a hybrid ensemble ML approach that considers six methods: ANNs, XGBoost, GMDH, MARS, LS-SVM, and GP. The proposed method showed superior predictive accuracy compared to the individual models on both the training and testing datasets. Following this paper, Amin et al. [79] used the same dataset to check the performance of LGBM, XGBoost, and RF and found that LGBM is a reliable ML model with the highest prediction accuracy. Mahmoudian et al. [80] used four ML models to predict the flexural bond strength and failure mode of mat anchorage between concrete and sand-coated GFRP bars. Optimizing the hyperparameters with the GA raised the R2 of XGBoost from 0.90 to 0.94, an increase of about 4%. Additionally, an XGBoost classification model predicted failure modes with 100% accuracy on the test data. Jahangir and Eidgahee [81] investigated ML methodologies to estimate the bond strength using a comprehensive database of 656 single- and double-lap direct shear test results sourced from the literature. The study employed ANNs and a hybrid approach that merges the ANN with the artificial bee colony algorithm (ABC-ANN). The results indicate superior accuracy of the models compared to current methodologies, with R2 of 0.97 for ABC-ANN and 0.93 for ANN. Moreover, the hybrid ABC-ANN model yields a straightforward formulation for bond strength evaluation. In contrast to previous studies that explored the bond strength of FRP on RC members, Palizi and Toufigh [82] concentrated on predicting the bond strength between timber and FRP. Their study, which employed gene expression programming (GEP), specifically addressed different environmental conditions. Three empirical models were developed: one for standard conditions and two for acidic and alkali solutions. All three models meet the statistical criteria, with a mean relative error below 12% and a minimum R coefficient of 0.9.

Zhang et al. [83] presented a novel approach to predict the bond strength of FRP-to-concrete joints, a crucial aspect in retrofitting RC structures. Traditional predictive models struggle because of the complex relationships between bond strength and numerous variables. To address this, the study introduced a metaheuristic-optimized model based on the LS-SVM algorithm. The hyperparameters of the LS-SVM were fine-tuned via a beetle antennae search (BAS) algorithm, enriched with Levy flight to enhance search efficiency, and the model was trained on a dataset compiled from global literature sources. The resulting LBAS-LS-SVM model demonstrated high predictive accuracy, with an R2 of 0.983 and a low RMSE of 1.99 MPa on the test set. The width of the FRP emerged as the most influential variable affecting bond strength. This model holds promise for addressing similar regression challenges in structural engineering. Kurtoğlu et al. [84] focused on predicting the bond-slip behavior of anchored CFRP strips externally bonded to concrete surfaces, aiming to prevent the premature debonding failures commonly observed in CFRP systems. Using an SVM, they developed predictive models from a database assembled from previous reports on FRP-to-concrete joints anchored with CFRP strips. The input parameters encompass concrete properties, anchor specifications, and CFRP characteristics. The models predict key output parameters, including maximum shear capacity, residual shear capacity, and displacement values at peak and residual shear, with high prediction accuracy and minimal error rates. Additionally, the study provides these models in code format, facilitating their integration into analysis software for practical implementation. To summarize, Table 7 illustrates the ML methods and their best performances for the estimation of the bond strength of FRP implemented on RC members.

Table 7 Summary of ML methods used for estimation of bond strength of FRP implemented on RC members

3.5 Compressive Strength of FRP-Confined Concrete

The compressive strength of FRP-confined concrete is a crucial parameter for various reasons. FRP wrapping or confinement of concrete enhances its compressive strength, allowing structures to bear higher loads and resist deformation, contributing to the overall structural integrity. Therefore, determining the compressive strength of FRP-confined concrete is essential for retrofitting and rehabilitating existing structures. It enables engineers to assess the potential improvement in load-bearing capacity when applying FRP strengthening to aging or damaged structures. Moreover, understanding the compressive strength of FRP-confined concrete helps ensure the safety and longevity of structures. It allows engineers to design with confidence, considering the increased strength imparted by FRP reinforcement. Furthermore, understanding the compressive strength of concrete confined with FRP assists in optimizing the use and design of FRP applications. This ensures efficient material utilization that aligns with the required strength criteria. Such knowledge contributes to cost-effective design strategies by preventing overdesign and excessive material usage, thereby maintaining structural safety.

Traditional methods reliant on empirical and statistical models to estimate compressive strength often involve laborious processes and might lack accuracy, especially in complex concrete-property relationships. Sofos et al. [85] targeted the precise prediction of concrete mechanical properties, which is crucial for innovative construction materials. They introduced a dataset enriched with material characteristics from uniaxial compression tests on FRP-confined concrete specimens. Twelve algorithms are tested, and the result demonstrates that meticulous dataset curation and algorithm selection result in a rapid, accurate computational model. This model has the potential to replace resource-intensive experiments, offering a viable solution for practical challenges in engineering and scientific domains. Deng et al. [86] focused on enhancing the accuracy of confinement models for FRP-confined concrete cylinders in structural engineering applications. By assembling a dataset from 221 experimental FRP-confined concrete cylinder specimens and selecting eleven critical input parameters, including confining stress and strain ratios, a robust ML technique called GMDH was implemented to build the confinement model. Comparisons with nine existing models showed that the GMDH model outperformed them, exhibiting significant accuracy with R2 of 0.97 for compressive strength and 0.91 for ultimate axial strain. Additionally, the study developed an intuitive graphical user interface (GUI) to provide swift and efficient engineering design references, which are freely accessible. Du et al. [87] introduced a data-driven model using Bayesian hyperparameter optimization to estimate ultimate behavior of FRP-confined concrete. Employing a database of 820 circular cross-section columns, Bayesian optimization XGBoost (BO-XGB) was compared against six empirical models and an un-optimized XGBoost regressor. 
Results showed that BO-XGB outperformed prior models in forecasting compressive strength and axial strain of FRP-confined concrete. Notably, BO-XGB demonstrated enhanced accuracy and stability, particularly in accounting for the influence of lateral confinement pressure on concrete-FRP interactions, setting it apart from previous models.
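
The lateral confinement pressure referred to above can be made concrete with a minimal sketch. For a circular section fully wrapped with an FRP jacket, hoop equilibrium gives a confining pressure of 2 * f_frp * t / D, where f_frp is the hoop stress in the FRP, t is the total jacket thickness, and D is the section diameter. Many empirical models then express the confined strength as the unconfined strength plus a coefficient k1 times this pressure. The linear form, the value of k1, and all input values below are assumptions for illustration only and do not reproduce any specific model from the cited studies.

```python
def lateral_confining_pressure(e_frp, rupture_strain, thickness, diameter):
    """Confining pressure (MPa) from hoop equilibrium of a thin FRP jacket."""
    hoop_stress = e_frp * rupture_strain  # linear-elastic FRP up to rupture
    return 2.0 * hoop_stress * thickness / diameter

def confined_strength(f_co, f_l, k1):
    """Generic linear confinement model: f_cc = f_co + k1 * f_l (k1 is model-dependent)."""
    return f_co + k1 * f_l

# Hypothetical specimen: 150 mm cylinder, 0.5 mm CFRP jacket (illustrative values).
f_l = lateral_confining_pressure(
    e_frp=230000.0,       # FRP elastic modulus, MPa
    rupture_strain=0.01,  # assumed hoop rupture strain
    thickness=0.5,        # total FRP thickness, mm
    diameter=150.0,       # cylinder diameter, mm
)
f_cc = confined_strength(f_co=30.0, f_l=f_l, k1=3.3)  # k1 chosen for illustration
```

Data-driven models such as those reviewed here effectively replace the fixed coefficient k1 with a learned, nonlinear dependence on the geometric and material inputs.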

Ilyas et al. [88] introduced the use of an ML approach, which is multi-expression programming (MEP), to predict compressive strength of CFRP-confined concrete. It utilizes crucial parameters encompassing geometric and mechanical properties, such as specimen height, diameter, CFRP elasticity, concrete strength, and CFRP layer thickness. Extensive statistical analysis, encompassing validation against experimental data and external criteria, is conducted to assess the performance of the model. Further validation of the reliability of the model occurs through parametric analysis and comparison with existing models found in the literature, emphasizing its superior accuracy and predictability. Overall, this proposed MEP-based model demonstrates efficient prediction of CFRP-wrapped structural strength, offering potential utility in rehabilitation and retrofitting for sustainable construction materials. Kumar et al. [89] investigated the prediction of the compressive strength of FRP-confined concrete cylinders using both analytical and ML models. With a substantial dataset comprising 1151 specimens from diverse literature sources, the study employs GP, SVM, ANNs, optimized SVM, and optimized GP models. These models utilize input parameters encompassing specimen geometry, FRP composite properties, and concrete strength. Comparative analysis among these ML models and nineteen analytical models revealed the optimized GP model has the most accurate evaluation metrics in comparison to other models. Specifically, the optimized GP model outperformed others, demonstrating higher precision and efficiency in predicting compressive strength. Jamali et al. [90] investigated the compressive strength of FRP-confined concrete using a database encompassing 1066 specimens. The methods applied include MLP, ANNs, SVM, ANFIS, and their amalgamation with PSO and kriging interpolation. These methodologies were benchmarked against existing models from previous studies. 
The comparison demonstrated that the kriging interpolation method yielded the most accurate estimate of compressive strength, with the lowest error among the tested models. Berradia et al. [91] aimed to estimate the axial strength and strain of CFRP-confined concrete using both general regression analysis and ANNs. The database contained 364 test results for concrete compression members. The ANN models showed higher accuracy, with R² of 0.984 and 0.942 for strength and strain, respectively, along with lower error values (RMSE and MAE). In comparison, the empirical models had lower accuracy, with R² of 0.90 and 0.80 for strength and strain, respectively, and higher error values. This suggests that ANN models are more effective in predicting the behavior of CFRP-wrapped concrete compression members. Valença et al. [92] introduced a contact-free method for measuring strain levels in CFRP laminates using computer vision, aiming for an efficient, automated, and accurate solution. Using digitally deformed synthetic images generated from a low-resolution camera, the proposed architecture applied various methods, from traditional ML to deep learning. Techniques such as dropout and cross-validation provided uncertainty estimates for both the traditional ML algorithms and the ANNs. Among these methods, the ResNet34 deep learning architecture stood out, achieving an RMSE of 0.057‰ for strain prediction. Notably, this contact-free, automated, and cost-effective approach directly measures strain on laminate surfaces, making it applicable for widespread use in pre-stressed laminate applications.
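
The accuracy measures quoted in these comparisons (coefficient of determination, RMSE, and MAE) follow standard definitions. As a minimal sketch, they can be computed as shown below. The function name and the strength values are hypothetical and serve only as an illustration; they are not data from any of the cited studies.

```python
import math

def regression_metrics(measured, predicted):
    """Return (R2, RMSE, MAE) for paired measured and predicted values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    mean_measured = sum(measured) / n
    residuals = [m - p for m, p in zip(measured, predicted)]
    ss_res = sum(r * r for r in residuals)                     # residual sum of squares
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)   # total sum of squares
    if ss_tot == 0.0:
        raise ValueError("R2 is undefined when all measured values are equal")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae

# Hypothetical confined compressive strengths in MPa (illustrative only).
measured = [40.0, 50.0, 60.0, 70.0]
predicted = [42.0, 48.0, 61.0, 69.0]
r2, rmse, mae = regression_metrics(measured, predicted)
print(f"R2 = {r2:.3f}, RMSE = {rmse:.3f} MPa, MAE = {mae:.3f} MPa")
```

R2 is dimensionless, whereas RMSE and MAE carry the units of the predicted quantity, so error values from different studies are comparable only when they refer to the same property and units.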

Go et al. [93] focused on improving the prediction of the decline in tensile strength of GFRP bars under harsh conditions. They introduced an enhanced ensemble ML model for a more accurate prediction of the residual tensile strength of GFRP bars, utilizing experimental data on tensile strength retention. Different ML models, both individual and ensemble, were tested, showing varied accuracy in predicting strength deterioration. The proposed enhanced model achieved notably higher accuracy compared to previous studies, marking a significant advancement in estimating GFRP bar tensile strength under adverse conditions. Thomas et al. [94] aimed to streamline the determination of elastic constants and fiber orientation in short FRP composites, minimizing the need for extensive experimental testing. The methodology, demonstrated in extrusion deposition additive manufacturing (EDAM), allows inverse determination of fiber orientation and polymer properties through composite coupon-level tensile tests. Although initially applied to EDAM, this approach is adaptable to other short FRP systems. Typically, creating composites for additive manufacturing necessitates detailed material characterization, including orthotropic elastic properties obtained through complex sample preparation and micromechanics modeling. Accurately measuring fiber orientation is time-consuming, especially for non-cylindrical fibers. The proposed methodology, accelerated by ML, aims to identify anisotropic mechanical properties and fiber orientation simultaneously. Preliminary outcomes suggest that employing a specific micromechanics model alongside a limited number of tensile tests can derive the nine elastic constants and determine the fiber orientation. To provide a summary, Table 8 illustrates the ML methods and their best performances for the estimation of mechanical properties of FRP implemented on RC members.

Table 8 Summary of ML methods used for estimation of mechanical properties of FRP implemented on RC members

4 Challenges and Limitations of ML for FRP

ML is one of the most useful AI tools: it facilitates the design of robust FRP materials and the creation of automated learning systems that can make decisions without being explicitly reprogrammed. However, these advantages come with a number of obstacles, difficulties, and constraints that must be addressed. First, complete, consistent, reliable, and abundant data must be available in order to apply ML. Second, the black-box algorithms used to solve design problems for reinforced composite materials are frequently criticized for poor interpretability, since they do not incorporate the governing physical principles. Furthermore, the physicochemical interactions among the components of reinforced composite materials are not adequately described to ML algorithms, which leads to inaccurate estimations: if these features are not represented during ML operations, the outcome cannot be properly accounted for. Such circumstances can only be anticipated and prevented by understanding the causal relationship between the inputs and outputs of the model.

Choosing an appropriate ML method for a specific design task is often challenging, given the plethora of available algorithms. Moreover, the limited computing expertise of most material designers has significantly hindered the broader adoption of ML in reinforced composite technology. Many databases are available for FRPs, but they have drawbacks, such as difficulty of access, discrepancies among data produced by different groups, and shortcomings in current databases. Data preparation is one of the most crucial elements of any ML process. The bulk of the effort therefore needs to be focused on the data preparation stage, since inadequately prepared data will prevent ML from yielding reliable findings. For materials problems, ML techniques might not always be the most cost-effective option if the total cost of the training and design processes exceeds that of standard methods.

Overfitting and underfitting are the two most common issues that arise when designing reinforced polymer composite materials with the assistance of ML. A model is said to be “overfitted” when it performs poorly on new data because it has fitted the detail and noise of the training set too closely. In other words, the model picks up noise or random fluctuations in the training set and learns them as if they were genuine patterns. These patterns do not hold for new data, which reduces the generalization ability of the model. Overfitting is more common in nonparametric and nonlinear models, since they are more flexible when learning a target function [95, 96]. For this reason, many nonparametric ML techniques include parameters or other mechanisms that constrain the amount of detail the model learns. Underfitting is the opposite of overfitting: it occurs when an ML model is too simple to capture the underlying structure of the data. This usually happens when only a limited quantity of data is available, or when material designers try to fit a linear model to nonlinear data. In these cases, the rules learned by the model are too basic for the data set, and the model produces inaccurate predictions on both the training data and new data.
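
The contrast between overfitting and underfitting can be illustrated with a small numerical experiment. The sketch below is an assumption-laden toy example rather than a method from the reviewed studies: it uses a k-nearest-neighbour regressor as a stand-in nonparametric model, a synthetic noisy sine curve instead of FRP test data, and hypothetical helper names. The number of neighbours k plays the role of the parameter that constrains how much detail the model learns.

```python
import math
import random

def knn_predict(train_x, train_y, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k

def mse(train_x, train_y, xs, ys, k):
    """Mean squared error of the k-NN model on the points (xs, ys)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)

rng = random.Random(0)

def sample(n):
    """Synthetic data: a sine curve plus Gaussian noise (illustrative only)."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [math.sin(2.0 * math.pi * x) + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys

train_x, train_y = sample(60)
test_x, test_y = sample(200)

results = {}
for k in (1, 7, 60):  # very flexible, moderate, and overly rigid models
    train_err = mse(train_x, train_y, train_x, train_y, k)
    test_err = mse(train_x, train_y, test_x, test_y, k)
    results[k] = (train_err, test_err)
    print(f"k={k:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

With k = 1 the model reproduces every training point, noise included, so its training error is zero while its error on unseen data is not, which is the signature of overfitting. With k equal to the training set size the model predicts the same average everywhere and is poor on both sets, which is the signature of underfitting.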

5 Recommendations and Future Research Directions

For the most part, ML algorithms rely on associations in data to produce analyses and predictions. Material designers, by contrast, generally rely on specific and trustworthy relationships derived by logical reasoning from concrete evidence. Hence, for innovative material manufacturing, a proper connection between ML algorithms and the practice of material designers is essential. One of the main demands of material engineers may be to move from solving problems by data correlation to solving them by logical reasoning. A benchmark dataset is required for developing and testing ML techniques so that novel methods can be evaluated consistently. Such benchmark datasets would also strengthen the current culture of sharing ML code and data, as is customary in the computer and statistical sciences.

As complex neural network-based techniques become more popular, the domain of applicability of ML algorithms, their interpretability, and their use for outlier identification remain significant concerns that will become more challenging as newly developed ML models are employed. The examined studies demonstrate that ML can be utilized to generate novel materials with desired properties. However, they mainly concentrated on applying ML techniques to predict material properties rather than on creating intelligent FRP materials with desired features. Future research may lead to innovative methods that leverage the current understanding of FRP design in training ML models. This advancement could overcome some inherent limitations of the technology.

The promise of using ML in FRP design has not yet been fully realized, and several opportunities and issues remain to be investigated and resolved. Without the robustness and generalizability of human learning, ML algorithms continue to be primarily specialized instruments and are considered black boxes. Although there have been prior attempts to create prototype ML systems with unique features such as lifelong learning and reliable representation schemes, it is uncertain whether human-like learning skills will become available. Nevertheless, ML-assisted design of FRP materials will continue to grow as far as available computing resources allow. It will, however, always depend on specialist training, which will be challenging to maintain as design tasks and environments change.

Because there are many different ML approaches for predicting the various mechanical properties of FRPs, each with advantages and disadvantages, choosing the best ML algorithm can be difficult. This is especially true for researchers in this field, who are probably less knowledgeable about ML than specialists in computer science. Different ML algorithms have been adopted to predict the performance of FRP materials under different loading and environmental conditions, but sensitivity analyses have revealed that the relative influence of the input variables varies with the algorithm used. As a result, comparative studies are required to evaluate the effectiveness of the various algorithms. The analysis should be based not only on statistical performance but also on background knowledge of FRP structures.

In light of the cases analyzed, ML techniques such as SVM, XGBoost, and Super-learner are recommended for other problems involving the design and retrofitting of RC members with FRP. In addition, most ML studies have concentrated on beams retrofitted with FRP, while fewer have addressed FRP-retrofitted columns or slabs. Therefore, applying more novel ML algorithms to columns and slabs is suggested.

Adjusting the hyperparameters of ML algorithms is an essential part of predicting the mechanical properties of FRP structures. This is particularly the case for retrofitted columns, beams, and slabs. One of the most crucial steps before proposing an ML model for this purpose is determining which algorithm is appropriate and what its ideal hyperparameters are for different applications, e.g., compressive strength and bond strength. This is because each algorithm is unique and best suited for a specific application, particularly when it comes to classification challenges. Most previous studies did not report the calibrated hyperparameters of the ML algorithms used or explain how the hyperparameters were chosen. Consequently, for ML algorithms to be used in practice by structural engineers without prior ML experience, a thorough calibration framework is required.
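
One common way to make hyperparameter selection reproducible is a grid search scored by k-fold cross-validation. The sketch below is a minimal illustration under stated assumptions, not the calibration framework called for above: it uses a one-feature ridge regression without intercept as a stand-in model, synthetic data, and hypothetical function names. The regularization strength lam is the only hyperparameter being tuned.

```python
import random

def fit_ridge(xs, ys, lam):
    """Closed-form 1-D ridge regression without intercept: w = Sxy / (Sxx + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def k_fold_mse(xs, ys, lam, n_folds=5):
    """Mean validation MSE of the ridge model over n_folds contiguous folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(n_folds):
        start, stop = fold * n // n_folds, (fold + 1) * n // n_folds
        # Train on everything outside the current fold, validate on the fold.
        train_x = xs[:start] + xs[stop:]
        train_y = ys[:start] + ys[stop:]
        w = fit_ridge(train_x, train_y, lam)
        errs = [(w * x - y) ** 2 for x, y in zip(xs[start:stop], ys[start:stop])]
        fold_errors.append(sum(errs) / len(errs))
    return sum(fold_errors) / n_folds

# Synthetic data in random order (illustrative only): y = 3x + noise.
rng = random.Random(42)
xs = [rng.uniform(0.0, 1.0) for _ in range(50)]
ys = [3.0 * x + rng.gauss(0.0, 0.2) for x in xs]

grid = [0.0, 0.01, 0.1, 1.0, 10.0, 100.0]  # candidate hyperparameter values
cv_scores = {lam: k_fold_mse(xs, ys, lam) for lam in grid}
best_lam = min(cv_scores, key=cv_scores.get)
print(f"best lambda = {best_lam}, CV MSE = {cv_scores[best_lam]:.4f}")
```

Reporting the search grid, the number of folds, and the selected value alongside the results would allow other researchers to reproduce the calibration. For models with several hyperparameters, the same loop extends to the Cartesian product of the candidate values.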

Although these algorithms can yield accurate answers, many experts continue to doubt their reliability because the user is unaware of the processes taking place behind the scenes. It is challenging to understand how the various input factors contributed to the final predicted values, which leads to a lack of confidence when applying the algorithms to real-world problems. Adopting physics-based models, feature importance studies, or techniques like SHAP in conjunction with prediction algorithms are some solutions to this issue. These, too, are topics that require further investigation. Although ML has made the solutions to many complicated engineering problems more accurate, further study addressing the issues above is needed to give confidence in using this approach for real-world problems.
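
A feature importance study can be carried out without access to the internals of a model. The sketch below illustrates permutation importance, a simple model-agnostic technique that is distinct from SHAP: each input column is shuffled in turn and the resulting increase in prediction error is recorded. The black_box function, the feature names, and the data are hypothetical stand-ins for a trained predictor and a real database.

```python
import random

def mse(model, rows, targets):
    """Mean squared error of model predictions against targets."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, n_repeats=10, seed=0):
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the target
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mse(model, shuffled, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

def black_box(row):
    """Hypothetical trained predictor; it ignores the third input entirely."""
    frp_thickness, concrete_strength, unused_input = row
    return 20.0 * frp_thickness + 1.2 * concrete_strength

# Synthetic inputs and noisy targets (illustrative only).
rng = random.Random(1)
rows = [[rng.uniform(0.1, 1.0), rng.uniform(20.0, 60.0), rng.uniform(0.0, 1.0)]
        for _ in range(200)]
targets = [black_box(r) + rng.gauss(0.0, 1.0) for r in rows]

names = ["frp_thickness", "concrete_strength", "unused_input"]
importances = permutation_importance(black_box, rows, targets)
for name, value in zip(names, importances):
    print(f"{name:18s} importance = {value:.2f}")
```

An input that the model does not use receives an importance of zero, while inputs that drive the prediction receive larger values. Such a ranking can then be checked against engineering knowledge of which parameters should govern the response.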

6 Conclusions

FRP exhibits exceptional strength-to-weight ratios, excellent fatigue characteristics, and remarkable resistance to corrosion. When used in externally bonded wraps for RC beams, FRP demonstrates impressive capabilities in enhancing both shear and torsion carrying capacities. The fatigue life of FRP-reinforced RC beams is governed primarily by the tensile steel and the bond between the FRP and concrete. Its superior ductility and energy-absorbing capacity afford FRP high impact and blast resistance. CFRP strengthening stands out for its superior strength improvement and effectiveness in challenging environmental conditions.

Researchers are interested in ML because of its considerable potential for accurately solving structural engineering problems. The exponential rise in publications applying ML techniques to challenging engineering problems has been highlighted recently, and many researchers have shown that this approach can solve regression and classification problems accurately. This paper covers the fundamental terms used in ML, along with a detailed explanation of the AI algorithms utilized in numerous studies. This should make it easier for beginners in this discipline to grasp the fundamentals of ML and to apply it in their future studies. To illustrate the potential of ML in generating precise predictions for issues associated with FRP structures, this study presents a comprehensive review of the literature on predicting the mechanical properties of such structures. This study also examines the main obstacles encountered in this field and the direction in which it may go in the future. The main findings of the present investigation are as follows.

  • The quality and quantity of the databases used for training and validation have a significant impact on the accuracy of ML-based algorithms.

  • Modern codes and standards can apply ML, a quickly developing subject, in place of empirical and semi-empirical prediction models.

  • In order to increase the accuracy of the final model, feature selection techniques are strongly advised, depending on the features and method utilized.

  • A sufficient number of samples should be included in the dataset to reflect all conceivable variations of each characteristic.

  • It is essential to make sure that the samples in the chosen dataset span the most comprehensive range feasible.

  • Most of the works utilized ANNs and SVM models for developing prediction algorithms.

  • There has been an increase in the usage of stacked and ensemble ML techniques, which recently resulted in better performance than conventional methods.

  • Hyperparameter optimization should be employed, based on the developed model, to ensure reaching the highest accuracy.

  • It is necessary to create new ML algorithms that are simple to understand in order to utilize them securely and confidently to address problems related to the structural performance of FRPs.

  • XGBoost showed better performance than most of the models for predicting the shear strength of beams retrofitted with FRP.

  • For the axial loads and compressive strengths of beams retrofitted with FRP, neural-network-based models mostly provided more accurate predictions.

  • Boosting-based algorithms, specifically XGBoost, showed outstanding performance in predicting the axial load and compressive strength of FRP-retrofitted columns.

  • For the interfacial bond strength, there is no firm conclusion as to which model performs best. It is suggested that more stacked models be applied to this property of RC columns.

  • Deep learning models could be employed to verify the results given by other models through comparison.

  • A graphical user interface could complement most of the studies, since it would allow users to conveniently obtain the output parameters from the input variables.

  • Environmental factors should also be considered in most studies, as they influence the mechanical properties of FRP structures.