Introduction

Self-compacting concrete (SCC) is a type of concrete that can flow and compact under its own weight, without the need for external vibration. It is a highly fluid concrete that can fill complex forms and confined spaces, making it well suited to precast, prestressed, and reinforced concrete structures (Aslani et al., 2018). SCC has become increasingly popular worldwide due to its many benefits in the construction industry. Its fast-paced construction, enhanced workability, reduced labor requirements, and superior finish make it an attractive option for builders (Singh et al., 2020).

Aggregate makes up 65–80% of the volume of concrete, so the structural performance of concrete depends heavily on it. In addition to durability, aggregate influences the strength, permeability, volume stability, and workability of concrete (Faraj et al., 2019). Large volumes of coarse and fine aggregates are needed to meet the extensive global demand for concrete (Spiesz et al., 2016). To reduce waste and environmental impact, recycled materials can be used to prepare new concrete (Saikia & de Brito, 2013). Using recycled materials instead of mining new aggregates also alleviates aggregate shortages on construction sites.

Every year, a wide variety of plastics is produced to meet human needs. Unfortunately, most of these plastics are made for single use and have very low biodegradability, making them difficult to recycle. As a result, plastic waste contributes to increasing environmental pollution. Accordingly, the use of recycled plastic aggregate (RPA) in SCC has attracted increasing interest in recent years. RPA is made from waste plastic such as bottles, bags, and packaging. It is a sustainable and environmentally friendly alternative to traditional aggregates such as gravel and crushed stone (Gu & Ozbakkaloglu, 2016). Research has shown that using RPA in SCC can improve the mechanical properties of the concrete, such as compressive strength, tensile strength, and flexural strength. RPA is also non-porous, which can help improve the durability and water resistance of the concrete (Gu & Ozbakkaloglu, 2016). Additionally, using RPA in SCC can reduce the carbon footprint of the concrete, as it reduces the need for virgin materials and the amount of waste plastic sent to landfills (Singh et al., 2020). Overall, the use of RPA in SCC is a promising area of research, with many benefits in terms of sustainability, durability, and strength. However, more research is needed to fully understand the potential challenges and limitations of using RPA in SCC and to optimize its use for different applications.

The compressive strength (CS) of SCC is a critical property in engineering structures. Shariati et al. (2022) noted that multiple cubic and cylindrical samples are made and tested at varying curing ages to determine the CS of SCC. The testing procedure is therefore time-consuming and expensive, and work on a construction site should not begin until the CS test results are obtained at a specified age, such as 28 days. However, because mix proportions and components strongly influence its characteristics, determining the CS of SCC without experimental testing has long been a challenge in concrete technology (Shariati et al., 2022). This is particularly evident when pozzolanic materials such as ground granulated blast furnace slag (GGBFS), limestone powder (LP), fly ash, and silica fume partially replace cement, and when recycled aggregates (RAs) replace natural aggregates. To reduce the need for laboratory testing, improved approaches are needed that give engineers simpler methods and mathematical formulas for predicting experimental results.

Soft computing approaches such as genetic algorithms, fuzzy logic systems, and support vector machines have also been used to predict the mechanical properties of concrete. Additionally, a hybrid model using fuzzy logic and particle swarm optimization (PSO) has been used to predict the compressive strength of high-performance concrete (HPCC) (Gao et al., 2019). Overall, soft computing techniques are a suitable solution for predicting the compressive strength of different types of concrete, as they provide a powerful tool for analyzing and predicting the mechanical characteristics of cement-based materials (Kumar et al., 2023).

Artificial neural networks (ANNs), such as backpropagation networks (BP) (Kaveh & Khalegi, 2000; Rumelhart et al., 1986), radial basis function networks (RBF) (Moody & Darken, 1989), and others, are considered crucial components of computational intelligence. Over the years, the effectiveness of ANNs has improved by leveraging a blend of diverse networks and algorithms. Nevertheless, the choice of network type depends on the specific scenario or application at hand (Kaveh & Khalegi, 1998, 2000; Kaveh & Khavaninzadeh, 2023; Kaveh & Iranmanesh, 1998; Kaveh et al., 2001, 2008, 2021; Rofooei et al., 2011).

Other methods in the literature improve on regression analysis by using AI techniques. One of these, the multi-objective genetic algorithm evolutionary polynomial regression (MOGA-EPR) method, is a popular and effective approach for developing predictive models in civil engineering. MOGA-EPR is a multi-objective optimization method that uses genetic algorithms to find optimal solutions for complex problems. It has been applied to various engineering problems, including the prediction of structural behavior, soil-structure interaction, water-structure interaction, and other related areas (Al Hamd et al., 2022; Alzabeebee et al., 2022a, 2022b, 2022c, 2023; Zuhaira et al., 2021).

Additionally, gene expression programming (GEP) (Ferreira, 2006), a relatively new modeling method, has demonstrated superior performance compared to regression methods when it comes to acquiring mathematical relationships in some experimental studies (Abd Elhakam et al., 2012). Numerous studies have highlighted the multiple benefits of GEP over traditional regression methods. Unlike classic regression methods, which analyze predefined functions retrospectively, GEP does not rely on predefined functions. As a result, GEP is considered more robust than other regression methods and even neural networks in terms of modeling and deriving mathematical relationships for experimental studies involving multivariate problems (Bhargava et al., 2011; Ganguly et al., 2009; Shahmansouri et al., 2020).

This research investigates the influence of mixture proportions on the compressive strength (CS) of self-compacting concrete (SCC) from 7 to 400 days of curing. To this end, data from 400 previous tests (Faraj et al., 2022) were examined. Several modeling approaches were used to predict the CS of SCC with various RP aggregates, including the multi-objective genetic algorithm evolutionary polynomial regression (MOGA-EPR) model and two models employing gene expression programming (GEP). The factors taken into account were the RP aggregate content, binder (limestone powder, GGBFS, fly ash, silica fume, or their combination), natural fine and coarse aggregate, water-to-binder (w/b) ratio, and superplasticizer (SP) dosage.

This paper uses the MOGA-EPR and GEP techniques because they yield explicit mathematical models that can be easily interpreted and used for further exploration and analysis (Alzabeebee et al., 2022a, 2022b, 2022c). Applying these techniques to SCC containing recycled plastic is the novel contribution of this paper. By using them, the authors aim to provide a more detailed and interpretable means of predicting the compressive strength of such concrete. This supports future research and gives practitioners a tool for quick calculations.

Methodology

In this study, the predictability of the compressive strength (CS) of green self-compacting concrete (SCC) containing recycled plastic aggregates (RPA) is investigated using the MOGA-EPR and GEP models. An experimental database compiled from the current literature provides the data used to train and evaluate the models. The results of the MOGA-EPR model and the two GEP models are then compared to the linear regression (LR) model established in the literature by Faraj et al. (2022).

Figure 1 summarizes the methodology and procedure used in this investigation as a flowchart. The flowchart covers data collection, statistical analysis, data grouping, model development, calculation of statistical indicators, analysis and assessment of the results, and sensitivity studies.

Fig. 1 Flowchart of the methodology of this paper

Data collection and statistical analysis

Table 1 summarizes the statistical measures for experimental data from the reference Faraj et al. (2022) with a total of 400 observations. It includes the minimum, maximum, average, and standard deviation of the SCC mix proportions, the RP aggregate contents, and the measured CS values.

Table 1 Statistical measures of the collected data

The input variables in Table 1 are the binder content (BC), water-to-binder ratio (w/b), superplasticizer dosage (SP), recycled plastic aggregate content (RPA), natural fine aggregate content (FA), natural coarse aggregate content (CA), and curing time (CT); the output is the measured compressive strength (CS). For each variable, the table reports the minimum, maximum, average, and standard deviation (STDEV).

Data grouping

In this paper, the performance of two alternative approaches to predict the compressive strength (CS) of self-compacting concrete (SCC) with varying recycled aggregates (RP) was compared to the linear regression model (LRM) developed by Faraj et al. (2022). The first approach is MOGA-EPR, and the second is GEP. The data collected was split into two sets: 80% was used to train the models, while the remaining 20% was used to test their performance. The results were then compared to the LRM approach as shown in Tables 2 and 3.

Table 2 Statistical measures of the training dataset
Table 3 Statistical measures of the testing dataset

For the MOGA-EPR and GEP models, Tables 2 and 3 show the statistical measures of the training and testing datasets. The statistical measures include the minimum, maximum, average, and standard deviation of the input and output (CS) datasets.
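The 80/20 grouping described above can be sketched as follows. The text does not state how the observations were assigned to each set, so the seeded random shuffle is an assumption, and the function name `split_train_test` and the integer placeholder records are hypothetical.

```python
import random


def split_train_test(records, train_fraction=0.8, seed=42):
    """Shuffle a copy of `records` and split it into training and testing lists.

    ASSUMPTION: a seeded random shuffle; the paper does not describe the
    assignment rule it used.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder records: integers standing in for the 400 observations.
records = list(range(400))
train, test = split_train_test(records)
print(len(train), len(test))  # 320 80
```

Fixing the seed makes the grouping reproducible, so that all models are trained and tested on the same observations.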

Developing the models

In this paper, two methods, the MOGA-EPR and GEP, were used to estimate the Compressive Strength (CS) of Self-Compacting Concrete (SCC) with various amounts of Recycled Aggregates (RP). The development of the models is explained below:

Multi-objective evolutionary polynomial regression (MOGA-EPR)

Multi-objective evolutionary polynomial regression (MOGA-EPR) is a computational technique that derives a predictive correlation from input data (Table 4). It is based on regression analysis and uses a genetic algorithm (GA) to search for a mathematical correlation between the physical input variables and the output. In the multi-objective variant (MOGA), several objectives are optimized at once, increasing the accuracy and fitness of the correlation while reducing its complexity. The benefits of this method over classical regression include finding the best correlation automatically through a search algorithm and overcoming the overfitting problem commonly seen in other regression methods. To use MOGA-EPR, the user must specify the structure of the correlation, the range of exponents, and the number of terms. A more in-depth explanation of MOGA-EPR can be found in Alani et al. (2014), Alzabeebee et al. (2022a, 2022b, 2022c), Assaad et al. (2021), Giustolisi & Savic (2006), and Zuhaira et al. (2021).
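To illustrate what the structure, exponents, and number of terms mean, the sketch below fits one commonly used EPR expression form, CS = a0 + sum_j a_j * prod_k x_k^E[j][k], for a fixed exponent matrix E. This is only an assumed form for illustration, not Equation (1) of this paper. The genetic search over E and the multi-objective selection are omitted, and the data and the names `epr_terms`, `solve_linear` and `fit_epr` are hypothetical.

```python
def epr_terms(x, exponents):
    """Return [1, term_1, ..., term_m] with term_j = prod_k x[k] ** exponents[j][k]."""
    terms = [1.0]
    for row in exponents:
        value = 1.0
        for xk, e in zip(x, row):
            value *= xk ** e
        terms.append(value)
    return terms


def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z


def fit_epr(X, y, exponents):
    """Least-squares coefficients [a0, a1, ..., am] via the normal equations."""
    T = [epr_terms(x, exponents) for x in X]
    m = len(T[0])
    AtA = [[sum(t[i] * t[j] for t in T) for j in range(m)] for i in range(m)]
    Aty = [sum(t[i] * yi for t, yi in zip(T, y)) for i in range(m)]
    return solve_linear(AtA, Aty)


# Synthetic two-input data (NOT from the paper): y = 2 + 3*x1 + 4*sqrt(x1)/x2
X = [(1.0, 1.0), (2.0, 1.0), (3.0, 2.0), (4.0, 3.0), (5.0, 2.0), (6.0, 5.0)]
y = [2 + 3 * x1 + 4 * x1 ** 0.5 / x2 for x1, x2 in X]
exponents = [[1, 0], [0.5, -1]]  # one row per term, one column per input
coeffs = fit_epr(X, y, exponents)
print([round(c, 6) for c in coeffs])
```

In a full EPR run, the genetic algorithm would propose many candidate exponent matrices from the user-defined exponent range, and a fit like this one would score each candidate.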

The developed models are based on seven input variables (BC, w/b, CT, SP, RPA, FA and CA) and one output variable (CS). Table 4 displays the MOGA-EPR model equation (Equation (1)), which is used to predict the CS of SCC with various amounts of RP. The compressive strength (CS) of the material is measured in MPa and is dependent on the binder content (BC) in kg/m3, the water-to-binder ratio (w/b), the curing time (CT) in days, the superplasticizer dosage (SP) in kg/m3, the recycled plastic aggregate (RPA) in kg/m3, the fine aggregate (FA) in kg/m3, and the coarse aggregate (CA) in kg/m3.

Table 4 MOGA-EPR model equation

Gene expression programming (GEP)

The GEP algorithm is an expanded version of genetic programming (GP) (Ferreira, 2001), and has been demonstrated to be effective in modelling complex and nonlinear processes (Aytek & Kişi, 2008; Faradonbeh et al., 2016). This paper uses GEP to predict the compressive strength (CS) of self-compacting concrete (SCC) with various recycled aggregate (RA) components. GEP encodes individuals in the form of linear chromosomes of fixed lengths, which can then be expressed as tree structures (Ferreira, 2001). Genetic operators, such as mutation and recombination, can be employed on the linear structure of chromosomes, thereby creating valid and correct structures for solutions.
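The mapping from a linear chromosome to an expression tree can be sketched as below, assuming the breadth-first (Karva) reading described by Ferreira (2001). The symbol set, the `ARITY` table and the function names are illustrative choices only, and are not the function set used in this paper.

```python
import math

# Function symbols and their arities; any other symbol is a terminal.
# ASSUMPTION: a small illustrative set, with Q denoting the square root.
ARITY = {"+": 2, "-": 2, "*": 2, "/": 2, "Q": 1}


def decode(gene):
    """Build a (symbol, children) tree by reading the gene breadth-first."""
    nodes = [(symbol, []) for symbol in gene]
    current, next_free = 0, 1
    while current < next_free:
        symbol, children = nodes[current]
        for _ in range(ARITY.get(symbol, 0)):
            children.append(nodes[next_free])
            next_free += 1
        current += 1
    return nodes[0]


def evaluate(node, values):
    """Evaluate the tree for a dictionary of terminal values."""
    symbol, children = node
    if symbol not in ARITY:
        return values[symbol]
    args = [evaluate(child, values) for child in children]
    if symbol == "Q":
        return math.sqrt(args[0])
    if symbol == "+":
        return args[0] + args[1]
    if symbol == "-":
        return args[0] - args[1]
    if symbol == "*":
        return args[0] * args[1]
    return args[0] / args[1]


# "Q*+-abcd" encodes sqrt((a + b) * (c - d)); the trailing "ab" is non-coding.
tree = decode("Q*+-abcdab")
print(evaluate(tree, {"a": 1, "b": 3, "c": 6, "d": 2}))  # 4.0
```

Because unused trailing symbols are simply ignored, a mutation anywhere in the fixed-length string still decodes to a valid tree. This is the property the paragraph above refers to.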

In this paper, GEP analysis was conducted using the GeneXproTools software (Gandomi et al., 2015). The initial population of solutions was generated by incorporating a selection of functions, such as basic arithmetic functions, trigonometric, logarithmic, and polynomial functions, and terminals, which are constant values and independent problem variables. The chromosomes were then presented as tree expressions. The fitness of each member of the population was evaluated with the fitting function (Koza, 1992).

The GEP model follows an iterative process to generate a desirable solution for a given problem (Ferreira, 2001). It starts by randomly producing chromosomes from the initial population, which are expressed as tree expressions. Then, the degree of desirability and compatibility of each chromosome is evaluated. The program is terminated if the desired conditions are met, and the present population displays the answer. The most outstanding individuals from the current population are retained, while the rest are chosen based on their performance. Subsequently, corrections and enhancements are made to the selected population, creating offspring with new traits. These new progenies are then subjected to the same developmental cycle, and the process is repeated for a set number of generations to acquire a satisfactory solution.
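The iterative cycle above can be sketched as a generic elitist evolutionary loop. This is a simplified stand-in rather than GeneXproTools or a full GEP implementation: individuals are plain coefficient lists instead of chromosomes, the only operator is mutation, and the toy problem, parameter values and function names are all assumptions.

```python
import random


def evolve(fitness, new_individual, mutate, pop_size=30, generations=50,
           target=None, seed=0):
    """Minimise `fitness`; return the best individual and the best-fitness history."""
    rng = random.Random(seed)
    population = [new_individual(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = min(population, key=fitness)
        history.append(fitness(best))
        if target is not None and history[-1] <= target:
            break  # desired condition met
        next_population = [best]  # elitism: keep the best individual unchanged
        while len(next_population) < pop_size:
            a, b = rng.sample(population, 2)  # binary tournament selection
            parent = a if fitness(a) <= fitness(b) else b
            next_population.append(mutate(parent, rng))
        population = next_population
    return min(population, key=fitness), history


# Toy problem (NOT from the paper): recover slope and intercept of y = 2x + 1.
points = [(x, 2.0 * x + 1.0) for x in range(6)]


def squared_error(ind):
    return sum((ind[0] * x + ind[1] - y) ** 2 for x, y in points)


def random_line(rng):
    return [rng.uniform(-5, 5), rng.uniform(-5, 5)]


def perturb(ind, rng):
    return [gene + rng.gauss(0, 0.3) for gene in ind]


best, history = evolve(squared_error, random_line, perturb)
print(best, history[0], history[-1])
```

Because the best individual is always carried over, the recorded best fitness can never get worse from one generation to the next.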

Similar to the MOGA-EPR model, the models developed using GEP are based on the same seven input variables and one output variable (CS). The primary setting parameters and adjustments of the GEP model are presented in Table 5.

Table 5 The main setting parameters and adjustments of GEP models

Equations 2 and 3 for the two models of the GEP are provided in Table 6. The compressive strength (CS) of the material is measured in MPa and is dependent on the binder content (BC) in kg/m3, the water-to-binder ratio (w/b), the curing time (CT) in days, the superplasticizer dosage (SP) in kg/m3, the recycled plastic aggregate (RPA) in kg/m3, the fine aggregate (FA) in kg/m3, and the coarse aggregate (CA) in kg/m3.

Table 6 GEP models equations

Equations 1–3 show that the main variables presented by Faraj et al. (2022) have been factored into the predicted CS.

In the following section, the statistical metrics for the various models will be calculated and discussed, and the findings from the models will be contrasted.

Statistical indicators and measurements

The new and existing analytical methods were assessed using statistical indicators: the mean absolute error (MAE), root mean squared error (RMSE), mean, and coefficient of determination (R2) (Eqs. 4–7). The same accuracy evaluation method has been used in many past studies (e.g., Alkroosh et al., 2015; Huang et al., 2019; Kordnaeij et al., 2015; Tinoco et al., 2020; Zhang et al., 2020). Lower MAE and RMSE values indicate a better fit. The mean value should ideally be 1.0; values higher than this indicate an overall overprediction of the compressive strength (CS) and lower values indicate an overall underprediction.

$$\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|{CS}_{p}-{CS}_{m}\right|$$
(4)
$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left({CS}_{p}-{CS}_{m}\right)}^{2}}$$
(5)
$$\mathrm{Mean}=\frac{1}{n}\sum_{i=1}^{n}\left(\frac{{CS}_{p}}{{CS}_{m}}\right)$$
(6)
$${\mathrm{R}}^{2}={\left(\frac{\sum_{i=1}^{n}\left({CS}_{p}-{CS}_{p,\mathrm{average}}\right)\left({CS}_{m}-{CS}_{m,\mathrm{average}}\right)}{\sqrt{\sum_{i=1}^{n}{\left({CS}_{p}-{CS}_{p,\mathrm{average}}\right)}^{2}\sum_{i=1}^{n}{\left({CS}_{m}-{CS}_{m,\mathrm{average}}\right)}^{2}}}\right)}^{2}$$
(7)

In Eqs. 4–7, \(n\) is the number of data points used in the assessment, \({CS}_{p}\) is the predicted compressive strength, and \({CS}_{m}\) is the measured compressive strength.
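As a minimal sketch, the four indicators can be computed directly from paired lists of predicted and measured strengths. The function names and the four sample values below are hypothetical and are not taken from the database used in this paper.

```python
import math


def mae(pred, meas):
    """Mean absolute error, Eq. (4)."""
    return sum(abs(p - m) for p, m in zip(pred, meas)) / len(meas)


def rmse(pred, meas):
    """Root mean squared error, Eq. (5)."""
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(pred, meas)) / len(meas))


def mean_ratio(pred, meas):
    """Mean of predicted/measured ratios, Eq. (6); 1.0 is ideal."""
    return sum(p / m for p, m in zip(pred, meas)) / len(meas)


def r_squared(pred, meas):
    """Squared correlation coefficient between predictions and measurements, Eq. (7)."""
    p_avg = sum(pred) / len(pred)
    m_avg = sum(meas) / len(meas)
    cov = sum((p - p_avg) * (m - m_avg) for p, m in zip(pred, meas))
    var_p = sum((p - p_avg) ** 2 for p in pred)
    var_m = sum((m - m_avg) ** 2 for m in meas)
    return (cov / math.sqrt(var_p * var_m)) ** 2


# Hypothetical strengths in MPa, for illustration only.
measured = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 27.0, 44.0, 50.0]
print(mae(predicted, measured))         # 2.25
print(rmse(predicted, measured))
print(mean_ratio(predicted, measured))
print(r_squared(predicted, measured))
```

Note that Eq. (7) is the squared correlation coefficient, so it can differ from coefficient-of-determination functions in some libraries, which are defined from the residual sum of squares.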

Results

The statistical indicators defined in the previous section were used to calculate the mean absolute error (MAE), root mean squared error (RMSE), mean, and coefficient of determination (R2) of the predicted CS values against the measured CS values for the training and testing datasets, for each of the MOGA-EPR and GEP models, as shown in Table 7 and Fig. 2.

Table 7 Statistical accuracy analysis of the developed models for both datasets
Fig. 2 Statistical accuracy analysis of the developed models for both datasets: a–d for the training dataset and e–h for the testing dataset

Results from Table 7 and Fig. 2 indicate that the MAE for the developed approaches lies between 7.1 and 7.7 for the training datasets, and 7.8 and 8.6 for the testing datasets. In terms of RMSE, the training datasets yield scores between 8.8 and 9.7, while the testing datasets have scores between 10.0 and 10.8. The mean of the datasets ranges from 1.03 to 1.06 for the training datasets and 1.04 to 1.06 for the testing datasets. Lastly, the R2 scores for the training datasets are between 0.82 and 0.85, and 0.78 to 0.80 for the testing datasets.

The statistical indicators calculated from the training and testing datasets presented in Table 7 and Fig. 2 are promising and fairly close to each other. The MOGA-EPR model has the highest R2 compared to the two GEP models, with GEP model (2) having a higher R2 than GEP model (1). Additionally, MOGA-EPR has the lowest MAE and RMSE values among the models. The mean values are all close to 1 for all models.

Figures 3 and 4 compare the predicted and measured values for the training and testing data, respectively, for the three models. These figures show that the majority of the predictions lie close to the perfect fit line and that most of them are within the ±20% error band, which suggests accurate predictions.

Fig. 3 Relationship between measured and predicted CS using the developed models for the training dataset: a MOGA-EPR, b GEP model (1) and c GEP model (2)

Fig. 4 Relationship between measured and predicted CS using the developed models for the testing dataset: a MOGA-EPR, b GEP model (1) and c GEP model (2)

Comparison with the LR model suggested by Faraj et al. (2022)

In this paper, the three models from two different approaches are developed and compared with the Linear Regression model (LRM) suggested by Faraj et al. (2022).

Table 8 and Fig. 5 compare the statistical indicators of the four models. The MOGA-EPR model performs best, with lower MAE, RMSE, and mean values than the other models, together with the highest R2.

Table 8 Statistical accuracy comparison of the developed model with LRM using all of the data
Fig. 5 Statistical accuracy comparison of the developed model with LRM using all the data

The capability of the data-driven models relative to the LR model is shown in Fig. 6 in terms of the cumulative frequency of the error level, in percent. The figure shows that the MOGA-EPR and GEP models perform similarly to each other and predict CS better than the LR model. Moreover, Fig. 7 displays the residual error between the predictions of each model and the measured CS values, demonstrating the accuracy of the MOGA-EPR and GEP models. It is worth emphasizing that the MOGA-EPR model is much simpler than the other models; it is therefore recommended, as it reduces the risk of mistakes in computations.

Fig. 6 Comparison of the error level for different cumulative frequencies

Fig. 7 Residual errors for the models using all of the data
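A cumulative-frequency comparison of the kind reported in Fig. 6 can be sketched as follows. The exact error definition behind the figure is not stated in the text, so the absolute percentage error relative to the measured value is an assumption, and the sample values and function names are hypothetical.

```python
def percent_errors(pred, meas):
    """Absolute error of each prediction as a percentage of the measured value.

    ASSUMPTION: the paper does not define its error level explicitly.
    """
    return [abs(p - m) / m * 100.0 for p, m in zip(pred, meas)]


def cumulative_frequency(errors, levels):
    """Percentage of predictions whose error does not exceed each level."""
    n = len(errors)
    return {level: 100.0 * sum(e <= level for e in errors) / n for level in levels}


# Hypothetical strengths in MPa, for illustration only.
measured = [20.0, 30.0, 40.0, 50.0]
predicted = [21.0, 36.0, 38.0, 65.0]
errors = percent_errors(predicted, measured)
print(cumulative_frequency(errors, [10, 25, 50]))
# {10: 50.0, 25: 75.0, 50: 100.0}
```

Computing the same curve for each model on the same data shows which model reaches a given share of predictions at the lowest error level.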

Sensitivity studies

After assessing the CS values from the various models in the previous sections, the MOGA-EPR model was selected for further sensitivity studies. The MOGA-EPR model was chosen for its simplicity, which makes it easy to use in a sensitivity analysis of the parameters that influence the compressive strength. These studies demonstrate how changing the values of the input variables affects the CS of SCC.

The effect of changing the recycled plastic aggregates content (RPA)

This paper aims to predict the CS of SCC with RPA. This study will investigate the effect of increasing the RPA on CS, while keeping the other factors (as indicated in Table 1) constant at the average values.
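The one-variable-at-a-time procedure can be sketched as below. Equation (1) and the averages in Table 1 are not reproduced in this text, so `toy_predict`, its coefficients and the baseline values are purely hypothetical placeholders; only the `sweep` helper illustrates the procedure.

```python
def sweep(predict, baseline, name, values):
    """Vary one input over `values` while holding the others at `baseline`."""
    results = []
    for value in values:
        inputs = dict(baseline)  # copy, so the baseline itself is untouched
        inputs[name] = value
        results.append((value, predict(inputs)))
    return results


def toy_predict(x):
    # HYPOTHETICAL stand-in for a fitted model; NOT Equation (1) of the paper.
    return 40.0 + 0.02 * x["BC"] - 0.1 * x["RPA"]


# HYPOTHETICAL baseline values; the real averages are those listed in Table 1.
baseline = {"BC": 500.0, "RPA": 50.0}
curve = sweep(toy_predict, baseline, "RPA", [0, 25, 50, 75, 100])
for rpa, cs in curve:
    print(rpa, cs)
```

Replacing `toy_predict` with the fitted equation and `baseline` with the Table 1 averages would give one curve per input variable.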

Changing the RPA in SCC can affect the compressive strength of the concrete. RPA is a type of aggregate made from recycled plastic and used as a partial substitution for other coarse aggregates in concrete. As mentioned earlier in “Introduction”, using RPA can help reduce the environmental impacts of concrete production.

Basha et al. (2020) indicated that raising the amount of RPA in self-compacting concrete can reduce its compressive strength, a consequence of debonding between the RPA and the mortar matrix. Moreover, RPA is more prone to absorbing water, which increases the porosity of the concrete (Rachedi, 2018) and can in turn decrease its strength. The same trend can be seen in Fig. 8a, which shows an inverse relationship between RPA and CS.

Fig. 8 Sensitivity studies

The effect of changing the water-to-binder ratio (w/b)

The CS of SCC with RPA depends on the w/b ratio of the mix; an increase in the water-to-binder ratio can lead to a decrease in CS because the excess water increases porosity and reduces the strength of the concrete (Belalia Douma et al., 2017). Fig. 8b indicates that, with the other factors held at their average values (Table 1), the CS increases as the w/b ratio increases until it reaches its peak at 0.57, after which the CS decreases.

The effect of changing the binder content (BC)

The BC affects the CS of SCC. As the BC increases, the CS also increases due to the addition of more cement paste, which makes the concrete denser and thus increases its strength (Adam, 2011). This trend is seen in Fig. 8c, where the CS rises with an increase in the BC, while all other factors (listed in Table 1) remain constant.

The effect of changing the superplasticizer dosage (SP)

Adding superplasticizers to self-compacting concrete (SCC) enables the production of concrete with high workability, flowability, and water content (Ravindrarajah et al., 2003). Superplasticizers are also known to improve the compressive strength (CS) of SCC (Benaicha et al., 2019). In this study, the effect of increasing the SP on the CS is examined while keeping the other factors constant at their average values (Table 1). As illustrated in Fig. 8d, the findings agree with the prior studies, showing that the CS increases as the SP increases.

Conclusions

The results of this study demonstrate that the novel MOGA-EPR and GEP techniques are highly effective for predicting the compressive strength of self-compacting concrete containing recycled plastic aggregates. Three models were developed, providing a simple and powerful tool for designers. The new approaches achieved significantly improved accuracy compared to the linear regression model (LRM).

Taking into consideration the constraints of the study, the following conclusions can be drawn:

  1. The three proposed models are more accurate than the existing LRM from the literature, with R2 values ranging from 0.81 to 0.84 compared to R2 values below 0.75 for the existing model.

  2. The first proposed model, MOGA-EPR, was the most accurate, with MAE of 7.1 and 7.8, RMSE of 8.8 and 10.0, Mean of 1.04 and 1.06, and R2 of 0.85 and 0.80 for the training and testing datasets, respectively.

  3. The GEP models also showed good accuracy, with MAE values of 7.7 and 7.3, RMSE of 9.7 and 9.5, Mean of 1.06 and 1.03, and R2 of 0.82 and 0.83 for GEP models (1) and (2), respectively, on the training dataset.

  4. The sensitivity studies revealed the effects of the recycled plastic aggregates (RPA) content, superplasticizer (SP) dosage, and binder content (BC) on the compressive strength (CS) of self-compacting concrete (SCC).

These conclusions highlight the potential future impact of the proposed approach. It combines a well-established soft computing method with an artificial intelligence algorithm to develop three practical models. Because the models are explicit equations, they provide practicing engineers and researchers with a consistent and effective means of estimating the CS of SCC containing RPA, and they have the potential to be widely adopted.

While GEP proves to be highly useful in civil engineering, researchers should consider several limitations. These limitations include complexity and parameter tuning, requiring domain expertise and careful configuration of operators, functions, and terminal sets. Overfitting is also a concern, as GEP, like other modeling methods, can become overly complex and fail to generalize well to unseen data. The interpretability of GEP's complex tree structures poses challenges in extracting meaningful insights and explaining underlying mechanisms. Additionally, GEP's performance is reliant on the quality and quantity of available data, with insufficient or noisy data hindering accurate pattern capture. GEP can be computationally intensive, particularly for large datasets or complex problems, limiting its practical application in certain scenarios. Furthermore, GEP's handling of categorical data may not be as effective, necessitating preprocessing and encoding to numeric formats. Despite these limitations, GEP remains a valuable and promising modeling approach, with ongoing research focused on overcoming these challenges and enhancing its performance and applicability across various domains.