The discussers would like to thank the authors for investigating the efficiency of machine learning methods in predicting the splitting tensile strength of high-performance concrete (HPC). In the discussed study (Wu and Zhou 2022), two hybrid models, a genetic algorithm (GA)-based artificial neural network (ANN) and a grid search (GS)-based support vector regression (SVR), are compared with a classical ANN using empirical data collected from the published literature, and the authors conclude that GS-SVR outperforms the other methods in predicting the splitting tensile strength of HPC. The discussers would like to raise some important issues, which the authors and readers may wish to take into account in future studies.

In Wu and Zhou (2022), the parameter settings of the GA-ANN are provided in Table 1 (see Table 2 in the original study). In this table, trainlm is the MATLAB function implementing the Levenberg–Marquardt (LM) algorithm, which is used for training a classical ANN. We wonder how the authors could use this algorithm for training while also stating that they calibrated the ANN using a GA, which is an evolutionary method and does not employ LM. Furthermore, there is no information about the crossover and mutation operators and their values/settings, which are very important in GA optimization. How did the authors find the optimal settings for the GA-ANN model? If a trial-and-error approach was used, which procedure was followed? Was the number of hidden nodes optimized with the generation count and population size held constant, or was another procedure applied? These open questions need further explanation; left unanswered, they make the GA-ANN model irreproducible. In addition, there is not enough information about the classical ANN model developed in Wu and Zhou (2022) (e.g., number of iterations, activation functions, number of hidden nodes). Which software or program did the authors use for the model simulations? Where are the parameters of the GS, and how were they selected? Such omissions weaken the replicability of scientific studies. In studies that use machine learning methods, the details of the modeling process (e.g., the software/program used for the simulations, the ranges of values tried for the model parameters, and the optimal values for the best model) should be clearly reported by the authors to allow reproduction and validation (Wu et al. 2014).
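As an illustration of the level of detail that would make such models reproducible, a minimal sketch is given below. The scikit-learn implementation, the parameter ranges for the grid search, and the GA settings shown are purely illustrative assumptions of the discussers and are not the values used by Wu and Zhou (2022).

```python
# Illustrative sketch only: the libraries, parameter ranges and settings below
# are assumptions of the discussers, not the values used by Wu and Zhou (2022).
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.random((100, 12))   # 12 input variables (placeholder data)
y = rng.random(100)         # splitting tensile strength (placeholder data)

# Grid-search (GS) ranges that a reproducible study should report explicitly
param_grid = {
    "C": [0.1, 1, 10, 100],
    "gamma": [0.001, 0.01, 0.1, 1],
    "epsilon": [0.01, 0.1, 0.5],
}
gs_svr = GridSearchCV(SVR(kernel="rbf"), param_grid, cv=10,
                      scoring="neg_root_mean_squared_error")
gs_svr.fit(X, y)
print("best GS-SVR parameters:", gs_svr.best_params_)

# GA settings that should likewise be reported for a GA-ANN model
ga_settings = {
    "population_size": 50,
    "generations": 100,
    "crossover_probability": 0.8,
    "mutation_probability": 0.05,
    "selection": "roulette wheel",
}
```

Reporting the searched ranges and the selected optima in such an explicit form would allow other researchers to rerun both the GS-SVR and the GA-ANN models.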

Table 1 The parameter settings for GA-ANN (obtained from Wu and Zhou 2022)

Wu and Zhou (2022) attempted to compare their models with the literature (Nguyen et al. 2022) using RMSE, MAE and MAPE. However, such statistics depend strongly on the range and length of the data, so it is difficult to use them for a direct comparison of two different studies. To compare two studies using RMSE, MAE or MAPE, the data split ratio should be the same. The discussers checked the study of Nguyen et al. (2022) and found that they applied tenfold cross-validation, which means that their testing data are not comparable to those of Wu and Zhou (2022). Furthermore, Nguyen et al. (2022) mentioned some missing data and stated that they filled the missing values with the mean of the available data. Moreover, the standard deviations of the data reported by Nguyen et al. (2022) are not the same as those of the data reported by Wu and Zhou (2022). These points need further clarification.
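For reference, assuming the standard definitions of these metrics (with y_i the observed values, ŷ_i the predictions and n the number of testing samples):

```latex
\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2},\qquad
\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|y_i-\hat{y}_i\right|,\qquad
\mathrm{MAPE}=\frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i-\hat{y}_i}{y_i}\right|
```

RMSE and MAE are expressed in the units of the target variable, MAPE is scaled by the observed values, and all three are averaged over the particular samples in the testing set; their values therefore change with the data range and with the split used, which is why they cannot be compared directly between studies with different testing sets.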

In the first paragraph of the results and discussion section, Wu and Zhou (2022) state that they used 80% of the whole data set for training and 20% for testing. However, they also write that the GS-SVR model was trained using tenfold cross-validation. The discussers could not follow the methodology applied by the authors. Does this mean that the GA-ANN and GS-SVR models were not trained under the same data split rule? What do the authors mean by tenfold cross-validation? Did they split the entire data set into ten folds and test the model on each fold in turn? If so, the testing data would not be the same as those of the GA-ANN, for which 20% of the data was used for testing (the difference between the two procedures is illustrated in the sketch below). Why was tenfold cross-validation not also applied to the GA-ANN and ANN models? This point also needs clarification.
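The following minimal sketch, using synthetic placeholder data rather than the data of Wu and Zhou (2022), shows why an 80/20 hold-out split and tenfold cross-validation produce testing sets that are not comparable:

```python
# Illustrative sketch with synthetic data: contrasts an 80/20 hold-out split
# with tenfold cross-validation; not the actual data of Wu and Zhou (2022).
import numpy as np
from sklearn.model_selection import train_test_split, KFold

rng = np.random.default_rng(0)
X = rng.random((100, 12))
y = rng.random(100)

# 80/20 hold-out: one fixed testing set containing 20% of the samples
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
print("hold-out test size:", len(y_te))   # 20 samples, evaluated once

# Tenfold cross-validation: every sample is used for testing exactly once,
# each time in a 10% fold, so the testing data differ from the 20% hold-out set
kfold = KFold(n_splits=10, shuffle=True, random_state=0)
for i, (train_idx, test_idx) in enumerate(kfold.split(X)):
    print(f"fold {i}: test size = {len(test_idx)}")
```

Error statistics computed under the two schemes thus refer to different testing data and cannot be compared directly.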

In the abstract of Wu and Zhou (2022), the authors write "the optimized models were used to train and test the data set, …". This statement is not correct, because optimized models cannot be used to train and test data sets; rather, the training set is used to optimize the models, and the resulting models are then tested and evaluated on the testing data set using the selected evaluation metrics. Again in the same section, they write "… and contribution of these input variables on the output …", although the input variables had not been mentioned before. No expansion is provided for the abbreviations GA-ANN and GS-SVR anywhere in the text (neither in the abstract nor in the introduction or other sections); the discussers could understand their meaning only after carefully reading the research methodology section. In the fourth paragraph of the introduction, the authors refer to ANN, extreme learning machine (ELM), SVR, decision tree, random forest, gradient boosting and adaptive boosting as machine learning (ML) algorithms. This is also not correct, because these are ML methods, and training algorithms are used to obtain their model parameters. Similar incorrect statements appear in the last paragraph of the introduction ("… with machine learning algorithms …" and "… Section 2 describes the two machine learning optimization algorithms GA-ANN and GS-SVR …").

In the caption of Fig. 6, the authors mention rough and fine selection, but no information about these is given in the text. Which methods were used for these selections? There is no equation or reference in the part defining GA-ANN, whereas GS-SVR is presented with several equations and references. It would also be preferable to calibrate both SVR and ANN with the same optimization algorithm, either GA or GS. Another issue worth mentioning is that the abbreviations of the input parameters could be used in Table 1, Fig. 4 and throughout the text instead of x1, x2, …, x12, so that the differences between the parameters are easier to follow.