Abstract
This study explores the efficacy of using the simplified molecular input line entry system (SMILES) representation as the sole feature, in place of quantum chemical properties (QCP), for predicting the corrosion inhibition efficiency (CIE) of N-heterocyclic compounds. The gradient boosting regressor (GBR) model outperforms k-nearest neighbors (KNN), support vector regression (SVR), and other models. Models trained on SMILES alone predict CIE accurately across various datasets, demonstrating the potential of SMILES as a standalone feature. Results indicate a moderate correlation between the SMILES representation and corrosion inhibition properties. The proposed method identifies novel N-heterocyclic derivatives with high CIE, suggesting its utility in discovering corrosion inhibitors.
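To illustrate what using SMILES as the sole feature can look like in practice, the following is a minimal sketch that turns SMILES strings into numeric vectors by counting tokens. The abstract does not state how the SMILES strings were encoded, so the token pattern and the helper names `tokenize` and `featurize` are assumptions made for illustration, not the authors' method. The three example molecules are common N-heterocycles chosen only as sample input.

```python
import re
from collections import Counter

# Assumed token pattern (not taken from the paper): bracket atoms,
# two-letter halogens, two-digit ring closures, then any single character.
TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return TOKEN_RE.findall(smiles)


def featurize(smiles_list):
    """Return (vocabulary, count vectors) for a list of SMILES strings."""
    counts = [Counter(tokenize(s)) for s in smiles_list]
    # Shared, sorted vocabulary so every vector has the same column order.
    vocab = sorted(set().union(*counts))
    vectors = [[c[tok] for tok in vocab] for c in counts]
    return vocab, vectors


# Sample input only: three simple N-heterocycles written as SMILES.
molecules = {
    "pyridine": "c1ccncc1",
    "pyrimidine": "c1cncnc1",
    "imidazole": "c1cnc[nH]1",
}

vocab, X = featurize(list(molecules.values()))
for name, row in zip(molecules, X):
    print(name, dict(zip(vocab, row)))
```

Each row of `X` could then be passed, together with measured CIE values, to a regressor such as GBR, KNN, or SVR. Token counts discard atom connectivity, so a structure-aware encoding would likely be needed for real use.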
Data availability
Data will be shared on reasonable request.
Acknowledgments
All calculations were performed using the Computation Facility at the Research Center for Materials Informatics, Universitas Dian Nuswantoro, Indonesia.
Funding
Not Applicable.
Author information
Contributions
MA: Writing—original draft, Data collection & construction, Conceptualization, Methodology, Investigation, and Analysis; SR: Conceptualization, Review, and Supervision; HKD: Conceptualization, Review, and Supervision.
Ethics declarations
Competing interests
The authors declare no competing financial interests or personal relationships that may have influenced the results reported in this paper.
About this article
Cite this article
Akrom, M., Rustad, S. & Dipojono, H.K. SMILES-based machine learning enables the prediction of corrosion inhibition capacity. MRS Communications (2024). https://doi.org/10.1557/s43579-024-00551-6
DOI: https://doi.org/10.1557/s43579-024-00551-6