Abstract
In this study, aquatic-quantitative structure toxicity relationship (Aqua-QSTR) models for predicting chemical toxicity in aquatic organisms are developed using the electrophilicity-based charge transfer (ECT) descriptor. First, the charge transfer between the selected series of chemical compounds and deoxyribonucleic acid (DNA) bases is analyzed to determine whether the compounds act as electron donors or acceptors. Based on the nature of this interaction, Aqua-QSTR studies were carried out for Fathead minnow and Tetrahymena pyriformis using linear regression (LR), machine learning-based random forest regression (RFR), and support vector regression (SVR) methods. The LR-derived QSTR for the first set of compounds against Fathead minnow, based on maximum ECT values, accounts for 87.7% of the variation in the data with a root mean square error (RMSE) of 0.145. Similarly, the LR-derived Aqua-QSTR for the second set of compounds against Tetrahymena pyriformis, based on maximum ECT values, accounts for 90.6% of the variation in the data with an RMSE of 0.163. Further, the RFR (SVR) model accounts for 96.8% (87.7%) of the variation in the data with an RMSE of 0.074 (0.145) for Fathead minnow, and for 98.1% (90.4%) of the variation with an RMSE of 0.073 (0.165) for Tetrahymena pyriformis. The results demonstrate the utility of ECT for predicting the toxicity of chemical compounds in aquatic organisms.
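The three-way model comparison summarized in the abstract can be sketched with Scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' code (which is given in the supplementary tables): the data are synthetic, the names `ect_max` and `toxicity` are hypothetical, and the hyperparameters (number of trees, SVR kernel, `C`, `epsilon`) are arbitrary choices. The "variation in data" figure is taken here to be the coefficient of determination.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.svm import SVR

# Synthetic stand-in data (NOT the study's data): a single descriptor column
# playing the role of the maximum ECT value, and a noisy linear response
# playing the role of a toxicity endpoint.
rng = np.random.default_rng(0)
ect_max = rng.uniform(-1.0, 1.0, size=(60, 1))
toxicity = 2.0 * ect_max[:, 0] + 4.0 + rng.normal(0.0, 0.15, size=60)

# Hyperparameters below are illustrative assumptions, not values from the paper.
models = {
    "LR": LinearRegression(),
    "RFR": RandomForestRegressor(n_estimators=200, random_state=0),
    "SVR": SVR(kernel="linear", C=10.0, epsilon=0.05),
}

results = {}
for name, model in models.items():
    model.fit(ect_max, toxicity)
    predicted = model.predict(ect_max)
    # Fit statistics on the training data: coefficient of determination and RMSE.
    r2 = r2_score(toxicity, predicted)
    rmse = float(np.sqrt(mean_squared_error(toxicity, predicted)))
    results[name] = (r2, rmse)
    print(f"{name}: R^2 = {r2:.3f}, RMSE = {rmse:.3f}")
```

Because these statistics are computed on the same data used for fitting, a flexible model such as a random forest will tend to look better than a linear one; a held-out set or cross-validation would be needed to compare predictive performance.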
References
Chermette H (1999) Chemical reactivity indexes in density functional theory. J Comput Chem 20:129–154
Geerlings P, De Proft F, Langenaeker W (2003) Conceptual density functional theory. Chem Rev 103:1793–1874
Chattaraj PK, Sarkar U, Roy DR (2006) Electrophilicity index. Chem Rev 106:2065–2091
Padmanabhan J, Parthasarathi R, Elango M, Subramanian V, Krishnamoorthy BS, Gutierrez-Oliva S, Toro-Labbe A, Roy DR, Chattaraj PK (2007) Multiphilic descriptor for chemical reactivity and selectivity. J Phys Chem A 111:9130–9138
Padmanabhan J, Parthasarathi R, Subramanian V, Chattaraj PK (2006) Chemical information insights into the series of chloroanisoles-A theoretical approach. J Mol Struct: Theochem 774:49–57
Padmanabhan J, Parthasarathi R, Subramanian V, Chattaraj PK (2006) Theoretical study on the complete series of chloroanilines. J Phys Chem A 110:9900–9907
Roy DR, Sarkar U, Chattaraj PK, Mitra A, Padmanabhan J, Parthasarathi R, Subramanian V, Van Damme S, Bultinck P (2006) Analyzing toxicity through electrophilicity. Mol Div 10:119–131
Parthasarathi R, Padmanabhan J, Elango M, Chitra K, Subramanian V, Chattaraj PK (2006) pKa prediction using group philicity. J Phys Chem A 110:6540–6544
Roy DR, Parthasarathi R, Padmanabhan J, Sarkar U, Subramanian V, Chattaraj PK (2006) Careful scrutiny of the philicity concept. J Phys Chem A 110:1084–1093
Parthasarathi R, Elango M, Padmanabhan J, Subramanian V, Roy DR, Sarkar U, Chattaraj PK (2006) Application of quantum chemical descriptors in computational medicinal chemistry and chemoinformatics. Indian J Chem 45A:111–125
Padmanabhan J, Parthasarathi R, Subramanian V, Chattaraj PK (2007) Electrophilicity-based charge transfer descriptor. J Phys Chem A 111:1358–1361
Roy SM, Roy DR, Sahoo SK (2015) Toxicity prediction of PHDDs and phenols in the light of nucleic acid bases and DNA base pair interaction. J Mol Graph Modell 62:128–137
Ciaburro G (2018) Regression analysis with R: design and develop statistical nodes to identify unique relationships within data at scale. Packt Publishing Ltd., Birmingham, United Kingdom
Breiman L (1996) Bagging predictors. Mach Learn 24:123–140
Breiman L (2001) Random forests. Mach Learn 45:5–32
Basak D, Pal S, Patranabis DC (2007) Support vector regression. Neural Inf Process Lett Rev 11(10):203–224
Svetnik V, Liaw A, Tong C, Culberson JC, Sheridan RP, Feuston BP (2003) Random forest: a classification and regression tool for compound classification and QSAR modeling. J Chem Inf Comput Sci 43:1947–1958
Rustam Z, Zhafarina F, Saragih GS, Hartini S (2021) Pancreatic cancer classification using logistic regression and random forest. IAES Int J Artif Intell 10:476–481
Zhou Y, Li S, Zhao Y, Guo M, Liu Y, Li M, Wen Z (2021) Quantitative structure-activity relationship (QSAR) model for the severity prediction of drug-induced rhabdomyolysis by using random forest. Chem Res Toxicol 34(2):514–521
Sun G, Li S, Cao Y, Lang F (2017) Cervical cancer diagnosis based on random forest. Int J Perform Eng 13:446–457
Dai B, Chen R, Zhu S, Zhang W (2018) Using random forest algorithm for breast cancer diagnosis. In Proceedings of the international symposium on computer, consumer and control (IS3C) Taichung Taiwan 6–8 December 449–452
Fang X, Liu W, Ai J, He M, Wu Y, Shi Y, Shen W, Bao C (2020) Forecasting incidence of infectious diarrhea using random forest in Jiangsu Province China. BMC Infect Dis 20:1–8
Kamal S, Urata J, Cavassini M, Liu H, Kouyos R, Bugnon O, Wang W, Schneider M (2020) Random forest machine learning algorithm predicts virologic outcomes among HIV infected adults in Lausanne, Switzerland using electronically monitored combined antiretroviral treatment adherence. AIDS Care 33:530–560
Moorthy K, Mohamad M (2011) Random forest for gene selection and microarray data classification. In: Proceedings of the third knowledge technology week, Kajang, Malaysia, 18–22 July, 174–183
Anaissi A, Kennedy PJ, Goyal M, Catchpoole DR (2013) A balanced iterative random forest for gene selection from microarray data. BMC Bioinform 14:1–10
Wu CH, Ho JM, Lee DT (2004) Travel-time prediction with support vector regression. IEEE Trans Intell Transp Syst 5(4):276–281
Jain RK, Smith KM, Culligan PJ, Taylor JE (2014) Forecasting energy consumption of multi-family residential buildings using support vector regression: Investigating the impact of temporal and spatial monitoring granularity on performance accuracy. Appl Energy 123:168–178
Moguerza JM, Muñoz A, Psarakis S (2007) Monitoring nonlinear profiles using support vector machines. CIARP lecture notes in computer science 4756:574–583 Springer
Thissen U, Pepers M, Üstün B, Melssen WJ, Buydens LMC (2004) Comparing support vector machines to PLS for spectral regression applications. Chemom Intell Lab Syst 73(2):169–179
Mei H, Zhou Y, Liang G, Li ZL (2005) Support vector machine applied in QSAR modelling. Chin Sci Bull 50(20):2291–2296
Huang M, Wei Y, Wang J, Zhang Y (2016) Support vector regression-guided unravelling: antioxidant capacity and quantitative structure-activity relationship predict reduction and promotion effects of flavonoids on acrylamide formation. Sci Rep 6(1):32368–32382
Yang X, Li M, Su Q, Wu M, Gu T, Lu W (2013) QSAR studies on pyrrolidine amides derivatives as DPP-IV inhibitors for type 2 diabetes. Med Chem Res 22(11):5274–5283
He L, Jurs PC (2005) Assessing the reliability of a QSAR model’s predictions. J Mol Graph Modell 23:503–523
Elidrissi B, Ousaa A, Ghamali M, Chtita S, Ajana MA, Bouachrine M, Lakhlifi T (2015) The acute toxicity of nitrobenzenes to Tetrahymena pyriformis: combining DFT and QSAR studies. Mor J Chem 3:848–860
Parr RG, Szentpaly LV, Liu S (1999) Electrophilicity index. J Am Chem Soc 121:1922–1924
Becke AD (1988) Density-functional exchange-energy approximation with correct asymptotic behavior. Phys Rev A 38:3098–3100
Lee C, Yang W, Parr RG (1988) Development of the Colle-Salvetti correlation-energy formula into a functional of the electron density. Phys Rev B 37:785–789
Hariharan PC, Pople JA (1973) The influence of polarization functions on molecular orbital hydrogenation energies. Theor Chim Acta 28:213–222
Frisch MJ, Trucks GW, Schlegel HB, Scuseria GE, Robb MA, Cheeseman JR, Scalmani G, Barone V, Petersson GA, Nakatsuji H, Li X, Caricato M, Marenich AV, Bloino J, Janesko BG, Gomperts R, Mennucci B, Hratchian HP, Ortiz JV, Izmaylov AF, Sonnenberg JL, Williams-Young D, Ding F, Lipparini F, Egidi F, Goings J, Peng B, Petrone A, Henderson T, Ranasinghe D, Zakrzewski VG, Gao J, Rega N, Zheng G, Liang W, Hada M, Ehara M, Toyota K, Fukuda R, Hasegawa J, Ishida M, Nakajima T, Honda Y, Kitao O, Nakai H, Vreven T, Throssell K, Montgomery JA Jr, Peralta JE, Ogliaro F, Bearpark MJ, Heyd JJ, Brothers EN, Kudin KN, Staroverov VN, Keith TA, Kobayashi R, Normand J, Raghavachari K, Rendell AP, Burant JC, Iyengar SS, Tomasi J, Cossi M, Millam JM, Klene M, Adamo C, Cammi R, Ochterski JW, Martin RL, Morokuma K, Farkas O, Foresman JB, Fox DJ (2016) Gaussian 16 Revision A.03. Gaussian Inc Wallingford CT
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in python. J Mach Learn Res 12:2825–2830
Project Jupyter, an open-source software. https://jupyter.org/
Acknowledgements
The authors thank the Department of Science and Technology (DST), New Delhi, India, and the Council of Scientific & Industrial Research (CSIR), New Delhi, India, for providing financial support. The authors also thank Professor Paul W Ayers, Professor Frank De Proft, Professor Shubin Liu, Professor Utpal Sarkar, and Professor Alejandro Toro-Labbe for their efforts on the “Festschrift” in honor of Professor Pratim Kumar Chattaraj on the occasion of his 65th birthday. CSIR-IITR Communication no. IITR/SEC/MS/2023/05.
Author information
Contributions
ZA and PS performed the study, tabulated the results, and wrote the manuscript draft. JP did the conceptualization, compilation, tables/figures preparation and draft writing. The work was conceptualized, edited, and supervised by RP. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing financial interests.
Supplementary Information
Below is the link to the electronic supplementary material.
214_2023_2977_MOESM1_ESM.docx
Supplementary file 1: Tables S1–S6 provide the code and its output for the LR, RFR and SVR algorithms, as implemented using Scikit-learn (machine learning in Python) [40] in Jupyter Notebook (version 5.7.8).
About this article
Cite this article
Arif, Z., Singh, P., Parthasarathi, R. et al. Electrophilicity-based charge transfer for developing aquatic-quantitative structure toxicity relationships (Aqua-QSTR). Theor Chem Acc 142, 38 (2023). https://doi.org/10.1007/s00214-023-02977-y
DOI: https://doi.org/10.1007/s00214-023-02977-y