A deep neural network–based approach for prediction of mutagenicity of compounds

Kumar, Rajnish; Khan, Farhat Ullah; Sharma, Anju; Siddiqui, Mohammed Haris; Aziz, Izzatdin BA; Kamal, Mohammad Amjad; Ashraf, Ghulam Md; Alghamdi, Badrah S.; Uddin, Md. Sahab

doi:10.1007/s11356-021-14028-9

Research Article
Published: 24 April 2021

Environmental Science and Pollution Research, Volume 28, pages 47641–47650 (2021)

Rajnish Kumar¹,
Farhat Ullah Khan²,
Anju Sharma³,
Mohammed Haris Siddiqui⁴,
Izzatdin BA Aziz²,
Mohammad Amjad Kamal⁵,⁶,⁷,
Ghulam Md Ashraf⁸,⁹,
Badrah S. Alghamdi⁸,¹⁰ &
…
Md. Sahab Uddin¹¹,¹² (ORCID: orcid.org/0000-0002-0805-7840)

Abstract

We are exposed almost every day to chemical compounds present in the environment, cosmetics, and drugs. Mutagenicity is an important property for establishing a chemical compound’s safety. Exposure to and handling of mutagenic chemicals in the environment pose a high health risk; therefore, identifying and screening these chemicals is essential. Given time constraints and the pressure to avoid the use of laboratory animals, a shift to alternative methodologies that offer rapid and cost-effective detection without undue over-conservatism seems critical. In this regard, computational detection and identification of mutagens in environmental samples such as drugs, pesticides, dyes, reagents, wastewater, cosmetics, and other substances is vital. Over the last two decades, there have been numerous efforts to develop prediction models for mutagenicity, and machine learning methods have so far demonstrated noteworthy performance and reliability. However, the accuracy of such prediction models has always been a major concern for researchers working in this area. In this work, mutagenicity prediction models were developed using a deep neural network (DNN), support vector machine, k-nearest neighbor, and random forest. The classifiers were trained on 3039 compounds and validated on 1014 compounds, each encoded as a vector of 1597 molecular features. The DNN-based prediction model yielded the highest prediction accuracy, 92.95% on the training data and 83.81% on the test data. The areas under the receiver operating characteristic curve and the precision-recall curve were 0.894 and 0.838, respectively. The DNN-based classifier not only fits the data better than the traditional machine learning algorithms, viz., support vector machine, k-nearest neighbor, and random forest (with and without feature reduction), but also yields better performance metrics.
In the current work, we propose a DNN-based model to predict the mutagenicity of compounds.
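
The metrics quoted in the abstract (accuracy, area under the receiver operating characteristic curve, and area under the precision-recall curve) can be illustrated with a minimal Python sketch. The labels and scores below are hypothetical toy values, not the paper's data, and the abstract does not state how the curve areas were computed; the rank-based ROC estimate and the average-precision estimate of the precision-recall area are assumptions chosen for simplicity. The only figures taken from the abstract are the 3039 training and 1014 test compounds.

```python
from typing import Sequence

# Dataset sizes reported in the abstract.
N_TRAIN = 3039
N_TEST = 1014
# Share of all compounds used for training (roughly a 75/25 split).
train_fraction = N_TRAIN / (N_TRAIN + N_TEST)


def accuracy(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of compounds whose thresholded score matches the label."""
    hits = sum(int(s >= threshold) == y for y, s in zip(y_true, y_score))
    return hits / len(y_true)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve as the probability that a random positive
    outscores a random negative (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.
    Assumes no tied scores; precision is averaged at each true positive."""
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    n_pos = sum(y_true)
    true_pos = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            total += true_pos / rank
    return total / n_pos


# Hypothetical toy example: 1 = mutagen, 0 = non-mutagen.
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

toy_accuracy = accuracy(toy_labels, toy_scores)
toy_roc_auc = roc_auc(toy_labels, toy_scores)
toy_pr_auc = average_precision(toy_labels, toy_scores)
```

The gap between the reported training accuracy (92.95%) and test accuracy (83.81%) is the kind of difference this sort of held-out evaluation is meant to expose, which is why the metrics are computed on compounds not seen during training.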

Availability of data and material

Not applicable.

Author information

Authors and Affiliations

Amity Institute of Biotechnology, Amity University Uttar Pradesh, Lucknow Campus, Lucknow, Uttar Pradesh, India
Rajnish Kumar
Computer and Information Sciences Department, Universiti Teknologi Petronas, 32610, Seri Iskander, Perak, Malaysia
Farhat Ullah Khan & Izzatdin BA Aziz
Department of Applied Science, Indian Institute of Information Technology, Allahabad, Uttar Pradesh, India
Anju Sharma
Department of Bioengineering, Integral University, Dasauli, P.O. Basha, Kursi Road, Lucknow, Uttar Pradesh, India
Mohammed Haris Siddiqui
West China School of Nursing / Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu 610041, Sichuan, China
Mohammad Amjad Kamal
King Fahd Medical Research Center, King Abdulaziz University, P. O. Box 80216, Jeddah 21589, Saudi Arabia
Mohammad Amjad Kamal
Enzymoics, Novel Global Community Educational Foundation, Hebersham, New South Wales, Australia
Mohammad Amjad Kamal
Pre-Clinical Research Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
Ghulam Md Ashraf & Badrah S. Alghamdi
Department of Medical Laboratory Technology, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
Ghulam Md Ashraf
Department of Physiology, Neuroscience Unit, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
Badrah S. Alghamdi
Department of Pharmacy, Southeast University, Dhaka, Bangladesh
Md. Sahab Uddin
Pharmakon Neuroscience Research Network, Dhaka, Bangladesh
Md. Sahab Uddin

Contributions

RK, FUK, and AS designed the study and prepared the draft of the manuscript. MHS and IBAA performed the literature review and aided in revising the manuscript. MAK, GMA, BSA, and MSU edited the whole manuscript and improved the draft. All authors read and approved the final submitted version of the manuscript.

Corresponding authors

Correspondence to Rajnish Kumar, Ghulam Md Ashraf or Md. Sahab Uddin.

Ethics declarations

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Conflict of interest

The authors declare no conflict of interest.

Additional information

Responsible Editor: Ludek Blaha

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

ESM 1

(DOCX 1122 kb)

About this article

Cite this article

Kumar, R., Khan, F.U., Sharma, A. et al. A deep neural network–based approach for prediction of mutagenicity of compounds. Environ Sci Pollut Res 28, 47641–47650 (2021). https://doi.org/10.1007/s11356-021-14028-9

Received: 17 February 2021
Accepted: 16 April 2021
Published: 24 April 2021
Issue Date: September 2021
DOI: https://doi.org/10.1007/s11356-021-14028-9
