MutagenPred-GCNNs: A Graph Convolutional Neural Network-Based Classification Model for Mutagenicity Prediction with Data-Driven Molecular Fingerprints

Li, Shimeng; Zhang, Li; Feng, Huawei; Meng, Jinhui; Xie, Di; Yi, Liwei; Arkin, Isaiah T.; Liu, Hongsheng

doi:10.1007/s12539-020-00407-2

Original research article
Published: 27 January 2021

Volume 13, pages 25–33, (2021)
Interdisciplinary Sciences: Computational Life Sciences

Shimeng Li¹, Li Zhang¹,²,³, Huawei Feng¹, Jinhui Meng¹, Di Xie¹, Liwei Yi⁴, Isaiah T. Arkin⁵ & Hongsheng Liu¹,²,³ (ORCID: orcid.org/0000-0001-9242-6508)

Abstract

An important task in the early stage of drug discovery is the identification of mutagenic compounds. Mutagenicity prediction models that can interpret relationships between toxicological endpoints and compound structures are especially valuable. In this research, we used an advanced graph convolutional neural network (GCNN) architecture to learn molecular representations and developed predictive models based on these representations. The predictive model based on features extracted by GCNNs can not only predict the mutagenicity of compounds but also identify the structural alerts in compounds. In fivefold cross-validation and external validation, the highest areas under the curve (AUC) were 0.8782 and 0.8382, respectively; the highest accuracies (Q) were 80.98% and 76.63%; the highest sensitivities were 83.27% and 78.92%; and the highest specificities were 78.83% and 76.32%. The model also identified several toxicophores, such as aromatic nitro groups, three-membered heterocycles, quinones, and nitrogen and sulfur mustards. These results indicate that GCNNs can learn the features of mutagens effectively. In summary, we developed a mutagenicity classification model with high predictive performance and interpretability based on a data-driven molecular representation trained through GCNNs.
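
The performance figures in the abstract (area under the curve, accuracy, sensitivity, specificity) are standard binary-classification metrics. The following is a minimal sketch of how such values are typically computed from labels and predicted scores. It is not the authors' code: the function name `classification_metrics`, the 0.5 decision threshold, the 0/1 label encoding (1 = mutagen), and the toy data are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity and AUC for 0/1 labels (1 = mutagen)."""
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")

    # Threshold the scores to obtain hard class predictions.
    y_pred = [1 if s >= threshold else 0 for s in scores]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")

    # AUC as the probability that a randomly chosen mutagen receives a
    # higher score than a randomly chosen non-mutagen (ties count one half).
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)

    return {
        "accuracy": (tp + tn) / len(y_true),   # Q: fraction classified correctly
        "sensitivity": tp / (tp + fn),         # mutagens correctly flagged
        "specificity": tn / (tn + fp),         # non-mutagens correctly cleared
        "auc": wins / (len(pos) * len(neg)),
    }


# Toy example with made-up labels and scores (not data from the article).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
print(classification_metrics(y_true, scores))
```

Note that accuracy, sensitivity, and specificity depend on the chosen threshold, whereas the AUC is computed from the ranking of the scores alone, which is why it is usually reported alongside the thresholded metrics.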

Author information

Shimeng Li and Li Zhang have contributed equally to this work.

Authors and Affiliations

¹ School of Life Science, Liaoning University, Shenyang, 110036, China
Shimeng Li, Li Zhang, Huawei Feng, Jinhui Meng, Di Xie & Hongsheng Liu
² Research Center for Computer Simulating and Information Processing of Bio-Macromolecules of Liaoning Province, Shenyang, 110036, China
Li Zhang & Hongsheng Liu
³ Engineering Laboratory for Molecular Simulation and Designing of Drug Molecules of Liaoning, Shenyang, 110036, China
Li Zhang & Hongsheng Liu
⁴ School of Information, Liaoning University, Shenyang, 110036, China
Liwei Yi
⁵ Department of Biological Chemistry, The Hebrew University of Jerusalem, Edmond J. Safra Campus, Givat-Ram, 91904, Jerusalem, Israel
Isaiah T. Arkin

Corresponding author

Correspondence to Hongsheng Liu.

Supplementary Information

Electronic supplementary material (332 KB) accompanies the online version of this article.

About this article

Cite this article

Li, S., Zhang, L., Feng, H. et al. MutagenPred-GCNNs: A Graph Convolutional Neural Network-Based Classification Model for Mutagenicity Prediction with Data-Driven Molecular Fingerprints. Interdiscip Sci Comput Life Sci 13, 25–33 (2021). https://doi.org/10.1007/s12539-020-00407-2

Received: 06 May 2020
Revised: 24 November 2020
Accepted: 03 December 2020
Published: 27 January 2021
Issue Date: March 2021
DOI: https://doi.org/10.1007/s12539-020-00407-2
