Abstract
In this paper, support vector machine (SVM) and condensed graph of reaction (CGR) approaches were used to predict the regioselectivity of aromatic hydroxylation for human CYP1A2 substrates. Experimental data on aromatic hydroxylation by human cytochrome CYP1A2 (observed, or “real”, molecular transformations) used in the modeling were extracted from the Metabolite database and the XenoSite database. In addition, all potential but unobserved (“unreal”) transformations were generated. The dataset containing “real” and “unreal” transformations was converted into an ensemble of CGRs, that is, pseudomolecules with conventional (single, double, aromatic, etc.) bonds and dynamic bonds characterizing chemical transformations. ISIDA fragment descriptors generated for the CGRs were used for the modeling. The models were validated by fivefold cross-validation repeated three times on the training set and then on an external test set. The final model was constructed as a consensus over models built on different descriptor sets. The predictive performance of our model on the external test set was similar to that of the XenoSite and Way2Drug tools. Unlike previously used approaches based on atom labeling, the proposed CGR-based representation of metabolic transformations could be applied to different types of reactions catalyzed by the same enzyme and is therefore more suitable for automated handling of metabolic data.
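The idea of encoding “real” and “unreal” transformations as CGRs can be illustrated with a minimal sketch. It is not the ISIDA implementation used in the study: the bond-dictionary representation, the function names (`condensed_graph`, `dynamic_bonds`, `enumerate_hydroxylations`), the implicit treatment of hydrogens, and the toy toluene-like substrate with an assumed observed site are all hypothetical choices made only for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Bond = FrozenSet[int]                     # unordered pair of atom indices
BondOrders = Dict[Bond, float]            # 1.5 stands for an aromatic bond
CGR = Dict[Bond, Tuple[float, float]]     # bond -> (order before, order after)


def bond(a: int, b: int) -> Bond:
    return frozenset((a, b))


def condensed_graph(reactant: BondOrders, product: BondOrders) -> CGR:
    """Superpose atom-mapped reactant and product; 0.0 means 'no bond'."""
    return {
        b: (reactant.get(b, 0.0), product.get(b, 0.0))
        for b in set(reactant) | set(product)
    }


def dynamic_bonds(cgr: CGR) -> CGR:
    """Keep only the bonds whose order changes during the transformation."""
    return {b: orders for b, orders in cgr.items() if orders[0] != orders[1]}


def enumerate_hydroxylations(
    substrate: BondOrders,
    aromatic_ch: Iterable[int],
    observed: Set[int],
    oxygen: int,
) -> List[Tuple[int, CGR, int]]:
    """Build one CGR per aromatic C-H site, labeled 1 (real) or 0 (unreal)."""
    samples = []
    for carbon in aromatic_ch:
        product = dict(substrate)
        product[bond(carbon, oxygen)] = 1.0   # hydrogens implicit: C-H -> C-OH
        label = 1 if carbon in observed else 0
        samples.append((carbon, condensed_graph(substrate, product), label))
    return samples


# Toy toluene-like skeleton (assumption): ring carbons 0-5, methyl carbon 6.
ring = {bond(i, (i + 1) % 6): 1.5 for i in range(6)}
substrate = {**ring, bond(0, 6): 1.0}

# Assume hydroxylation was observed only at ring carbon 3; atom 7 is the new O.
samples = enumerate_hydroxylations(
    substrate, aromatic_ch=[1, 2, 3, 4, 5], observed={3}, oxygen=7
)

for carbon, cgr, label in samples:
    changed = {tuple(sorted(b)): o for b, o in dynamic_bonds(cgr).items()}
    print(carbon, label, changed)
```

In this toy setting every candidate CGR differs from the substrate by a single dynamic bond, the newly formed C–O bond, and only the label separates observed from unobserved sites. In a workflow like the one described above, fragment descriptors computed on such labeled CGRs would then serve as input to the classifier.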
Acknowledgements
We thank Prof. Vladimir Poroikov for providing us with the experimental data set and for useful discussions. ChemAxon is acknowledged for the software tools used in this study for data storage and standardization. The study was supported by the Russian Science Foundation (Contract 14-43-00024).
Cite this article
Madzhidov, T.I., Khakimova, A.A., Nugmanov, R.I. et al. Prediction of Aromatic Hydroxylation Sites for Human CYP1A2 Substrates Using Condensed Graph of Reactions. BioNanoSci. 8, 384–389 (2018). https://doi.org/10.1007/s12668-017-0499-7
DOI: https://doi.org/10.1007/s12668-017-0499-7