Abstract
Graph neural networks have become a widely used tool for molecular property prediction. Recently, self-supervised learning frameworks, especially contrastive learning, have attracted growing attention for their potential to learn molecular representations that generalize across the meaningful chemical space. Unlike supervised learning, self-supervised learning can directly leverage large amounts of unlabeled data, which significantly reduces the effort needed to acquire molecular property labels through costly and time-consuming simulations or experiments. However, most existing frameworks do not take into account domain-specific cheminformatics knowledge (e.g., molecular fingerprints) or multi-level molecular graph structures (e.g., functional groups).
In toxicity prediction tasks, molecular substructure can be crucial. Structural alerts (e.g., toxicophores) are well studied and have been shown to be responsible for different types of toxicity. In this work, we propose chemistry-wise augmentations for a contrastive learning framework. We implement two augmentations: (1) toxicophore subgraph removal and (2) toxicophore subgraph saving. This approach does not violate chemical principles while pushing the model to learn the toxicity-dependent parts of a molecule.
Experiments show that the proposed augmentations are more efficient than the random subgraph masking commonly used in molecular contrastive learning. We also compare performance against other GNN-based frameworks.
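As a rough illustration of the two augmentations and the random baseline named above, the following sketch builds contrastive views of a toy molecular graph. It is not the authors' implementation. Everything in it is an assumption made for illustration: the nitrobenzene-like graph, the hard-coded toxicophore atom indices (in practice these would come from substructure matching), the function names, and in particular the reading of "toxicophore subgraph saving" as "drop random atoms but never toxicophore atoms". Dropping individual atoms is also a simplification of subgraph-level masking.

```python
import random

# Toy molecular graph (assumed example): atom index -> element, undirected bonds.
# A benzene ring (atoms 0-5) carrying a nitro group (atoms 6, 7, 8).
ATOMS = {0: "C", 1: "C", 2: "C", 3: "C", 4: "C", 5: "C", 6: "N", 7: "O", 8: "O"}
BONDS = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6), (6, 7), (6, 8)]
# Hypothetical toxicophore match; hard-coded here instead of being detected.
TOXICOPHORE = {6, 7, 8}


def induced_subgraph(atoms, bonds, keep):
    """Keep only the given atoms and the bonds whose endpoints both survive."""
    kept_atoms = {i: sym for i, sym in atoms.items() if i in keep}
    kept_bonds = [(a, b) for a, b in bonds if a in keep and b in keep]
    return kept_atoms, kept_bonds


def toxicophore_removal(atoms, bonds, toxicophore):
    """View that drops the toxicophore and keeps the rest of the molecule."""
    return induced_subgraph(atoms, bonds, set(atoms) - set(toxicophore))


def toxicophore_saving(atoms, bonds, toxicophore, ratio=0.25, seed=0):
    """View that drops random atoms but never touches the toxicophore (assumed reading)."""
    candidates = sorted(set(atoms) - set(toxicophore))
    k = min(max(1, int(ratio * len(atoms))), len(candidates))
    dropped = set(random.Random(seed).sample(candidates, k))
    return induced_subgraph(atoms, bonds, set(atoms) - dropped)


def random_removal(atoms, bonds, ratio=0.25, seed=0):
    """Baseline view: drop random atoms with no chemical knowledge."""
    k = max(1, int(ratio * len(atoms)))
    dropped = set(random.Random(seed).sample(sorted(atoms), k))
    return induced_subgraph(atoms, bonds, set(atoms) - dropped)


removed_atoms, removed_bonds = toxicophore_removal(ATOMS, BONDS, TOXICOPHORE)
saved_atoms, saved_bonds = toxicophore_saving(ATOMS, BONDS, TOXICOPHORE)
base_atoms, base_bonds = random_removal(ATOMS, BONDS)
```

In a contrastive setup, views like these would be encoded by the same GNN and treated as positives for the original molecule. The baseline differs only in that it may delete toxicophore atoms by chance.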
I. Makarov: The work of Ilya Makarov was carried out within the framework of the strategic project “Digital Business” within the Strategic Academic Leadership Program “Priority 2030” at NUST MISiS.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ondar, E., Makarov, I. (2023). Chemistry-Wise Augmentations for Molecule Graph Self-supervised Representation Learning. In: Rojas, I., Joya, G., Catala, A. (eds) Advances in Computational Intelligence. IWANN 2023. Lecture Notes in Computer Science, vol 14135. Springer, Cham. https://doi.org/10.1007/978-3-031-43078-7_27
DOI: https://doi.org/10.1007/978-3-031-43078-7_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43077-0
Online ISBN: 978-3-031-43078-7
eBook Packages: Computer Science, Computer Science (R0)