Chemistry-Wise Augmentations for Molecule Graph Self-supervised Representation Learning

Ondar, Evgeniia; Makarov, Ilya

doi:10.1007/978-3-031-43078-7_27

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14135)

Included in the following conference series:

International Work-Conference on Artificial Neural Networks

Abstract

In molecular property prediction tasks, graph neural networks have become a widely used tool. Recently, self-supervised learning frameworks, especially contrastive learning, have attracted growing attention for their potential to learn molecular representations that generalize across the meaningful chemical space. Unlike supervised learning, self-supervised learning can directly leverage extensive unlabeled data, which significantly reduces the effort needed to acquire molecular property labels through costly and time-consuming simulations or experiments. However, most existing frameworks do not take into account domain-specific cheminformatics knowledge (e.g., molecular fingerprints) or multi-level molecular graph structures (e.g., functional groups).

In toxicity prediction tasks, molecular substructure can be crucial. Structural alerts (e.g., toxicophores) are well studied and have been shown to be responsible for different types of toxicity. In this work, we propose chemistry-wise augmentations for a contrastive learning framework. Two augmentations were implemented: (1) toxicophore subgraph removal and (2) toxicophore subgraph saving. This approach does not violate chemical principles while pushing the model to learn the toxicity-dependent parts of a molecule.
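The two augmentations can be illustrated with a minimal sketch on a toy graph. This is not the authors' implementation: the `MolGraph` type, the function names, and the hand-labelled toxicophore atom set are all hypothetical. "Removal" is read here as dropping the toxicophore atoms and their bonds, and "saving" as keeping only the toxicophore subgraph; the chapter may define either differently, for example by masking atom features rather than deleting atoms. Nitrobenzene is used because the aromatic nitro group is a commonly cited structural alert. The sketch does not repair valences after a cut.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MolGraph:
    """Toy molecular graph: atom symbols plus undirected bonds (index pairs)."""
    atoms: tuple
    bonds: tuple


def induced_subgraph(mol, keep):
    """Return the subgraph on the atom indices in `keep`, re-indexed from 0."""
    order = sorted(keep)
    new_index = {old: new for new, old in enumerate(order)}
    atoms = tuple(mol.atoms[i] for i in order)
    # A bond survives only if both of its endpoints are kept.
    bonds = tuple(
        (new_index[a], new_index[b])
        for a, b in mol.bonds
        if a in new_index and b in new_index
    )
    return MolGraph(atoms, bonds)


def remove_toxicophore(mol, tox_atoms):
    """View 1 (assumed reading): drop the toxicophore atoms and their bonds."""
    keep = set(range(len(mol.atoms))) - set(tox_atoms)
    return induced_subgraph(mol, keep)


def save_toxicophore(mol, tox_atoms):
    """View 2 (assumed reading): keep only the toxicophore subgraph."""
    return induced_subgraph(mol, set(tox_atoms))


# Nitrobenzene heavy atoms: ring carbons 0-5, nitro group N=6, O=7, O=8.
nitrobenzene = MolGraph(
    atoms=("C", "C", "C", "C", "C", "C", "N", "O", "O"),
    bonds=((0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0),
           (0, 6), (6, 7), (6, 8)),
)
nitro_group = {6, 7, 8}  # hypothetical hand-labelled toxicophore match

view_removed = remove_toxicophore(nitrobenzene, nitro_group)
view_saved = save_toxicophore(nitrobenzene, nitro_group)
```

In a contrastive setup, views such as `view_removed` and `view_saved` would be encoded by a GNN and compared with the original molecule. In practice the toxicophore atoms would come from a substructure search rather than a hand-written index set. A different reading of "saving", in which toxicophore atoms are only protected from random masking, would need a different second function.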

Experiments showed that the novel augmentations are more efficient than the random subgraph masking approach commonly used in molecular contrastive learning. A performance comparison with other GNN-based frameworks is also provided.

I. Makarov: The work of Ilya Makarov was carried out within the framework of the strategic project “Digital Business” under the Strategic Academic Leadership Program “Priority 2030” at NUST MISiS.

Author information

Authors and Affiliations

Zelinsky Institute of Organic Chemistry, Moscow, Russia
Evgeniia Ondar
Artificial Intelligence Research Institute (AIRI), Moscow, Russia
Ilya Makarov
AI Center, NUST MISiS, Moscow, Russia
Ilya Makarov

Corresponding author

Correspondence to Ilya Makarov.

Editor information

Editors and Affiliations

University of Granada, Granada, Spain
Ignacio Rojas
University of Malaga, Málaga, Spain
Gonzalo Joya
Polytechnic University of Catalonia, Vilanova i la Geltrú, Spain
Andreu Catala

About this paper

Cite this paper

Ondar, E., Makarov, I. (2023). Chemistry-Wise Augmentations for Molecule Graph Self-supervised Representation Learning. In: Rojas, I., Joya, G., Catala, A. (eds) Advances in Computational Intelligence. IWANN 2023. Lecture Notes in Computer Science, vol 14135. Springer, Cham. https://doi.org/10.1007/978-3-031-43078-7_27

DOI: https://doi.org/10.1007/978-3-031-43078-7_27
Published: 01 October 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43077-0
Online ISBN: 978-3-031-43078-7
eBook Packages: Computer Science, Computer Science (R0)
