Abstract
Multi-task learning (MTL) is a machine learning paradigm that aims to enhance the generalization of predictive models by leveraging shared information across multiple tasks. The recent breakthroughs achieved by deep neural network models in various domains have sparked hope for similar advances in the chemical sciences. In this Perspective, we provide insights into the current state and future potential of neural MTL models applied to computer-assisted drug design. In the context of drug discovery, one prominent application of MTL is protein–ligand binding affinity prediction, in which individual proteins are considered tasks. Here we introduce the fundamental principles of MTL and propose a framework for categorizing MTL models on the basis of their architecture. This framework enables us to present a comprehensive overview and comparison of a selection of MTL models that have been successfully utilized in drug design. Subsequently, we delve into the current challenges associated with the applications of MTL. One of the key challenges lies in defining suitable representations of the molecular entities under investigation and the respective machine learning tasks.
Acknowledgements
M. Hilleke, I. Pachón Angona, L. Cotos Muñoz, A. Sotiropoulou, J. Ledergerber, H. Wetton, M. Iff, C. Schiebroek, F. Lohmann, A. Ilnicka, P. Schneider, C. Isert and K. Atz are thanked for helpful discussions. This research was supported by the Swiss National Science Foundation (grants 205321_182176, 1-007655-000 and P500PT_214430).
Ethics declarations
Competing interests
G.S. declares a potential financial conflict of interest as co-founder of inSili.com, Zurich, and in his role as scientific consultant to the pharmaceutical industry. The other authors declare no competing interests.
Peer review
Peer review information
Nature Machine Intelligence thanks Oliver Wieder, Paul Wrede and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
About this article
Cite this article
Allenspach, S., Hiss, J.A. & Schneider, G. Neural multi-task learning in drug design. Nat Mach Intell 6, 124–137 (2024). https://doi.org/10.1038/s42256-023-00785-4
DOI: https://doi.org/10.1038/s42256-023-00785-4
- Springer Nature Limited