Turbo prediction: a new approach for bioactivity prediction

Abdo, Ammar; Pupin, Maude

doi:10.1007/s10822-021-00440-3

Published: 21 January 2022

Volume 36, pages 77–85 (2022)

Journal of Computer-Aided Molecular Design

Abstract

Nowadays, activity prediction is key to understanding the mechanism of action of active structures discovered through phenotypic screening or found in natural products. Machine learning is currently one of the most important and rapidly evolving topics in computer-aided drug discovery, where it is used to identify and design new drugs with superior biological activities. The performance of a predictive machine learning model can be enhanced through the optimal selection of learning data, algorithm, algorithm parameters, and ensemble methods. In this article, we focus on how to enhance the prediction model using the learning data. However, obtaining additional and more accurate data is not easy, and in many cases it is not possible. This motivated us to propose the turbo prediction model, in which nearest neighbour structures are used to increase prediction accuracy. Five datasets that are well known in the literature were used in this article, and the experimental results show that turbo prediction can improve the prediction quality of conventional prediction models, particularly for heterogeneous datasets, without any additional effort on the part of the user carrying out the prediction process, and at minimal computational cost.
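
The abstract does not say how the nearest neighbour structures enter the prediction, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes that structures are represented as sets of fingerprint on-bits, that neighbours are ranked by Tanimoto similarity, and that the base model's score for the query is averaged with its scores for the query's k nearest neighbours. The names `tanimoto`, `nearest_neighbours`, `turbo_predict`, and `base_score`, the toy scorer, and the mean fusion rule are all hypothetical choices made for illustration.

```python
from typing import Callable, FrozenSet, List, Sequence

# Assumption: a structure is the set of on-bit indices of a 2D fingerprint.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient between two sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nearest_neighbours(
    query: Fingerprint, pool: Sequence[Fingerprint], k: int
) -> List[Fingerprint]:
    """Return the k structures in the pool most similar to the query."""
    return sorted(pool, key=lambda fp: tanimoto(query, fp), reverse=True)[:k]


def turbo_predict(
    query: Fingerprint,
    pool: Sequence[Fingerprint],
    base_score: Callable[[Fingerprint], float],
    k: int = 3,
) -> float:
    """Fuse the base score of the query with those of its k nearest neighbours.

    Mean fusion is an assumption made for this sketch; the source does not
    state which fusion rule, if any, is used.
    """
    neighbours = nearest_neighbours(query, pool, k)
    scores = [base_score(query)] + [base_score(fp) for fp in neighbours]
    return sum(scores) / len(scores)


# Toy usage: a hypothetical base model that scores a structure by its
# maximum similarity to a single known active.
actives = [frozenset({1, 2, 3, 4})]


def toy_base_score(fp: Fingerprint) -> float:
    return max(tanimoto(fp, active) for active in actives)


pool = [frozenset({1, 2, 3}), frozenset({1, 2, 3, 4, 5}), frozenset({7, 8, 9})]
query = frozenset({1, 2, 5})

plain = toy_base_score(query)
turbo = turbo_predict(query, pool, toy_base_score, k=2)
print(f"base score: {plain:.2f}, neighbour-fused score: {turbo:.2f}")
```

In this toy case the query shares few bits with the known active, but its two nearest neighbours share more, so the fused score is higher than the base score. Whether such a shift improves accuracy on real data depends on the dataset and the base model.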

Acknowledgements

This work was supported by Lille University, CNRS and Programme national d’aide à l’Accueil en Urgence des Scientifiques en Exil (PAUSE).

Author information

Authors and Affiliations

CNRS, Centrale Lille, UMR 9189 CRIStAL, University of Lille, 59000, Lille, France
Ammar Abdo & Maude Pupin
Computer Science Department, Hodeidah University, Draheemy Street, Hodeidah, Yemen
Ammar Abdo

Contributions

The research was conducted by mutual contributions of all authors. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Ammar Abdo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 35 kb)

About this article

Cite this article

Abdo, A., Pupin, M. Turbo prediction: a new approach for bioactivity prediction. J Comput Aided Mol Des 36, 77–85 (2022). https://doi.org/10.1007/s10822-021-00440-3

Received: 17 June 2021
Accepted: 17 December 2021
Published: 21 January 2022
Issue Date: January 2022
DOI: https://doi.org/10.1007/s10822-021-00440-3
