Abstract
Conventional drug discovery relies primarily on in vitro experiments that test a target molecule against an extensive set of small molecules to select a suitable ligand. Because the search space of candidate ligands is huge, this approach is highly time-consuming and capital-intensive. Virtual screening, a computational technique for reducing this search space and identifying lead molecules, can speed up the drug discovery process. This paper proposes a ligand-based virtual screening method that uses a self-organizing map (SOM), a type of artificial neural network. The proposed method uses two SOMs to predict active and inactive molecules separately. This SOM-based technique can label a small molecule as active, inactive, or undefined. Doing so can reduce the number of false positives in the screening process and improve recall compared with support vector machine and random forest based models. Additionally, by exploiting the parallelism present in the learning and classification phases of a SOM, a graphics processing unit (GPU) based implementation achieves much shorter execution times. The proposed GPU-based SOM tool can successfully evaluate a large number of molecules in the training and screening phases. The source code of the implementation and related files are available at https://github.com/jayarajpbalakrishnan/2_SOM_SCREEN
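The two-SOM labelling idea can be illustrated with a small CPU sketch. This is not the authors' implementation: the abstract does not state the decision rule, so the rule used here (a molecule is "near" a map when its distance to the best-matching unit is within a percentile threshold learned from that map's training data) is an assumption, as are the function names, the 4x4 map size, the learning schedule, and the synthetic 4-dimensional descriptors.

```python
import numpy as np

def train_som(data, rows=4, cols=4, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    # Initialise each unit from a randomly chosen training vector.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    # Grid coordinates of every unit, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                   # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)   # shrinking neighbourhood radius
            # Best-matching unit: the unit whose weight vector is closest to x.
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            grid_d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def quantization_error(weights, x):
    """Distance from x to its best-matching unit."""
    return float(np.min(np.linalg.norm(weights - x, axis=1)))

def fit_threshold(weights, data, percentile=95):
    """Assumed rule: accept distances up to this percentile of training errors."""
    errors = [quantization_error(weights, x) for x in data]
    return float(np.percentile(errors, percentile))

def classify(x, som_active, thr_active, som_inactive, thr_inactive):
    """Label x using both maps; ambiguous or unfamiliar inputs are 'undefined'."""
    near_active = quantization_error(som_active, x) <= thr_active
    near_inactive = quantization_error(som_inactive, x) <= thr_inactive
    if near_active and not near_inactive:
        return "active"
    if near_inactive and not near_active:
        return "inactive"
    return "undefined"

# Synthetic stand-ins for molecular descriptors (hypothetical data).
rng = np.random.default_rng(42)
actives = rng.normal(loc=0.0, scale=0.5, size=(200, 4))
inactives = rng.normal(loc=10.0, scale=0.5, size=(200, 4))

som_active = train_som(actives, seed=1)
som_inactive = train_som(inactives, seed=2)
thr_active = fit_threshold(som_active, actives)
thr_inactive = fit_threshold(som_inactive, inactives)

for query in (actives[0], inactives[0], np.full(4, 5.0)):
    print(classify(query, som_active, thr_active, som_inactive, thr_inactive))
```

In this sketch, a query that resembles neither training set, or that falls within the threshold of both maps, is reported as undefined rather than forced into one class. The distance from a query to each map unit is computed independently of the other units, which is one example of the data parallelism that a GPU version could exploit.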
Acknowledgements
The authors acknowledge the Department of Computer Science and Engineering, NIT Calicut, for their constant support in completing this work. We would also like to thank the Central Computer Centre for providing the GPU server for running the programs. Special thanks to Ms. Juby Johnson, Ms. Sharon Sunny, and Sonaal Pradheep, NIT Calicut, for their valuable suggestions.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
About this article
Cite this article
Jayaraj, P.B., Sanjay, S., Raja, K. et al. Ligand Based Virtual Screening Using Self-organizing Maps. Protein J 41, 44–54 (2022). https://doi.org/10.1007/s10930-021-10030-9
DOI: https://doi.org/10.1007/s10930-021-10030-9