Chemoinformatics Approach for the Design and Screening of Focused Virtual Libraries

  • Muthukumarasamy Karthikeyan
  • Renu Vyas


Handling large volumes of molecular data is difficult without appropriate tools. Here, we describe why focussed virtual libraries are needed and how they can be developed to design efficient molecules and optimize them for lead generation. Experimental chemists and biologists are more interested in the properties of chemicals, and in the beneficial and adverse responses they produce in biological systems, than in their structures alone. This chapter therefore focuses on relating newly designed chemical structures to their predicted activity, property or toxicity. Property prediction tools save time and money and spare the lives of experimental animals. They support informed decisions, especially in pharmacodynamic studies of drug molecules in humans, where ethical and safety concerns are unavoidable. Property prediction is an important component of virtual screening, which lies at the heart of drug design and is the most important step in which chemoinformatics plays a major role. Structure–activity relationship principles also hold for virtual screening in agrochemicals and environmental science, specifically for predicting the toxicity and biodegradability of pollutant molecules. In this chapter, we show how to design software tools that generate focussed virtual libraries from a given set of molecules sharing common features, fragments or a bioactivity spectrum.


Keywords: Descriptors, Chemical properties, Chemoinformatics, Drug design



Copyright information

© Springer India 2014

Authors and Affiliations

  1. Digital Information Resource Centre, National Chemical Laboratory, Pune, India
  2. Scientist (DST), Division of Chemical Engineering and Process Development, National Chemical Laboratory, Pune, India
