Molecular similarity and diversity in chemoinformatics: From theory to applications

  • Review
  • Published:
Molecular Diversity

Abstract

This review surveys molecular similarity and diversity. Key findings reported in recent investigations are selectively highlighted and summarized. Although the overview is centered mainly on chemoinformatics, applications in other areas (pharmaceutical and medical chemistry, combinatorial chemistry, chemical database management, etc.) are also introduced. The first part of the review discusses the approaches used to define and describe the concepts of molecular similarity and diversity in the context of chemoinformatics. The second and third parts describe and analyze different methods and techniques. Finally, the last part enumerates and discusses current applications and problems.

Abbreviations

  • AAB: Advanced Algorithm Builder
  • ADMET: Absorption, Distribution, Metabolism, Excretion and Toxicity
  • ANN: Artificial Neural Networks
  • BCUT: Burden CAS University of Texas (topological descriptors)
  • CART: Classification And Regression Tree
  • CAS: Chemical Abstracts Service (American Chemical Society)
  • CLIP program: Candidate Ligand Identification Program
  • CODESSA: COmprehensive DEscriptors for Structural and Statistical Analysis
  • CoMFA: Comparative Molecular Field Analysis
  • CoMSIA: Comparative Molecular Similarity Indices Analysis
  • CPU: Central Processing Unit
  • CSA: Cluster Significance Analysis
  • CSS: Common SubStructure
  • DARC: Description, Acquisition, Restitution, Computer-aided design
  • DF: Density Function
  • DISSIM: Statistical module to calculate the DISSIMilarity index
  • DMC: Dynamic Mapping of Consensus positions
  • DNA: Deoxyribonucleic Acid
  • DRAGON: Software for the calculation of molecular descriptors
  • FREL: Fragments Reduced to an Environment that is Limited
  • FM: Fragmental Methods
  • FO: Focus
  • GETAWAY: GEometry, Topology and Atom-Weights AssemblY
  • GMLC: Gaussian Maximum Likelihood Classification
  • GTM: Generative Topographic Mapping
  • HIV: Human Immunodeficiency Virus
  • HTS: High Throughput Screening
  • HTSS: Hierarchic Tree Substructure Search Systems
  • HOMO-LUMO: Highest Occupied Molecular Orbital – Lowest Unoccupied Molecular Orbital
  • IR: InfraRed
  • IUPAC: International Union of Pure and Applied Chemistry
  • KNN: K-Nearest Neighbors
  • LaSSI: Latent Semantic Structure Indexing
  • LDA: Linear Discriminant Analysis
  • MACCS: Substructure search system from CambridgeSoft Corporation
  • MaP: Mapping Property distributions of molecular surfaces
  • MDDR: MDL Drug Data Report
  • MEP: Molecular Electrostatic Potential
  • MCSS: Maximal Common Sub-Structure
  • MQS: Molecular Quantum Similarity
  • NMC: Nearest Mean Classifier
  • PCA: Principal Component Analysis
  • Pm: molecular Property characteristic
  • QQSPR or Q2SPR: Quantum Quantitative Structure-Property Relationship
  • QSAR: Quantitative Structure-Activity Relationship
  • QSPR: Quantitative Structure-Property Relationship
  • QSCD: Quantized Surface Complementarity Diversity
  • RDF: Radial Distribution Function
  • RMS: Root Mean Square
  • S4: Substructure search software (Beilstein Institute of Organic Chemistry & Softron Ltd)
  • SIMCA: Soft Independent Modeling of Class Analogy
  • SMILES: Simplified Molecular Input Line Entry Specification
  • SVM: Support Vector Machines
  • ToPD: Total Pharmacophore Diversity
  • UV: UltraViolet
  • WLN: Wiswesser Line Notation
  • WHIM: Weighted Holistic Invariant Molecular

References

  1. Richon, A.B., A History of Computational Chemistry, Network Science (1996). Available at the following URL: http://www.netsci.org/Science/Compchem/feature17a.html

  2. Rouvray, D.H., The evolution of the concept of molecular similarity. In Johnson, M.A. and Maggiora, G.M. (Eds.) Concepts and Applications of Molecular Similarity, John Willey & Sons, New York, Inc. 1990. pp. 15–42.

    Google Scholar 

  3. Rouvray, D.H., Definition and role of similarity concepts in the chemical and physical sciences, J. Chem. Inf. Comp. Sci., 32 (1992) 580–586.

    Article  CAS  Google Scholar 

  4. Kopp, H., Ann. Chem. 41 (1842) 79. Reedited in 1954 as, Kopp, H. Ann. Annalen der Chemie und pharm, 92 (1854) 1.

  5. Richardson, B.W., Rep. Brit. Assoc. Adv. Sci. 34 (1864) 120.

  6. Wiener, H. Structural determination of Paraffin boiling points, J. Amer. Chem. Soc., 69 (1947) 17–20.

    CAS  Google Scholar 

  7. Hansch, C. and Fujita, T., r-s-p analysis – a method for the correlation of biological activity and chemical structure, J. Amer. Chem. Soc., 86 (1964) 1616–1626.

    CAS  Google Scholar 

  8. Marshall, G.R., Barry, C.D., Bosshard, H.E., Dammkoehler, R.A. and Dunn, D.A., in Computer-Assisted Drug Design. Olson E.C. and Christofferson R.E. (Eds.) American Chemical Society Symposium, Vol, 112, American Chemical Society, Washington D.C. 1979, 205–226.

  9. Tripos, Inc., 1699 South Hanley Rd. St. Louis, Missouri, 63144, USA. Information available at the following URL: http://www.tripos.com/

  10. Pavia, M.R., The chemical generation of molecular diversity, Network Science (1994). Available at the following URL: http://www.netsci.org/Science/Combichem/feature01.html.

  11. DeWitt, S.H., Kiely, J.S., Stankovic, C.J., Schroeder, M.C., Reynolds Cody, D.M. and Pavia, M.R., “Diversomers”: An approach to nonpetide, nonoligomeric chemical diversity, Proc. Natl. Acad. Sci. USA, 90 (1993) 6909–6913.

    CAS  Google Scholar 

  12. Carhart, R.E., Smith, D.H. and Venkataraghavan, R., Atom pairs as molecular features in structure-activity studies: Definitions and applications, J. Chem. Inf. Comput. Sci., 25 (1985) 64–73.

    Article  CAS  Google Scholar 

  13. Willett, P., Winterman, V. and Bawden, D., Implementation of nearest neighbor searching in an online chemical structure search system, J. Chem. Inf. Comput. Sci., 26 (1986) 36–41.

    Google Scholar 

  14. Chabala, J., et al., Historical overview of the developing field of molecular diversity, in Gordon E. M. and Kerwin, J.F. Jr. (Eds.), Combinatorial Chemistry and Molecular Diversity in Drug Discovery, Wiley & Sons, New York, 1998, pp. 3–15.

    Google Scholar 

  15. Gasteiger, J. (Ed.) Handbook of Chemoinformatics. From Data to Knowledge. Volume 1 to 4. Wiley-VCH, Germany, 2003.

    Google Scholar 

  16. Bajorath, J. (Ed.) Chemoinformatics. concepts, methods and tools for drug discovery. Methods in Molecular Biology, vol. 275. Humana Press Inc., Totowa, NJ. 2004.

  17. Johnson, A.M. and Maggiora, G.M. (Eds.) Concepts and Applications of Molecular Similarity, John Willey & Sons, New York, Inc. 1990.

    Google Scholar 

  18. Dean, P.M. (Ed.) Molecular Similarity in Drug Design, Chapman & Hall, New York, 1995.

    Google Scholar 

  19. Barbosa, F. and Horvath, D., Molecular similarity and property similarity Curr. Top. Med. Chem., 4 (2004) 589–600.

    Google Scholar 

  20. Perez, J.J., Managing molecular diversity, Chem. Soc. Rev., 34 (2005) 143–152.

    Article  CAS  Google Scholar 

  21. Bender, A. and Glen, R.C., Molecular similarity: A key technique in molecular informatics Org. Biomol. Chem., 2 (2004) 3204–3218.

    Google Scholar 

  22. Leach, A.R. and Gillet, V.J. (Eds.) An Introduction of Chemoinformatics, Kluwer Academic Publishers, 2003.

  23. Gasteiger, J. and Engel, T. (Eds.) Chemoinformatics. A Textbook, Wiley-VCH, Germany, 2003.

    Google Scholar 

  24. Moos, W.H., Green, G.D. and Pavia, M.R, Chapter 33. Recent advances in the generation of molecular diversity, Annual Reports in Medicinal Chemistry, 28 (1993) 315–324.

  25. Mason, J.S. and Hermsmeier, N.A., Diversity assessment, Curr. Op. Chem. Bio., 3 (1999) 342–349.

    CAS  Google Scholar 

  26. Warr, W.A., Commercial software systems for diversity analysis, Perspectiv. Drug Disc. Design, 7/8 (1997) 115–130.

    CAS  Google Scholar 

  27. Sadowski, J. and Kubinyi, H., A scoring scheme for discriminating between drugs and non drugs, J. Med. Chem., 41 (1998) 3325–3329.

    Article  CAS  Google Scholar 

  28. Terstappen, G.C. and Reggiani, A., In silico research in drug discovery, Trends Pharm. Sci., 22 (2001) 23–26.

    Article  CAS  Google Scholar 

  29. Wintner, E. and Moallemi, C.C., Quantized surface complementarity diversity (QSCD): A model based on small molecule-target complementarity, J. Med. Chem., 43 (2000) 1993–2006.

    Article  CAS  Google Scholar 

  30. Pearlman, R.S., Novel software tools for addressing chemical diversity, Network Science (1999). Available at the following URL: http://www.netsci.org/Science/Combichem/feature08.html

  31. Pearlman, R.S. and Smith, K.M., Novel software tools for chemical diversity, Perspectiv. Drug Disc. Design, 9/10/11 (1998) 339–353.

    CAS  Google Scholar 

  32. Bures, M.G. and Martin, Y.C., Computational methods in molecular diversity and combinatorial chemistry, Curr. Opin. Chem. Biol., 2 (1998) 376–380.

    Article  CAS  Google Scholar 

  33. Information available at the following URL: http://pearl1.lanl.gov/periodic/mendeleev.htm

  34. Makara G., Measuring molecular similarity and diversity: Total pharmacophore diversity, J. Med. Chem., 44 (2001) 3563–3571.

    Article  CAS  Google Scholar 

  35. Nikolova, N. and Jaworska, J., Approaches to measure chemical similarity – a review, QSAR Comb. Sci., 22 (2003) 1006–1026.

    Article  CAS  Google Scholar 

  36. Katritzky, A.R., Lobanov, V.S. and Karelson, M., CODESSA Reference Manual, Version 2.0, Gainville, 1996.

  37. Information available at the following URL: http://www.disat.unimib.it/chm/QSARnews2.htm

  38. Willett, P. (Ed.) Similarity and clustering in chemical information systems, Research Studies Press, Letchworth, Herts., U.K., 1987.

    Google Scholar 

  39. Pepperrell, C.A. and Willett, P., Techniques for the calculation of the three-dimensional structural similarity using inter-atomic distances, J. Comput.-Aided Mol. Design, 5 (1991) 455–474.

    Article  CAS  Google Scholar 

  40. Bossert, W., Pattanaik, P.K. and Xu, Y., Similarity of option and the measurement of diversity. Working paper published by the Center for Interuniversity Research in Quantitative Economics (CIREQ) under number 11-2002. Available at the following URL: http://www.sceco.umontreal.ca/publications/etext/2002-11.pdf

  41. Petitjean, M., Geometric molecular similarity from volume-based distance minimization: Application to saxitoxin and tetrodotoxin, J. Comput. Chem., 16 (1995) 80–90.

    Article  CAS  Google Scholar 

  42. Petitjean, M., Three-dimensional pattern recognition from molecular distance minimization, J. Chem. Inf. Comput. Sci., 36 (1996) 1038–1049.

    Article  CAS  Google Scholar 

  43. Petitjean, M., From shape similarity to shape complementarity: Toward a docking theory, J. Math. Chem., 35 (2004) 147–158.

    Article  CAS  Google Scholar 

  44. Petitjean, M., Chiral mixtures, J. Math. Phys., 43 (2002) 4147–4157.

    Article  Google Scholar 

  45. Maggiora, G.M. and Shanmugasundaram, V., Molecular similarity measures. In Bajorath, J. (Ed.) Methods in Molecular Biology, vol. 275. Chemoinformatics. Concepts, Methods and Tools for Drug Discovery. Humana Press Inc., Totowa, NJ. 2004. pp.1–50.

  46. Willett, P. and Winterman, V.A., Comparison of some measures for the determination of intermolecular structural similarity measures, Quant. Struct.-Act. Relat., 5 (1986) 18–25.

    CAS  Google Scholar 

  47. Holliday, J.D., Hu, C.Y. and Willett, P., Grouping of coefficients for the calculation of Inter-molecular similarity and dissimilarity using 2D fragment Bit-Strings, Comb. Chem. High Throughput Screening, 5 (2002) 155–166.

    CAS  Google Scholar 

  48. Haffri, Y., Chapter 1: Distance measures, INA Internal Report, Institut National de l'Audiovisuel (INA), France, 2003.

  49. Holliday, J.D., Salim, N., Whittle, M. and Willett, P., Analysis and display of the size of chemical similarity coefficients, J. Chem. Inf. Comput. Sci., 43 (2003) 819–828.

    CAS  Google Scholar 

  50. Bath, P.A., Morris, C.A. and Willett, P., Effects of standardization on fragment-based measures of structural similarity, J. Chemomet., 7 (1993) 543–550.

    CAS  Google Scholar 

  51. Brown, R.D., Descriptors for diversity analysis, Persp. Drug Disc. Design, 7/8 (1997) 31–49.

    CAS  Google Scholar 

  52. Todeschini, R. and Consonni, V., Handbook of molecular descriptors, in Mannhold, R., Kubinyi, H. and Timmerman, H. (Eds.), Series of Methods and Principles of Medicinal Chemistry – vol. 11, Wiley-VCH, New York, 2000.

  53. Martin, Y.C., Bures, M.G. and Brown, R.D., Validated descriptors for diversity measurements and optimization, Pharm. Pharmacol. Commun., 4 (1998) 147–152.

    CAS  Google Scholar 

  54. Martin, Y.C., Molecular Diversity: How we measure it? Has it lived up to its promise?, Il Farmaco 56 (2001) 137–139.

  55. Willett, P., Chemoinformatics – similarity and diversity in chemical libraries, Current Opinion in Biotechnology, 11 (2000) 85–88.

    Article  CAS  Google Scholar 

  56. Willett, P., Barnard, J.M. and Downs, G.M., Chemical similarity searching, J. Chem. Inf. Comput. Sci., 38 (1998) 983–996.

    Article  CAS  Google Scholar 

  57. Gillet, V., Willett, P. and Bradshaw, J., Similarity searching using reduced graphs, J. Chem. Inf. Comput. Sci., 43 (2003) 338–345.

    Article  CAS  Google Scholar 

  58. Randic, M., Molecular shape profiles, J. Chem. Inf. Comput. Sci., 35 (1995) 373–382.

    CAS  Google Scholar 

  59. Barnard, J.M., Substructure searching methods: Old and new, J. Chem. Inf. Comput. Sci., 33 (1993) 532–538.

    Article  CAS  Google Scholar 

  60. Mezey, P.G., The degree of similarity of three-dimensional bodies: Application to molecular shape analysis, J. Math. Chem., 7 (1991) 39–49.

    CAS  Google Scholar 

  61. Todeschini, R., Lasagni, R. and Marengo, E., New molecular descriptors descriptor for 2D and 3D structures. Theory, J. Chemometrics, 8 (1994) 263–272.

    Article  CAS  Google Scholar 

  62. Randic, M., Molecular profiles, novel geometry-dependent molecular descriptors, New J. Chem., 19 (1995) 781–791.

    CAS  Google Scholar 

  63. Ghuloum, A.M., Sage, C.R., Jain, A.N., Anwar, M.G., Carleton, R.S. and Ajay, N.J., Molecular hashkeys: A novel method for molecular characterization and its application for predicting important pharmaceutical properties of molecules, J. Med. Chem., 42 (1999) 1739–1748.

    Article  CAS  Google Scholar 

  64. Stiefl, N. and Baumann, K., Mapping property distributions of molecular surfaces: Algorithm and evaluation of a novel 3D quantitative structure-activity relationship technique, J. Med. Chem., 46 (2003) 1390–1407.

    Article  CAS  Google Scholar 

  65. Carbó, R., Leyda, L. and Arnau, M., An electron density measure of the similarity between two compounds, Int. J. Quantum Chemistry, 17 (1980) 1185–1189.

    Google Scholar 

  66. Kier, L.B. and Hall, L.H., An electrotopological-state index for atoms in molecules, Pharm. Res., 7 (1990) 801–807.

    Article  CAS  Google Scholar 

  67. Kubinyi, H. (Ed.) 3D QSAR in Drug Design: Theory, Methods and Applications, ESCOM Science Publishers B.V., Leiden, 1993.

    Google Scholar 

  68. Yao, J., Fan, B.T., Doucet, J.P., Panaye, A., Yuan, S. and Li, J., SIRSS-SS: A system for simulating IR/Raman spectra. 1. Substructure/subspectrum correlation, J. Chem. Inf. Comput. Sci., 41 (2001) 1046–1052.

    Article  CAS  Google Scholar 

  69. Panaye, A., Doucet, J.P. and Fan, B.T., Topological approach of C13-NMR spectral simulation: Application to fuzzy substructures, J. Chem. Inf. Comput. Sci., 33 (1993) 258–265.

    Article  CAS  Google Scholar 

  70. Davies, K. and Briant, C., Combinatorial chemistry library design using pharmacophore diversity, Network Science, (1995). Available at the following URL: http://www.netsci.org/Science/Combichem/feature05.html

  71. Faulon, J.-L., The signature Descriptor. 1. Using extended valence sequences in QSAR and QSPR studies, J. Chem. Inf. Comput. Sci., 43 (2003) 707–720.

    CAS  Google Scholar 

  72. Consonni, V., Todeschini, R. and Pavan, M., Structure/response correlation and similarity/diversity analysis by GETAWAY descriptors. 1. Theory of the novel 3D molecular descriptors, J. Chem. Inf. Comput. Sci., 42 (2002) 682–692.

    CAS  Google Scholar 

  73. Consonni, V., Todeschini, R. and Pavan, M., Structure/response correlation and similarity/diversity analysis by GETAWAY descriptors. 2. Application of the novel 3D molecular descriptors to QSAR/QSPR studies, J. Chem. Inf. Comput. Sci., 42 (2002) 693–705.

    CAS  Google Scholar 

  74. Jain, A.N., Morphological similarity: A 3D molecular similarity method correlated with protein-ligand recognition, J. Comput.-Aided Mol. Design, 14 (2000) 199–213.

    Article  CAS  Google Scholar 

  75. Todeschini, R. and Gramatica, P., 3D-modelling and prediction by WHIM descriptors. Part 5. Theory development and chemical meaning of WHIM descriptors, Quantum Struct.-Act. Relat., 16 (1997) 113–119.

    CAS  Google Scholar 

  76. Mason, J.S., Morize, I., Menard, P.R., Cheney, D.L., Hulme, C. and Labaudiniere, R.F., New 4-point pharmacophore method for molecular similarity and diversity applications: Overview of the method and applications, including a novel approach to the design of combinatorial libraries containing privileged substructures, J. Med. Chem., 42 (1999) 3251–3264.

    Article  CAS  Google Scholar 

  77. Walters, W.P., Stahl, M.T. and Murcko, M.A. Virtual screening – an overview, Drug Discovery Today, 3 (1998) 160–178.

    Article  CAS  Google Scholar 

  78. Patterson, D.E., Cramer, R.D., Ferguson, A.M., Clark, R.D. and Weinberger, L.E., Neighborhood behavior: A useful concept for validation of “Molecular diversity” descriptors, J. Med. Chem., 39 (1996) 3049–3059.

    Article  CAS  Google Scholar 

  79. Martin, Y.C., Kofron, J.L. and Traphagen, L.M., Do structurally similar molecules have similar biological activity? J. Med. Chem., 45 (2002) 4350–4358.

    CAS  Google Scholar 

  80. Doucet, J.P. and Panaye, A., 3D Structural information: Form property prediction to substructure recognition with neural networks, SAR and QSAR Envirom. Res., 8 (1998) 249–272.

    CAS  Google Scholar 

  81. Gund, P., Andose, J.D., Rhodes, J.B. and Smith G.M., Three-dimensional molecular modeling and drug design, Science, 208 (1980) 1425–1431.

    CAS  Google Scholar 

  82. Doucet, J.P. and Weber, J.K. (Eds.) Computer-Aided Molecular Design. Theory and Applications, Academic Press, London, 1996.

    Google Scholar 

  83. Pepperrell, C.A., Taylor, R. and Willett, P., Implementation and use of an atom-mapping procedure for similarity searching in databases of three-dimensional chemical structures, Tetrahedron Computer Methodology, 3 (1990) 55–63.

    Article  Google Scholar 

  84. Bajorath, J., Virtual Screening in drug discovery: Methods, expectations and reality. Available at the following URL: http://www.currentdrugdiscovery.com

  85. Turin, L. and Fumiko, Y., Structure-odor relations: A modern perspective. Available at the following URL: http://www.flexitral/research/review_final.pdf

  86. Meylan, W.M., Howard, P.H., Boethling, R.S., Aronson, D., Printup, H. and Gouchi, S., Improved methods for estimating bioconcentration/bioaccumulation factor from Octanol/Water partition coefficient, Environ. Toxicol. Chem., 18 (1999) 664–672.

    Article  CAS  Google Scholar 

  87. Gorse, D., Rees, A., Kaczorek, M. and Lahana, R., Molecular diversity and its analysis, Drug Disc.Today, 4 (1999) 257–264.

    CAS  Google Scholar 

  88. Japertas, P., Didziapetris, R. and Petrauskas, A., Fragmental Methods in the design of new compounds. Applications of the advanced algorithm builder, QSAR, 21 (2002) 23–37.

    CAS  Google Scholar 

  89. Cuissart, B., Touffet, F., Crémilleux, B., Bureau, R. and Rault, S., The maximum common substructure as a molecular depiction in a supervised classification context: Experiments in quantitative structure/biodegradability relationships, J. Chem. Inf. Comput. Sci., 42 (2002) 1043–1052.

    Article  CAS  Google Scholar 

  90. Gasteiger, J., Empirical approaches ao the calculation of properties. In Gasteiger, J. and Engel T. (Eds.), Chemoinformatics – A Textbook, Wiley-VCH, Germany, 2003. pp. 320–337.

    Google Scholar 

  91. Mannhold, R., Rekker, R.F., Sonntag, C., Ter Laak, A.M., Dross, K. and Polymeropoulos, E.E., Comparative evaluation of the predictive power of calculation procedures for molecular lipophilicity, J. Pharm. Sci., 84 (1995) 1410–1419.

    CAS  Google Scholar 

  92. Mannhold, R. and Van de Waterbeemd, Substructure and whole molecule approaches for calculating log P, J. Comput. Aided Mol. Des., 15 (2001) 337–354.

    Article  CAS  Google Scholar 

  93. Norinder, U., Osterberg, T. and Artusson, P., Theoretical calculation and prediction of Caco-2 cell permeability using MolSurf parametrization and PLS statistics, Pharm. Res., 14 (1997) 1786–1791.

    Article  CAS  Google Scholar 

  94. Norinder, U., Osterberg, T. and Artusson, P., Theoretical calculation and prediction of intestinal absorption of drugs using MolSurf parametrization and PLS statistics, Eur. J. Pharm. Sci., 8 (1999) 49–56.

    Article  CAS  Google Scholar 

  95. Palm, K., Stenberg, P., Luthman, K. and Artusson, P., Polar molecular surface properties predict the intestinal absorption of drugs in humans, Pharm. Res., 14 (1997) 568–571.

    Article  CAS  Google Scholar 

  96. Stenberg, P., Luthman, K. and Artursson, P., Prediction of membrane permeability to peptides from calculated dynamic molecular surface properties, Pharm. Res., 16 (1999) 205–212.

    CAS  Google Scholar 

  97. Stenberg, P., Norinder, U., Luthman, K. and Artursson, P., Experimental and computational screening models for the prediction of intestinal drug absorption, J. Med. Chem., 44 (2001) 1927–1937.

    Article  CAS  Google Scholar 

  98. Bergström, A.S., Computational and Experimental Models for the Prediction of Intestinal Drug Solubility and Absorption, Thesis book, Uppsala University, 2003.

  99. Gasteiger, J., Physicochemical effects in the representation of molecular structures for drug designing. Mini Rev. Med. Chem., 3, 789–796 (2003).

    Google Scholar 

  100. Torrens, F., Structural, chemical topological, electrotopological and electronic structure hypotheses, Comb. Chem. High Throughput Screening, 6 (2003) 801–809.

    CAS  Google Scholar 

  101. Wiswesser, W.J.A. (Ed.), A Line-Formula Chemical Notation, Crowell, New Tork, 1954.

    Google Scholar 

  102. Smith, E.G. (Ed.) Wiswesser Line-Formula Chemical Notation Method (WLN), Mc Graw Hill, New York, 1968, pp. 77.

    Google Scholar 

  103. Ash, S., Cline, M.A., Homer, R.W., Hurst, T. and Smith, G.B., SLN (SYBYL line notation), J. Chem. Inf. Comput. Sci., 37 (1997) 71–79.

    Article  CAS  Google Scholar 

  104. Weininger, D., Weininger, A. and Weininger, J.L., SMILES (Simplified Molecular Input Line Entry System), J. Chem. Inf. Comput. Sci., 29 (1989) 97–101. For more information see the URL: http://www.daylight.com/dayhtml/smiles

  105. Weininger, D., SMILES (Simplified Molecular Input Line Entry System), J. Chem. Inf. Comput. Sci., 28 (1988) 31–36.

    Article  CAS  Google Scholar 

  106. Vidal, D., Thormann, M. and Pons, M., LINGO, an efficient holographic text based method to calculate biophysical properties and intermolecular similarities, J. Chem. Inf. Model., 45 (2005) 386–393.

    Article  CAS  Google Scholar 

  107. Luque Ruiz, I., Cerruelo Garcia, G. and Gomez-Nieto, M.A., Representation of the molecular topology of cyclical structures by means of cycle graphs. 2. Applications to clustering of chemical databases, J. Chem. Inf. Comp. Sci., 44 (2004) 1383–1393.

    Google Scholar 

  108. Cuissart, B., Touffet, F., Cremilleux, B., Bureau, R. and Rault, S., The maximum common substructure as a molecular depiction in a supervised classification context: Experiments in quantitative structure/biodegradability relationships, J. Chem. Inf. Comput. Sci., 42 (2002) 1043–1052.

    Article  CAS  Google Scholar 

  109. Lesk, A.M., Detection of 3D patterns of atoms in chemical structures, Comm. ACM, 22 (1979) 219–224.

    Article  Google Scholar 

  110. Barrow, H.G. and Burstall, R.M., Subgraph isomorphism, matching relational structures and maximal cliques, Inf. Proc. Lett., 4 (1976) 83–84.

    Article  Google Scholar 

  111. Ullman, J.R., An algorithm for subgraph isomorphism, J. ACM., 23 (1976) 31–42.

    Article  Google Scholar 

  112. Jorgensen, A.M. and Pedersen, J.T., Structural diversity of small molecule libraries, J. Chem. Inf. Comput. Sci., 41 (2001) 338–345.

    Article  CAS  Google Scholar 

  113. Bron, C. and Kerbosh, J., Finding all cliques of an undirected graph, Commun. ACM, 16 (1973) 575–577. Available at the following URL: http://www.nap.edu/readingroom/books/mctcc/index.html

  114. Crandell, C.W. and Smith, D.H., Computer-assisted examination of compounds for common three-dimensional substructures, J. Chem. Inf. Comput. Sci., 23 (1983) 186–197.

    Article  CAS  Google Scholar 

  115. Ivanciuc, O., Taraviras, S.L. and Cabrol-Bass, D., Quasi-orthogonal basic sets of molecular graphs descriptors as a chemical diversity measure, J. Chem. Inf. Comput. Sci., 40 (2000) 126–134.

    CAS  Google Scholar 

  116. Randic, M. and Wilkins, C.L., Graph theoretical ordering of structures as a basis for systematic searches for regularities in molecular data, J. Phys. Chem., 83 (1979) 1525–1540.

    CAS  Google Scholar 

  117. Randic, M., Graph valence shells as molecular descriptors, J. Chem. Inf. Comput. Sci., 41 (2001) 627–630.

    CAS  Google Scholar 

  118. Takahashi, Y., Sukekawa, M. and Sasaki, S., Automatic identification of molecular similarity using reduced-graph representation of chemical structure, J. Chem. Inf. Comput. Sci., 32 (1992) 639–643.

    Article  CAS  Google Scholar 

  119. Gillet, V.J., Downs, G.M., Holliday, J.D., Lynch, M.F. and Dethlefsen, W., Computer Storage and retrieval of generic chemical structures in patents. 13. Reduced graph generation, J. Chem. Inf. Comput. Sci., 31 (1991) 260–270.

    Article  CAS  Google Scholar 

  120. Garey, M.G. and Johnson, D.S., Computers and intractability, a guide to the theory of NP-completeness, in Klee V. (Ed.), A series of books in the Mathematical Sciences, W.H. Freeman and company, New York, 1978, pp. 202–205.

  121. Aires-de-Sousa, J., Gasteiger, J., Gutman, I. and Vidovic, D., Chirality codes and molecular structure, J. Chem. Inf. Comput. Sci., 44 (2004) 831–836.

    Article  CAS  Google Scholar 

  122. Petitjean, M., Chirality and symmetry measures: A transdisciplinary review, Entropy, 5 (2003) 271–312. Available at the following URL: http://www.mdpi.net/entropy

  123. Fan, B.T., Panaye, A., Yao, J.H., Yuan, S.G. and Doucet, J.P., Geometric symmetry and chemical equivalence, in Hansew, P., Fowler, P. and Zheng, M. (Eds.), Discrete Mathematical Chemistry (Proceedings of the DIMACS Workshop), Rutgers University, March 23–24, Discrete Mathematical Society, USA, 2000, pp. 129–139.

  124. Buda, A.B., Auf der Heyde, T. and Mislow, K., On quantifying chirality, Angew. Chem. Int. Ed. English, 31 (1992) 989–1007.

    Article  Google Scholar 

  125. Buda, A.B. and Mislow, K., A Hausdorff Chirality Measure, J. Am. Chem. Soc., 114 (1992) 6006–6012.

    Article  CAS  Google Scholar 

  126. Avnir, D., Katzenelson, O., Keinan, S., Pinsky, M., Pinto, Y., Salomon, Y. and Zabrodsky Hel-Or, H., The measurement of symmetry and chirality: Conceptual aspects, in Rouvray D.H. (Ed). Concepts in Chemistry. A Contemporary Challenge. Chap. 9, University of Georgia, Research Studies Press Ltd. Taunton, Wiley & Sons, New York, 1996, pp. 283–324.

  127. Avnir, D., Zabrodsky Hel-Or, H. and Mezey, P.G., Symmetry and chirality: Continuous measures, In Raqué Schleyer P.V. (Ed.), Encyclopedia of Computational Chemistry. Vol 4, Wiley & Sons, Chichester, 1998, pp. 2890–2901.

  128. Mezey, P.G., Generalized chirality and symmetry deficiency, J. Math. Chem., 23 (1998) 65–84.

    CAS  Google Scholar 

  129. Kuz'min, V.E., Stel'makh, I.B., Bekker, M.B. and Pozigun, D.V., Quantitative aspects of chirality. II. Analysis of dissymetry function behaviour with different changes in the structure of the model systems, J. Phys. Org. Chem., 5 (1992) 299–307.

    Google Scholar 

  130. Kuz'min, V.E., Stel'makh, I.B., Yudanova, I.V., Pozigun, D.V. and Bekker, M.B., Quantitative aspects of chirality. I. Method of dissymetry function, J. Phys. Org. Chem., 5 (1992) 295–298.

    Google Scholar 

  131. Dubois, J.E., Mercier, C. and Panaye, A., DARC topological system and computer aided design, Acta Pharm. Jugosl., 36 (1986) 135–169.

    CAS  Google Scholar 

  132. Dubois, J.E., Doucet, J.P., Panaye, A. and Fan, B.T., DARC site toplogical correlations: Ordered structural descriptors and property evaluation. In Devillers, J. and Balaban, T. (Eds). Topological indices and related descriptors in QSAR and QSPR, Gordon and Breach Sciences Publishers, Amsterdam, 1999, pp. 613–673.

    Google Scholar 

  133. Handbook of CIDS chemical search keys, Fein-Marquart Assoc. Inc. Towson, Baltimore, MD., 1973.

  134. Bremser, W., Horse – A novel substructure code, Anal. Chem. Acta., 103 (1978) 355–365.

    Article  CAS  Google Scholar 

  135. Hull, R.D., Singh, S.B., Nachbar, R.B., Sheridan, R.P., Kearsley, S.K. and Fluder, E.M., Latent Semantic Structure Indexing (LaSSI) for defining chemical similarity, J. Med. Chem., 44 (2001) 1177–1184.

    CAS  Google Scholar 

  136. Xiao, Y., Qiao, Y., Zhang, J., Lin, S. and Zhang, W., A method for substructure search by atom-centered multilayer code, J. Chem. Inf. Comput. Sci., 37 (1997) 701–704.

    Article  CAS  Google Scholar 

  137. Bender, A., Mussa, H.Y. and Glen, R.C., Molecular Similarity searching using atoms environments, information-based feature selection and a naïve Bayesian classifier, J. Chem. Inf. Comput. Sci. 44 (2004) 170–178.

    Google Scholar 

  138. Xing, L. and Glen, R.C., Novel methods for the prediction of Log P, pKa and Log D, J. Chem. Inf. Comput. Sci., 42 (2002) 796–805.

    Article  CAS  Google Scholar 

  139. Faulon, J.L., Visco, D.P. Jr. and Pophale, R.S., The signature molecular descriptor. 1. Using extended valence sequences in QSAR and QSPR studies, J. Chem. Inf. Comput. Sci., 43 (2003) 707–720.

    CAS  Google Scholar 

  140. Faulon, J.L., Churchwell, C.J. and Visco, D.P. Jr., The signature Molecular Descriptor. 2. Enumerating molecules from their extended valence sequences, J. Chem. Inf. Comput. Sci., 43 (2003) 721–734.

    CAS  Google Scholar 

  141. Mitchell, T.M. (Ed.) Machine Learning, McGraw-Hill, New York, 1997.

    Google Scholar 

  142. Robinson, D.D., Barlow, T.W. and Richards, W.G., The utilization of reduced dimensional representation of molecular structure for rapid molecular similarity calculations, J. Chem. Inf. Comput. Sci., 37 (1997) 943–950.

    CAS  Google Scholar 

  143. Carbó, R., Leyda, L. and Arnau, M, An electron density measure of the similarity between two compounds, Int. J. Quantum Chem., 17 (1980) 1185–1189.

    Google Scholar 

  144. Hogking, E.E. and Richards, W.G., Molecular similarity based on electrostatic potential and electric field, Int. J. Quantum Chem. Quantum Biol. Symp., 14 (1987) 105–117.

    Google Scholar 

  145. Cramer, R.D., Patterson, D.E. and Bunce, J.D., Comparative molecular field analysis (CoMFA). Effect of shape on binding of steroids to carrier proteins, J. Am. Chem. Soc., 110 (1988) 5959–5967.

  146. Good, A.C., So, S.-S. and Richards, W.G., Structure-activity relationships from molecular similarity matrices, J. Med. Chem., 36 (1993) 433–438.

  147. Pearson, K., Mathematical contributions to the theory of evolution III. Regression, heredity, and panmixia, Philos. Trans. Royal Soc., 187 (1896) 253–318.

  148. Pearson, K., On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling (1900), In Karl Pearson's Early Statistical Papers, Cambridge University Press, London, 1956, pp. 339–357.

  149. Klebe, G., Structural alignment of molecules, In Kubinyi, H. (Ed.), 3D-QSAR in Drug Design: Theory, Methods and Applications, ESCOM, Leiden, 1993, pp. 173–199.

  150. Lemmen, C. and Lengauer, T., Computational methods for the structural alignment of molecules, J. Comput.-Aided Mol. Des., 14 (2000) 215–232.

  151. Petitjean, M., From shape similarity to shape complementarity: Toward a docking theory, J. Math. Chem., 35 (2004) 147–158.

  152. Grant, J.A. and Pickup, B.T., A Gaussian description of molecular shape, J. Phys. Chem., 99 (1995) 3503–3510.

  153. Putta, S., Lemmen, C., Beroza, P. and Greene, J., A novel shape-feature based approach to virtual library screening, J. Chem. Inf. Comput. Sci., 42 (2002) 1230–1240.

  154. Hahn, M., Three-dimensional shape-based searching of conformationally flexible compounds, J. Chem. Inf. Comput. Sci., 37 (1997) 80–86.

  155. Putta, S., Eksterowicz, J., Lemmen, C. and Stanton, R., A novel subshape molecular descriptor, J. Chem. Inf. Comput. Sci., 43 (2003) 1623–1635.

  156. Semus, S.F., CoMFA: A field of dreams?, Network Science (1996). Available at the following URL: http://www.netsci.org/Science/Compchem/feature11.html

  157. Calder, J.A., CoMFA validation of the superposition of six classes of compounds which block GABA receptors non-competitively, J. Comput.-Aided Mol. Des., 7 (1993) 45–60.

  158. Horwitz, J.P., Comparative molecular field analysis of in vitro growth inhibition of L1210 and HCT-8 cells by some pyrazoloacridines, J. Med. Chem., 36 (1993) 3511–3516.

  159. Klebe, G. and Abraham, U., On the prediction of binding properties of drug molecules by comparative molecular field analysis, J. Med. Chem., 36 (1993) 70–80.

  160. Connolly, M.L, Molecular Surfaces: A Review, Network Science (1996). Available at the following URL: http://www.netsci.org/Science/Compchem/feature14.html

  161. Chau, P.L. and Dean, P.M., Molecular recognition: 3D surface structure comparison by gnomonic projection, J. Mol. Graph., 5 (1987) 97–100.

  162. Mount, J., Ruppert, J., Welch, W. and Jain, A.N., IcePick: A flexible surface-based system for molecular diversity, J. Med. Chem., 42 (1999) 60–66.

  163. Rusinko III, A., Sheridan, R.P., Nilakantan, P., Haraki, K.S., Bauman, N. and Venkataraghavan, R., Using CONCORD to construct a large database of three-dimensional coordinates from connection tables, J. Chem. Inf. Comput. Sci., 29 (1989) 251–267.

  164. Sadowski, J., Wagener, M. and Gasteiger, J., CORINA: Automatic generation of high-quality 3D-molecular models for application in QSAR, in Sanz, F., Giraldo, J. and Manaut F. (Eds.), QSAR and Molecular Modelling: Concepts, Computational Tools and Biological Applications. Prous Science Publishers, 1995, pp. 646–651.

  165. Von Neumann, J. (Ed.) Mathematical Foundations of Quantum Mechanics, Princeton University Press, New Jersey, 1955.

  166. Born, M. (Ed.) Atomic Physics, Blackie and Son Press, London, 1945.

  167. Dirac, P.A.M., The Principles of Quantum Mechanics, Clarendon Press, Oxford, 1983.

  168. Carbó-Dorca, R., Arnau, J. and Leyda, L., How similar is a molecule to another? An electron density measure of similarity between two molecular structures, Int. J. Quantum Chem., 17 (1980) 1185–1189.

  169. Carbó-Dorca, R., Martin, M. and Pons, V., Applications of quantum mechanical parameters in quantitative structure-activity relationships, Afinidad, 34 (1977) 348–353.

  170. Eyring, H., Walter, J. and Kimball, G.E. (Eds.) Quantum Chemistry, Wiley & Sons, New York, 1944.

  171. Carbó-Dorca, R., Calabuig, B., Vera, L. and Besalú, E., Molecular quantum similarity: Theoretical framework, ordering principles and visualization techniques, Adv. Quantum Chem., 25 (1994) 253–313.

  172. Carbó-Dorca, R., Robert, D., Amat, L.I., Gironés, X. and Besalú, E., Molecular quantum similarity in QSAR and drug design, Lecture Notes in Chemistry, 2000, 73, Springer-Verlag, Berlin.

  173. Carbó-Dorca, R., Quantum quantitative structure-activity relationships (QQSPR): A comprehensive discussion based on inward matrix products, employed as a tool to find approximate solutions of strictly positive linear systems and providing QSAR-quantum similarity measures connection, (Proceedings of the European Congress on Computational Methods in Applied Sciences and Engineering (ECCOMAS 2000)), Barcelona, Spain, September 11–14, CDROM, ISBN 84-89925-70-4, CIMNE, Barcelona, 2000.

  174. Janesko, B.G. and Yaron, D., Using molecular similarity to construct accurate semiempirical electronic structure theories, J. Chem. Phys., 121 (2004) 5635–5645.

  175. Xian, B., Li, T., Sun, G. and Cao, T., The combination of principal component analysis, genetic algorithm and tabu search in 3D molecular similarity, J. Mol. Struct. (Theochem), 674 (2004) 87–97.

  176. Davies, E.K. and Briant, C., Combinatorial chemistry library design using pharmacophore diversity, Network Science (1995). Available at the following URL: http://www.netsci.org/Science/Combichem/feature05.html

  177. The Unity software packages are available from Tripos Inc at URL: http://www.tripos.com/

  178. Godden, J.W., Furr, J.R., Xue, L., Stahura, F.L. and Bajorath, J., Molecular similarity analysis and virtual screening by mapping of consensus positions in binary-transformed chemical descriptor spaces with variable dimensionality, J. Chem. Inf. Comput. Sci., 44 (2004) 21–29.

  179. MACCS keys, BCI fingerprints and MDL keys information available at: http://www.mesaac.com/Fingerprint.htm, http://www.bci.gb.com/products/fingerprints.htm and http://www.daylight.com/dayhtml/doc/theory/theory.finger.html, respectively.

  180. Arnold, J.R., Burdick, K.W., Pegg, S.C., Toba, S., Lamb, M.L. and Kuntz, I.D., SitePrint: Three-dimensional pharmacophore descriptors derived from protein binding sites for family based active site analysis, classification and drug design. J. Chem. Inf. Comput. Sci., 44 (2004) 2190–2198.

  181. Horvath, D. and Mao, B., Neighborhood behavior. Fuzzy molecular descriptors and their influence on the relationships between structural similarity and property similarity, QSAR Comb. Sci., 22 (2003) 498–509.

  182. Jenkins, J.L., Glick, M. and Davies, J.W., A 3D similarity method for scaffold hopping from known drugs or natural ligands to new chemotypes, J. Med. Chem., 47 (2004) 6144–6159.

  183. Renner, S. and Schneider, G., Fuzzy pharmacophore models from molecular alignments for correlation-vector-based virtual screening, J. Med. Chem., 47 (2004) 4653–4664.

  184. Rhodes, N. and Willett, P., CLIP: Similarity searching of 3D databases using clique detection, J. Chem. Inf. Comput. Sci., 43 (2003) 443–448.

  185. Todeschini, R. and Consonni, V., Dragon, release 1.12 for Windows, Milano, Italy, 2001. For more information see the URL: http://www.disat.unimib.it/chm/Dragon.htm

  186. Selwood, D.L., Livingstone, D.J., Comley, J.C.W., O'Dowd, A.B., Hudson, A.T., Jackson, P., Jandu, K.S., Rose, V.S. and Stables J.N., Structure-activity relationships of antifilarial antimycin analogues, a multivariate pattern recognition study, J. Med. Chem., 33 (1990) 136–142.

  187. Zheng, W. and Tropsha, A., Novel variable selection quantitative structure-property relationship approach based on the k-nearest neighbour principle, J. Chem. Inf. Comput. Sci., 40 (2000) 185–194.

  188. Sutter, J.M., Dixon, S.L. and Jurs, P.C., Automated descriptor selection for quantitative structure-activity relationships using generalised simulated annealing, J. Chem. Inf. Comput. Sci., 35 (1995) 77–84.

  189. Kubinyi, H., Variable selection in QSAR studies. I. An evolutionary algorithm, QSAR, 13 (1994) 285–294.

  190. Luke, B.T., Evolutionary programming applied to the development of quantitative structure-activity relationships and quantitative structure-property relationships, J. Chem. Inf. Comput. Sci., 34 (1994) 1279–1287.

  191. Waller, C.L. and Bradley, M.P., Development and validation of a novel variable selection technique with application to multidimensional quantitative structure-activity relationship studies, J. Chem. Inf. Comput. Sci., 39 (1999) 345–355.

  192. Hasegawa, K. and Funatsu, K., Genetic algorithm strategy for variable selection in QSAR studies. GAPLS and D-optimal design for predictive QSAR studies, J. Mol. Struct. (Theochem), 425 (1998) 255–262.

  193. Jouan-Rimbaud, D., Massart, D.L. and De Noord, O.E., Random correlations in variable selection for multivariate calibration with a genetic algorithm, Chemom. Intell. Lab. Syst., 35 (1996) 213–220.

  194. Rogers, D.R. and Hopfinger, A.J., Application of genetic function approximation to quantitative structure-activity relationships and quantitative structure-property relationships, J. Chem. Inf. Comput. Sci., 34 (1994) 854–866.

  195. Nath, R., Rajagopalan, B. and Ryker, R., Determining the saliency of input variables in neural networks classifiers, Comput. Ops. Res., 24 (1997) 767–773.

  196. Kovalishyn, V.V., Tetko, I.V., Luik, A.I., Kholodovych, V.V., Villa, A.E.P. and Livingstone, D.J., Neural network studies. 3. Variable selection in the cascade-correlation learning architecture, J. Chem. Inf. Comput. Sci., 38 (1998) 651–659.

  197. Todeschini, R., Galvagni, D., Vilchez, J.L., Del Olmo, M. and Navas, N., Kohonen artificial neural networks as a tool for wavelength selection in multicomponent spectrofluorimetric PLS modeling: application to phenol, o-cresol, m-cresol and p-cresol mixtures, Trends Anal. Chem., 18 (1999) 93–98.

  198. Burden, F.D., Ford, M.G., Whitley, D.C. and Winkler, D.A., Use of automatic relevance determination in QSAR studies using Bayesian neural networks, J. Chem. Inf. Comput. Sci., 40 (2000) 1423–1430.

  199. Agrafiotis, D.K. and Cedeno, W., Feature selection for structure-activity correlation using binary particle swarms, J. Med. Chem., 45 (2002) 1098–1107.

  200. Izrailev, S. and Agrafiotis, D.K., A novel method for building regression tree models for QSAR based on artificial ant colony systems, J. Chem. Inf. Comput. Sci., 41 (2001) 176–180.

  201. Izrailev, S. and Agrafiotis, D.K., Variable selection for QSAR by artificial ant colony systems, SAR QSAR Environ. Res., 13 (2002) 417–423.

  202. Tetko, I.V., Villa, A.E. and Livingstone, D.J., Neural network studies. 2. Variable selection, J. Chem. Inf. Comput. Sci., 36 (1996) 794–803.

  203. Böcker, A., Schneider, G. and Teckentrup, A., Status of HTS data mining approaches, QSAR Comb. Sci., 23 (2004) 207–213.

  204. Bayada, D.M., Hamersma, H. and Van Geerestein, V.J., Molecular diversity and representativity in chemical databases, J. Chem. Inf. Comput. Sci., 39 (1999) 1–10.

  205. Piclin, N., Screening virtuel de grandes bases de données: validation de méthodes et application en chimie pharmaceutique et en toxicité, Thesis book, Université d'Orleans, 2002.

  206. Haykin, S., Neural Networks: A Comprehensive Foundation, Prentice-Hall, 1999.

  207. Czerminski, R., Yasri, A. and Hartsough, D., Use of support vector machine in pattern classification: Application to QSAR studies, Quant. Struct.-Act. Relat., 20 (2001) 345–351.

  208. Li, Q., Yao, X., Chen, X., Liu, M., Zhang, R., Zhan, X. and Hu, Z., Application of artificial neural networks for the simultaneous determination of a mixture of fluorescent dyes by synchronous fluorescence, The Analyst, 125 (2000) 2049–2053.

  209. Agrafiotis, D.K., Cedeño, W. and Lobanov, V.S., On the use of neural networks in QSAR and QSPR, J. Chem. Inf. Comput. Sci., 42 (2002) 903–911.

  210. Murcia-Soler, M., Pérez-Giménez, F., Garcia-March, F.J., Salabert-Salvador, M.T., Dias-Villanueva, W. and Castro-Bleda, M.J., Drugs and nondrugs: An effective discrimination with topological methods and artificial neural networks, J. Chem. Inf. Comput. Sci., 43 (2003) 1688–1702.

  211. Ma, Q.L., Yan, A.X., Hu, Z.D., Li, Z.X. and Fan, B.T., Principal component analysis and artificial neural networks applied to the classification of Chinese pottery of neolithic age, Anal. Chim. Acta, 406 (2000) 247–256.

  212. Gasteiger, J., Teckentrup, A., Terfloth, L. and Spycher, S., Neural networks as data mining tools in drug design, J. Phys. Org. Chem., 16 (2003) 232–245.

  213. Terfloth, L. and Gasteiger, J., Self-organizing neural networks in drug design, Screening – Trends in Drug Discovery, 2 (2001) 49–51.

  214. Zupan, J. and Gasteiger, J., Neural Networks in Chemistry and Drug Design, Second Edition. Wiley-VCH Publishers, Weinheim, 1999.

  215. Zupan, J. and Gasteiger, J. (Eds.) Neural Networks for Chemists: An Introduction, VCH-Verlag, Weinheim, 1993.

  216. Dreiseitl, S. and Ohno-Machado, L., Logistic regression and artificial neural network classification models: A methodology review, J. Biomedical Inform., 35 (2002) 352–359.

  217. Manallack, D.T. and Livingstone, D.J., Neural networks in drug discovery: Have they lived up to their promise?, Eur. J. Med. Chem., 34 (1999) 195–208.

  218. Niculescu, S.P., Artificial neural network and genetic algorithm in QSAR, J. Mol. Struct. (Theochem), 622 (2003) 71–83.

  219. Smits, J.R.M., Melssen, W.J., Buydens, L.M.C. and Kateman, G., Using artificial neural networks for solving chemical problems. Part I. Multi-layer feed-forward networks, Chemom. Intell. Lab. Syst., 23 (1994) 165–189.

  220. Melssen, W.J., Smits, J.R.M., Buydens, L.M.C. and Kateman, G., Using artificial neural networks for solving chemical problems. Part II. Kohonen self-organising feature maps and Hopfield networks, Chemom. Intell. Lab. Syst., 23 (1994) 267–291.

  221. Richards, J. and Jia, X., Remote Sensing Digital Image Analysis – An introduction, Springer, Third Ed., New York, 2000.

  222. Ho, P., Silva, M.C. and Hogg, T.A., Multiple imputation and maximum likelihood principal component analysis of incomplete multivariate data from a study of the ageing of port, Chemom. Intell. Lab. Syst., 55 (2001) 1–11.

  223. Andrews, T.D. and Wentzell, P., Applications of maximum likelihood principal component analysis: Incomplete data sets and calibration transfer, Analytica Chimica Acta, 350 (1997) 341–352.

  224. Pereira, J.L., Pais, A.C. and Redinha, J.S., Maximum likelihood estimation with nonlinear regression in polarographic and potentiometric studies, Analytica Chimica Acta, 433 (2001) 135–143.

  225. Verdonck, F., Jaworska, J., Thas, O. and Vanrolleghem, P.A., Determining environmental standards using bootstrapping, Bayesian and maximum likelihood techniques: A comparative study, Analytica Chimica Acta, 446 (2001) 427–436.

  226. Kuttatharmmakul, S., Smeyers-Verbeke, J. and Noack, D.L., The mean and standard deviation of data, some of which are below the detection limit: An introduction to maximum likelihood estimation, TrAC, Trends in Analytical Chemistry, 19 (2000) 215–222.

  227. Wentzell, P. and Lohnes, M.T., Maximum likelihood principal component analysis with correlated measurement errors: Theoretical and practical considerations, Chemometrics and Intelligent Laboratory Systems, 45 (1999).

  228. Cortes, C. and Vapnik, V., Support-vector networks, Machine Learning, 20 (1995) 273–297.

  229. Vapnik, V., The Nature of Statistical Learning Theory, Springer, Berlin, 1995.

  230. Burbidge, R., Trotter, M., Buxton, B. and Holden, S., Drug design by machine learning: Support vector machines for pharmaceutical data analysis, Comput. Chem., 26 (2001) 5–14.

  231. Warmuth, M.K., Liao, J., Ratsch, G., Mathieson, M., Putta, S. and Lemmen, C., Active learning with support vector machines in the drug discovery process, J. Chem. Inf. Comput. Sci., 43 (2003) 667–673.

  232. Wilton, D., Willett, P., Lawson, K. and Mullier, G., Comparison of ranking methods for virtual screening in lead discovery programs, J. Chem. Inf. Comput. Sci., 43 (2003) 469–474.

  233. Zernov, V.V., Balakin, K.V., Ivanschzenko, A.A., Savchuk, N.P. and Pletnev, I.V., Drug discovery using support vector machines. The case studies of drug-likeness, agrochemical-likeness and enzyme inhibition predictions, J. Chem. Inf. Comput. Sci., 43 (2003) 2048–2056.

  234. Norinder, U., Support vector machine models in drug design: Applications to drug transport processes and QSAR using simple optimization and variable selection, Neurocomputing, 55 (2003) 337–346.

  235. Byvatov, E., Fechner, U., Sadowski, J. and Schneider, G., Comparison of support vector machine and artificial neural networks systems for drug/nondrug classification, J. Chem. Inf. Comput. Sci., 43 (2003) 1882–1889.

  236. Teckentrup, A., Briem, H. and Gasteiger, J., Mining high-throughput screening data of combinatorial libraries: Development of a filter to distinguish hits from non hits, J. Chem. Inf. Comput. Sci., 44 (2004) 626–634.

  237. Liu, H.X., Zhang, R.S., Yao, X.J., Liu, M.C., Hu, Z.D. and Fan, B.T., QSAR and classification models of a novel series of COX-2 selective inhibitors: 1,5-Diarylimidazoles based on support vector machines, J. Comput.-Aided Mol. Des., 18 (2004) 389–399.

  238. Liu, H.X., Zhang, R.S., Luan, F., Yao, X.J., Liu, M.C., Hu, Z.D. and Fan, B.T., Diagnosing breast cancer based on support vector machines, J. Chem. Inf. Comput. Sci., 43 (2003) 900–907.

  239. Liu, H.X., Zhang, R.S., Yao, X.J., Liu, M.C., Hu, Z.D. and Fan, B.T., QSAR study of ethyl 2-[3-methyl-2,5-dioxo(3-pyrrolinyl)amino]-4-(trifluoromethyl)pyrimidine-5-carboxylate: An inhibitor of AP-1 and NF-kB mediated gene expression based on support vector machines, J. Chem. Inf. Comput. Sci., 43 (2003) 1288–1296.

  240. Yao, X., Zhang, R.S., Chen, H., Doucet, J.P., Panaye, A., Fan, B.T., Liu, M. and Hu, Z., Comparative classification study of toxicity mechanisms using support vector machines and radial basis function neural networks, Anal. Chim. Acta, in press.

  241. Xue, C.X., Zhang, R.S., Liu, H.X., Yao, X.J., Liu, M.C., Hu, Z.D. and Fan, B.T., QSAR models for the prediction of binding affinities to human serum albumin using the heuristic method and a support vector machine, J. Chem. Inf. Comput. Sci., 44 (2004) 1693–1700.

  242. Zhao, C.Y., Zhang, R.S., Liu, H.X., Xue, C.X., Zhao, S.G., Zhou, X.F., Liu, M.C. and Fan, B.T., Diagnosing anorexia based on partial least squares, back propagation neural network, and support vector machines, J. Chem. Inf. Comput. Sci., 44 (2004) 2040–2046.

  243. Fix, E. and Hodges, J., Discriminatory analysis. Nonparametric discrimination: Consistency properties, Technical Report 4, USAF School of Aviation Medicine, Texas, 1951.

  244. Beckonert, O., Bollard, M.E., Ebbels, T., Keun, H., Antti, H., Holmes, E., Lindon, J.C. and Nicholson, J.K., NMR-based metabonomic toxicity classification: Hierarchical cluster analysis and k-nearest-neighbour approaches, Analytica Chimica Acta, 490 (2003) 3–15.

  245. O'Farrell, M., Lewis, E., Flanagan, C., Lyons, W. and Jackman, N., Comparison of k-NN and neural network methods in the classification of spectral data from an optical fibre-based sensor system used for quality control in the food industry, Sensors and Actuators B: Chemical, In Press, Corrected Proof, 2005.

  246. Amendolia, S.R., Cossu, G., Ganadu, M.L., Golosio, B., Masala, G.L. and Mura, G.M., A comparative study of k-nearest neighbour, support vector machine and multi-layer perceptron for thalassemia screening, Chemometrics and Intelligent Laboratory Systems, 69 (2003) 13–20.

  247. Jarvis, R.A. and Patrick, E.A., The Jarvis-Patrick algorithm – clustering using a similarity measure based on nearest neighbors, IEEE Trans. Comput. 22 (1973) 1025–1034.

  248. Ward, J.H., Hierarchical grouping to optimize an objective function, J. Am. Stat. Assoc., 58 (1963) 236–244.

  249. Barnard, J.M. and Downs, G.M., Clustering of chemical structures on the basis of two-dimensional similarity measures, J. Chem. Inf. Comput. Sci., 32 (1992) 644–649.

  250. Matter, H., Selecting optimally diverse compounds from structure databases: A validation study of two-dimensional and three-dimensional molecular descriptors, J. Med. Chem., 40 (1997) 1219–1229.

  251. Godden, J.W. and Bajorath, J., Cell-based partitioning, In Bajorath, J. (Ed.), Methods in Molecular Biology, vol. 275. Chemoinformatics. Concepts, Methods and Tools for Drug Discovery. Humana Press Inc., Totowa, NJ. 2004, pp. 291–300.

  252. Kohonen, T., Self-Organization and Associative Memory, Springer-Verlag, 1989.

  253. Bishop, C.M. and Tipping, M.E., Latent variable models and data visualization, In Kay, J.W. and Titterington, D.M. (Eds.), Statistics and Neural Networks, Oxford University Press, London, 1999, pp. 141–164.

  254. Schölkopf, B., Smola, A.J. and Müller, K.R., Kernel principal component analysis, Available at the following URL: http://mlg.anu.edu.au/~smola/papers/SchSmoMul99.pdf

  255. Schölkopf, B., Burges, C.J.C. and Smola, A.J. (Eds.) Advances in Kernel Methods, MIT Press, Cambridge, 1999, pp. 327–352.

  256. Taboureau, O., BioInformatique et drug design: contribution à l'exploitation de grandes bases de données chimiques, Thesis book, Université d'Orleans, 2001.

  257. Spycher, S., Nendza, M. and Gasteiger, J., Comparison of different classification methods applied to a mode of toxic action data set, QSAR Comb. Sci., 23 (2004) 779–791.

  258. Feher, M. and Schmidt, J.M., Property Distributions: differences between drugs, natural products and molecules from combinatorial chemistry, J. Chem. Inf. Comput. Sci., 43 (2003) 218–227.

  259. Lanctot, J.K., Putta, S., Lemmen, C. and Greene, J., Using ensembles to classify compounds for drug discovery, J. Chem. Inf. Comput. Sci., 43 (2003) 2163–2169.

  260. Clark, R.D., OptiSim: An extended dissimilarity selection method for finding diverse representative subsets, J. Chem. Inf. Comput. Sci., 37 (1997) 1181–1188.

  261. Lobanov, V.S. and Agrafiotis, D.K., Stochastic similarity selections from large combinatorial libraries, J. Chem. Inf. Comput. Sci., 40 (2000) 460–470.

  262. Lin, S.-K., Molecular diversity assessment: Logarithmic relations of information and species diversity and logarithmic relations of entropy and indistinguishability after rejection of Gibbs paradox of entropy of mixing, Molecules, 1 (1996) 57–67.

  263. Agrafiotis, D.K., On the use of information theory for assessing molecular diversity, J. Chem. Inf. Comput. Sci., 37 (1997) 576–580.

  264. Dudoit, S., Fridlyand, J. and Speed, T.P., Comparison of discrimination methods for the classification of tumors using gene expression data, JASA, 97 (2002) No. 457.

  265. Lajiness, M.S., Dissimilarity-based compound selection techniques, Perspect. Drug Disc. Design, 7/8 (1997) 65–84.

  266. Mason, J.S. and Pickett, S.D., Partition-based selection, Perspect. Drug Disc. Design, 7/8 (1997) 85–114.

  267. Godden, J.W., Xue, L., Kitchen, D.B., Stahura, F.L., Schermerhorn, E.J. and Bajorath, J., Median partitioning: A novel method for the selection of representative subsets from large compound pools, J. Chem. Inf. Comput. Sci., 42 (2002) 885–893.

  268. Godden, J.W., Xue, L. and Bajorath, J., Classification of biologically active compounds by median partitioning, J. Chem. Inf. Comput. Sci., 42 (2002) 1263–1269.

  269. Shanmugasundaram, V., Maggiora, G.M. and Lajiness, M.S., Hit-directed nearest-neighbor searching, J. Med. Chem., 48 (2005) 240–248.

  270. Cramer, R.D., Jilek, R.J., Guessregen, S., Clark, S.J., Wendt, B. and Clark, R.D., “Lead hopping”. Validation of topomer similarity as a superior predictor of similar biological activities, J. Med. Chem., 47 (2004) 6777–6791.

  271. Bernard, P., Modélisation et diversité moléculaires des inhibiteurs de l'acétylcholinestérase, Thesis book, Université d'Orleans, 1998.

  272. Brown, R.D. and Martin, Y.C., The information content of 2D and 3D structural descriptors relevant to ligand-receptor binding, J. Chem. Inf. Comput. Sci., 37 (1997) 1–9.

  273. Trepalin, S.V., Gerasimenko, V.A., Kozyukov, A.V., Savchuk, N.P. and Ivanschenko, A.A., New diversity calculations algorithms used for compound selection, J. Chem. Inf. Comput. Sci., 42 (2002) 249–258.

  274. Boon, G., Langenaeker, W., De Proft, F., De Winter, H., Tollenaere, J.P. and Geerlings, P., Systematic study of the quality of various quantum similarity descriptors. Use of the autocorrelation function and principal component analysis, J. Phys. Chem. A., 105 (2001) 8805–8814.

  275. White, M. and Willett, P., Evaluation of similarity measures for searching the dictionary of natural products database, J. Chem. Inf. Comput. Sci., 43 (2003) 449–457.

  276. Dixon, S.L. and Koehler, R.T., The hidden component of size in two-dimensional fragment descriptors: Side effects on sampling in bioactive libraries, J. Med. Chem., 42 (1999) 2887–2900.

  277. Fligner, M.A., Verducci, J.S. and Blower, P.E., A modification of the Jaccard-Tanimoto similarity index for diverse selection of chemical compounds using binary strings, Technometrics, 44 (2002) 110–119.

  278. Salim, N., Analysis and comparison of molecular similarity measures, Thesis book, University of Sheffield, 2002.

  279. Chen, X. and Reynolds, C.H., Performance of similarity measures in 2D fragment-based similarity searching: Comparison of structural descriptors and similarity coefficients, J. Chem. Inf. Comput. Sci., 42 (2002) 1407–1414.

  280. Xue, L., Godden, J.W., Stahura, F.L. and Bajorath, J., Design and evaluation of a molecular fingerprint involving the transformation of property descriptor values into a binary classification scheme, J. Chem. Inf. Comput. Sci., 43 (2003) 1151–1157.

  281. Whitley, D.C., Ford, M.G. and Livingstone, D.J., Unsupervised forward selection: A method for eliminating redundant variables, J. Chem. Inf. Comput. Sci., 40 (2000) 1160–1168.

  282. Rarey, M. and Dixon, J.S., Feature trees: A new molecular similarity measure based on tree matching, J. Comput.-Aided Molec. Design, 12 (1998) 471–490.

  283. Dixon, S.L. and Villar, H.O., Bioactive diversity and screening library selection via affinity fingerprinting, J. Chem. Inf. Comput. Sci., 38 (1998) 1192–1203.

  284. Randic, M. and Basak, S., A new descriptor for structure-property and structure-activity correlations, J. Chem. Inf. Comput. Sci., 41 (2001) 650–656.

  285. Ehresmann, B., de Groot, M.J., Alex, A. and Clark, T., New molecular descriptors based on local properties at the molecular surface and a boiling-point model derived from them, J. Chem. Inf. Comput. Sci., 44 (2004) 658–668.

  286. Whittle, M., Gillet, V. and Willett, P., Enhancing the effectiveness of virtual screening by fusing nearest neighbor lists: A comparison of similarity coefficients, J. Chem. Inf. Comput. Sci., 44 (2004) 1840–1848.

  287. Godden, J.W., Stahura, F.L. and Bajorath, J., Variability of molecular descriptors in compound databases revealed by Shannon entropy calculations, J. Chem. Inf. Comput. Sci., 40 (2000) 796–800.

  288. Xue, L., Godden, J.W., Stahura, F.L. and Bajorath, J., Similarity searching profiles as a diagnostic tool for the analysis of virtual screening calculations, J. Chem. Inf. Comput. Sci., 44 (2004).

  289. Sun, H., A universal molecular descriptor system for prediction of logP, logS, logBB and absorption, J. Chem. Inf. Comput. Sci., 44 (2004) 748–757.

  290. Feng, J., Lurati, L. and Ouyang, H., Predictive toxicology: Benchmarking molecular descriptors and statistical methods, J. Chem. Inf. Comput. Sci., 43 (2003) 1463–1470.

  291. Schuffenhauer, A., Gillet, V.J. and Willett, P., Similarity searching in files of three-dimensional chemical structures: Analysis of the BIOSTER database using two-dimensional fingerprints and molecular field descriptors, J. Chem. Inf. Comput. Sci., 40 (2000) 295–307.

  292. Hicks, M.G. and Jochum, C., Substructure search systems. 1. Performance comparison of the MACCS, DARC, HTSS, CAS Registry MVSSS and S4 substructure search systems, J. Chem. Inf. Comput. Sci., 30 (1990) 191–199.

  293. Good, A.C. and Richards, W.G., Explicit calculation of 3D molecular similarity, Perspect. Drug Disc. Design, 9/10/11 (1998) 321–338.

  294. Flower, D.R., DISSIM: A program for the analysis of chemical diversity, J. Molec. Graph. Mod., 16 (1998) 239–253.


  295. Brown, R.D. and Martin, Y.C., Use of structure-activity data to compare structure-based clustering methods and descriptors for use in compound selection, J. Chem. Inf. Comput. Sci., 36 (1996) 572–584.

  296. Moos, W.H., Combinatorial chemistry: A “Molecular diversity space” Odyssey approaches 2001, Pharmaceutical News, 3 (1996) 23–26.


  297. Information available at the following URL: http://www.5z.com/divinfo/reviews.html

  298. Blaney, J.M. and Martin, E.J., Computational approaches for combinatorial library design and molecular diversity analysis, Curr. Opin. Chem. Biol., 1 (1997) 54–59.


  299. Pavia, M.R., Sawyer, T.K. and Moos, W.H., The generation of molecular diversity, BioMed. Chem. Lett., 3 (1993) 387–396.


  300. Borman, S., The many faces of combinatorial chemistry, Chem. Engin. News, 81 (2003) 45–56.

  301. Blaney, J.M. and Martin, E.J., Computational approaches for combinatorial library design and molecular diversity analysis, Curr. Op. Chem. Bio., 1 (1997) 54–59.


  302. Willett, P., Using computational tools to analyze molecular diversity, in DeWitt, H. and Czarnik, A.W. (Eds.), Combinatorial Chemistry; A Short Course, American Chemical Society Books, Washington DC, 1997.

  303. Martin, Y.C., Brown, R.D. and Bures, M.G., Quantifying diversity, in Kerwin, J.F. and Gordon, E.M. (Eds.), Combinatorial Chemistry and Molecular Diversity, Wiley & Sons, New York, 1998.


  304. Weber, L., High-diversity combinatorial libraries, Curr. Op. Chem. Bio., 4 (2000) 295–302.


  305. Information available at the following URL: http://www.5z.com/divinfo/links/

  306. Information available at the following URL: http://www.combichemlab.com/

  307. Gute, B.D. and Basak, S.C., Molecular similarity-based estimation of properties: A comparison of three structure spaces, J. Molec. Graph. Mod., 20 (2001) 95–109.

  308. Taraviras, S., Evaluation de la diversité moléculaire des bases de données de molécules à l'intérêt pharmaceutique, en utilisant la théorie des graphes chimiques, Doctoral thesis, Université de Nice-Sophia Antipolis, 2000.

  309. Martin, E. and Wong, A., Sensitivity analysis and other improvements to tailored combinatorial library design, J. Chem. Inf. Comput. Sci., 40 (2000) 215–220.


  310. MDL Information Systems, Inc., 14600 Catalina Street, San Leandro, CA 94577, USA. For more information see the URL: http://www.mdli.com

  311. Daylight Chemical Information Systems, Inc., 441 Greg Avenue, Santa Fe, NM 87501, USA. For more information see the URL: http://www.daylight.com

  312. CambridgeSoft Corporation, 100 Cambridge Park Drive, Cambridge, MA 02140, USA. For more information see the URL: http://www.camsoft.com

  313. Oxford Molecular Ltd., Medawar Centre, Oxford Science Park, Sandford-on-Thames, Oxford, OX4 4GA, UK. For more information see the URL: http://www.oxmol.co.uk/

  314. Synopsys Scientific Systems Ltd. 175 Woodhouse Lane, Leeds, LS2 3AR, UK. For more information see the URL: http://www.synopsys.co.uk/

  315. Agrafiotis, D.K., Lobanov, V.S. and Salemme, F.R., Combinatorial informatics in the post-genomics era, Nature Reviews Drug Discovery, 1 (2002) 337–346.


  316. Schuffenhauer, A., Popov, M., Schopfer, U., Acklin, P., Stanek, J. and Jacoby, E., Molecular diversity management strategies for building and enhancement of diverse and focused lead discovery compound screening collections, Comb. Chem. & HTS, 7 (2004) 771–781.

  317. Miller, J.L., Bradley, E.K. and Teig, S.L., Luddite: An information-theoretic library design tool, J. Chem. Inf. Comput. Sci., 43 (2003) 47–54.


  318. Young, S.S., Wang, M. and Gu, F., Design of diverse and focused combinatorial libraries using an alternative algorithm, J. Chem. Inf. Comput. Sci., 43 (2003) 1916–1921.


  319. Darvas, F., Dorman, G. and Papp, A., Diversity measures for enhancing ADME admissibility of combinatorial libraries, J. Chem. Inf. Comput. Sci., 40 (2000) 314–322.

  320. Talaga, P., Compound decomposition: A new drug discovery tool?, Drug Discovery Today, 9 (2004) 51–53.


  321. Fenniri, H., Recent advances at the interface of medicinal chemistry and combinatorial chemistry. Views on methodologies for the generation and evaluation of diversity and application to molecular recognition and catalysis, Curr. Med. Chem., 3 (1996) 343–378.


  322. Edgar, S.J., Holliday, J.D. and Willett, P., Effectiveness of retrieval in similarity searches of chemical databases: A review of performance measures, J. Molec. Graph. Mod., 18 (2000) 343–357.


  323. Stahura, F.L., Xue, L., Godden, J.W. and Bajorath, J., Methods for compound selection focused on hits and application in drug discovery, J. Molec. Graph. Model., 20 (2002) 439–446.


  324. Bultinck, P., De Winter, H., Langenaeker, W. and Tollenaere, J.P. (Eds.), Computational Medicinal Chemistry for Drug Design, Marcel Dekker Inc., New York, 2003.

  325. VanDrie, J.H., 3D Database searching in drug discovery, Network Science. 1996. Available at the following URL: http://www.netsci.org/Science/Cheminform/feature06.html

  326. Böhm, H.J. and Stahl, M., Structure-based library design: Molecular modelling merges with combinatorial chemistry, Curr. Op. Chem. Bio., 4 (2000) 283–286.


  327. Gorse, D. and Lahana, R., Functional diversity of compound libraries, Curr. Op. Chem. Bio., 4 (2000) 287–294.

  328. Ghosh, A., Computational bioinorganic chemistry. Part III. The tools of the trade: From high-level ab initio calculations to structural bioinformatics, Curr. Op. Chem. Bio., 7 (2003) 110–112 (Parts I and II have been published in the same journal).

  329. Kingston, D.G., Natural products as pharmaceuticals and sources for lead structures, in Wermuth, C.G. (Ed.), The Practice of Medicinal Chemistry, Academic Press, London, 1996.

  330. Warr, W.A., Combinatorial chemistry and molecular diversity. An overview, J. Chem. Inf. Comput. Sci., 37 (1997) 134–140.


  331. Reitz, M., Sacher, O., Tarkhov, A., Trümbach, D. and Gasteiger, J., Enabling the exploration of biochemical pathways, Org. Biomol. Chem., 2 (2004) 3226–3237.


  332. Stahura, F.L. and Bajorath, J., Virtual screening methods that complement HTS, Comb. Chem. & HTS, 7 (2004) 259–269.

  333. Fliri, A.F., Loging, W.T., Thadeio, P.F. and Volkmann, R.A., Biological spectra analysis: Linking biological activity profiles to molecular structure, PNAS, 102 (2005) 261–266.


  334. Moret, M.A., Miranda, J., Nogueira Jr., E., Santana, M.C. and Zebende, G.F., Self-similarity and protein chains, Physical Review E, 71 (2005) 012901.


  335. Ostberg, N. and Kaznessis, Y., Protegrin structure-activity relationships: Using homology models of synthetic sequences to determine structural characteristics important for activity, Peptides, 26 (2005) 197–206.


  336. Deng, Z., Chuaqui, C. and Singh, J., Structural interaction fingerprint (SIFt): A novel method for analyzing three-dimensional protein-ligand binding interactions, J. Med. Chem., 47 (2004) 337–344.


  337. More information available at the following URL: http://www.ucl.ac.uk/oncology/MicroCore/HTML_resource/tut_frameset.htm

  338. Willett, P., Computational tools for the analysis of molecular diversity, Perspectiv. Drug Disc. Design, 7/8 (1997) 1–11.

  339. Xue, L., Stahura, F.L. and Bajorath, J., Cell-based partitioning, in Bajorath, J. (Ed.), Methods in Molecular Biology, vol. 275, Chemoinformatics: Concepts, Methods and Tools for Drug Discovery, Humana Press Inc., Totowa, NJ, 2004, pp. 279–289.

  340. Oprea, T. and Matter, H., Integrating virtual screening in lead discovery, Current Opinion in Chemical Biology, 8 (2004) 349–358.


  341. Hann, M.M. and Oprea, T., Pursuing the leadlikeness concept in pharmaceutical research, Current Opinion in Chemical Biology, 8 (2004) 255–263.

  342. Oprea, T., Next-generation therapeutics, Current Opinion in Chemical Biology, 8 (2004) 347–348.

  343. Zamora, I., Oprea, T., Cruciani, G., Pastor, M. and Ungell, A.L., Surface descriptors for protein ligand affinity prediction, J. Med. Chem., 46 (2003) 25–33.


Author information


Corresponding author

Correspondence to Bo-Tao Fan.


Maldonado, A.G., Doucet, J.P., Petitjean, M. et al. Molecular similarity and diversity in chemoinformatics: From theory to applications. Mol Divers 10, 39–79 (2006). https://doi.org/10.1007/s11030-006-8697-1


  • DOI: https://doi.org/10.1007/s11030-006-8697-1
