A review and examination of the mathematical spaces underlying molecular similarity analysis

  • Review
Journal of Mathematical Chemistry

Abstract

As an intuitive concept, molecular similarity has played a fundamental role in chemistry. It is implicit in Hammond's postulate, in the principle of minimum structure change, and in the assumption that similar structures tend to have similar properties. With the advent of large computers, computable definitions of similarity are being used in the pharmaceutical industry for similarity searching, dissimilarity selection, molecular superpositioning, structure generation, and quantitative structure-activity analysis. The diversity of applications of computable definitions of molecular similarity has often obscured important mathematical commonalities underlying these definitions. The broadest commonalities are relationships based on equivalence, matching, partial ordering, and proximity. A mathematical space suitable for molecular similarity analysis consists of a set of mathematical structures and one or more of these similarity relationships defined on that set. This report surveys the mathematical spaces used in molecular similarity analysis. The survey covers the types of chemical information, similarity relationships, and applications associated with the use of each mathematical space in a molecular similarity context.
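
As a rough illustration of the four kinds of relationship named in the abstract, the following minimal Python sketch defines one toy mathematical space: a set of structures, each described by a set of fragment labels, with equivalence, matching, partial ordering, and proximity defined on it. Everything here is an assumption made for illustration and is not taken from the article: the molecule names, the fragment labels, the choice of set equality for equivalence, set intersection for matching, the subset relation for the partial order, and the Jaccard (Tanimoto) coefficient for proximity.

```python
from itertools import combinations

# Hypothetical toy structures: each is a set of fragment labels.
# Names and labels are illustrative assumptions, not data from the article.
MOLECULES = {
    "A": frozenset({"C-C", "C-O", "O-H"}),
    "B": frozenset({"C-C", "C-O", "O-H"}),
    "C": frozenset({"C-C", "C-O"}),
    "D": frozenset({"C-C", "C-N", "N-H"}),
}


def equivalent(x, y):
    """Equivalence: the two fragment descriptions are identical."""
    return x == y


def matching(x, y):
    """Matching: the fragments that the two descriptions share."""
    return x & y


def precedes(x, y):
    """Partial ordering: x is contained in y (subset relation)."""
    return x <= y


def proximity(x, y):
    """Proximity: Jaccard (Tanimoto) coefficient, a value in [0, 1]."""
    union = x | y
    return len(x & y) / len(union) if union else 1.0


def summarize(molecules):
    """Evaluate all four relationships on every unordered pair of structures."""
    rows = []
    for (n1, m1), (n2, m2) in combinations(sorted(molecules.items()), 2):
        rows.append({
            "pair": (n1, n2),
            "equivalent": equivalent(m1, m2),
            "shared": len(matching(m1, m2)),
            "comparable": precedes(m1, m2) or precedes(m2, m1),
            "proximity": proximity(m1, m2),
        })
    return rows


if __name__ == "__main__":
    for row in summarize(MOLECULES):
        print(row)
```

In this sketch the same set of structures supports all four relationships at once; other representations (graphs, strings, point sets in three dimensions) would call for different definitions of each relationship.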

Cite this article

Johnson, M.A. A review and examination of the mathematical spaces underlying molecular similarity analysis. J Math Chem 3, 117–145 (1989). https://doi.org/10.1007/BF01166045

  • DOI: https://doi.org/10.1007/BF01166045
