Abstract
As an intuitive concept, molecular similarity has played a fundamental role in chemistry. It is implicit in Hammond's postulate, in the principle of minimum structure change, and in the assumption that similar structures tend to have similar properties. With the advent of large computers, computable definitions of similarity are being used in the pharmaceutical industry for similarity searching, dissimilarity selection, molecular superpositioning, structure generation, and quantitative structure-activity analysis. The diversity of applications of computable definitions of molecular similarity has often obscured important mathematical commonalities underlying these definitions. The broadest commonalities are relationships based on equivalence, matching, partial ordering, and proximity. A mathematical space suitable for molecular similarity analysis consists of a set of mathematical structures and one or more of these similarity relationships defined on that set. This report surveys the mathematical spaces used in molecular similarity analysis. The survey covers the types of chemical information, similarity relationships, and applications associated with the use of each mathematical space in a molecular similarity context.
Cite this article
Johnson, M.A. A review and examination of the mathematical spaces underlying molecular similarity analysis. J Math Chem 3, 117–145 (1989). https://doi.org/10.1007/BF01166045
DOI: https://doi.org/10.1007/BF01166045