An Integrated Approach to 2-D and 3-D Similarity Searching for the Cambridge Structural Database (CSD)

  • Eleanor M. Mitchell
  • Frank H. Allen
  • Gary F. Mitchell
  • R. Scott Rowland


Similarity searching in chemical databases depends crucially upon the chosen molecular attribute sets. The current 2-D implementation in the Cambridge Structural Database System uses substructural bit screens as attributes. These contain chemical information at a restricted connectivity level around each atom or bond; the only larger pattern units represented are rings and ring systems. Gross pattern attributes can, however, be assigned in terms of inter-nodal bond separation frequencies established using a shortest-path algorithm. This information can be used alone, or in combination with the chemical attributes, to provide an alternative or enhanced approach to the 2-D problem. In 3-D, similarity concepts have meaning at both the substructural and full structural levels. A specific chemical substructure may exist in a variety of 3-D conformations. A modified Minkowski metric based on torsion angle descriptors is used to compare 3-D shapes. This results in a 1-D ‘conformational spectrum’ graphical representation in which different conformers often appear in well-separated groups for rapid identification. At the full molecular level, comparison of complete distance matrices provides the most complete solution. However, because of the vast computational effort this requires, the distance matrix may be reduced to a distance-frequency distribution. Ultimately it is planned that the 2-D (inter-nodal bond separations) and 3-D (Å distances) approaches will be combined to provide suitable descriptors for similarity work.
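
The two descriptor ideas summarised above can be illustrated with a minimal Python sketch. It is not the CSD implementation. It assumes a molecule is given as a hypothetical adjacency dictionary of integer atom labels, uses breadth-first search as the shortest-path step, and compares two bond-separation frequency tables with a plain city-block sum. The abstract does not specify how the Minkowski metric is modified, so the wrap-around of torsion differences into the range 0 to 180 degrees is an assumption made here for illustration. All function names are hypothetical.

```python
from collections import Counter, deque


def bond_separation_frequencies(adjacency):
    """Count atom pairs at each shortest-path bond separation.

    `adjacency` maps an integer atom label to the labels of its bonded
    neighbours (an assumed input format, not a CSD data structure).
    """
    counts = Counter()
    for source in adjacency:
        # Breadth-first search gives shortest path lengths in bonds.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            atom = queue.popleft()
            for neighbour in adjacency[atom]:
                if neighbour not in dist:
                    dist[neighbour] = dist[atom] + 1
                    queue.append(neighbour)
        for target, separation in dist.items():
            if target > source:  # count each unordered pair once
                counts[separation] += 1
    return dict(counts)


def city_block(freq_a, freq_b):
    """City-block (Minkowski order 1) distance between two frequency tables."""
    keys = set(freq_a) | set(freq_b)
    return sum(abs(freq_a.get(k, 0) - freq_b.get(k, 0)) for k in keys)


def torsion_minkowski(torsions_a, torsions_b, order=1):
    """Minkowski distance between two equal-length torsion lists (degrees).

    Assumption: each difference is wrapped so that 180 and -170 degrees
    are treated as 10 degrees apart.
    """
    if len(torsions_a) != len(torsions_b):
        raise ValueError("torsion lists must have the same length")
    total = 0.0
    for a, b in zip(torsions_a, torsions_b):
        diff = abs(a - b) % 360.0
        diff = min(diff, 360.0 - diff)
        total += diff ** order
    return total ** (1.0 / order)


# Heavy-atom skeletons: a four-atom chain and a three-armed star.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

chain_freq = bond_separation_frequencies(chain)
star_freq = bond_separation_frequencies(star)
pattern_distance = city_block(chain_freq, star_freq)
shape_distance = torsion_minkowski([180.0, 60.0], [-170.0, 65.0])
```

Both skeletons have four atoms and three bonds, so they differ only in the pairs separated by two or three bonds. That is the kind of gross pattern information the bond-separation frequencies are meant to capture.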


Similarity Search; Cambridge Structural Database; Query Structure; City Block; Pattern Attribute
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 1993

Authors and Affiliations

  • Eleanor M. Mitchell (1)
  • Frank H. Allen (1)
  • Gary F. Mitchell (1)
  • R. Scott Rowland (2)

  1. Cambridge Crystallographic Data Centre, University Chemical Laboratory, Cambridge, England
  2. Department of Biochemistry, University of Alabama, USA
