Abstract
For many years, powder X-ray diffraction was used primarily as a fingerprinting method for phase identification in the context of molecular organic materials. In the early 1990s, with only a few notable exceptions, structures of even moderate complexity were not solvable from PXRD data alone. Global optimisation methods and highly-modified direct methods have transformed this situation by specifically exploiting some well-known properties of molecular compounds. This chapter will consider some of these properties.
Keywords
- Global Optimisation
- Atom Type
- Cambridge Structural Database
- Global Optimisation Method
- Molecular Connectivity
1 Introduction
Molecular organic materials tend to crystallise in low-symmetry space groups (some 80% crystallise in one of five space groups, P2₁/c, P-1, P2₁2₁2₁, P2₁ and C2/c [1], with percentages updated for the November 2010 release of the CSD) and as such exhibit substantial accidental reflection overlap, particularly at high values of 2θ. This, coupled with the fact that molecular materials do not, in general, contain strongly scattering elements to boost measurable intensities at high angle, means that it is particularly difficult to extract accurate reflection intensities. As such, the application of unmodified direct methods of crystal structure determination is generally unsuccessful. Spurred on by this failing, numerous groups have developed global-optimisation-based methods of structure determination, in which the position, orientation and conformation of a molecule are adjusted so as to maximise the agreement between observed and calculated diffraction data [2]. Clearly, to be able to perform such an optimisation, one must have a fairly accurate model of the molecule being studied. The vast number of previously determined organic crystal structures provides a rich source of prior information that can be used in the construction of such models ready for optimisation. This is discussed in more detail below.
2 Molecular Connectivity
In single-crystal diffraction, it is not generally necessary to know the molecular connectivity in advance of the diffraction experiment, as the wealth of diffraction data, coupled with the power of direct (and other) methods of structure determination/completion, means that the molecular structure usually emerges directly from the Fourier maps. With powder diffraction, this is not normally so, and when applying global optimisation methods it is necessary to know the molecular connectivity (or at least a large part of it) in advance of structure determination. This is a significant constraint, but is not as restrictive as it seems at first sight: analytical techniques such as mass spectrometry and NMR can be used to determine 2D connectivity quickly and accurately, and many diffraction problems consist of solving the structures of new crystalline forms of previously well-characterised molecules. That said, if the 2D connectivity supplied is wrong in any significant regard, it is unlikely that the structure determination will succeed, and the crystallographer should always be alert to this possibility. If a new structure is being studied, one can frequently find significant molecular ‘chunks’ of the structure in existing crystal structures. In this regard, the Cambridge Structural Database (CSD) and its associated search interface ConQuest (Fig. 5.1) are essential tools [3], as is a good molecular modelling program with which one can edit retrieved structures to delete unwanted atoms or add new ones.
In the construction of a molecular model, one generally does not have to worry about the exact positions of hydrogen atoms, as their individual contribution to the scattering from any given point of the unit cell is relatively small. However, collectively, their contribution to the overall scattering can be significant and they should be included in the molecular model whenever possible.
Once a model of the molecule being studied has been constructed, it should be carefully checked for chemical sense – this is discussed in more detail in a separate section below.
3 Molecular Volume
Many molecular modelling programs (e.g. Marvin [4]) will quickly calculate a volume for any given isolated molecule. These values can be extremely useful at the powder pattern indexing stage, as the experimentally determined unit cell volume should be able to accommodate a crystallographically sensible number of molecular units. If the value of $V_{\mathrm{cell}}/V_{\mathrm{mol}}$ makes crystallographic sense, this is a very good indicator of the correctness of the indexing solution, above and beyond whatever the indexing figures of merit suggest. In calculating $V_{\mathrm{cell}}/V_{\mathrm{mol}}$, it should be remembered that the $V_{\mathrm{mol}}$ obtained for an isolated molecule cannot be used “as is”, as it does not take into account the crystal packing index. Fortunately, this is easily corrected by the use of an average crystal packing index of 0.7, i.e. by evaluating $V_{\mathrm{cell}}/(V_{\mathrm{mol}} \div 0.7)$.
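The packing-index correction described above amounts to a single line of arithmetic, sketched below. The function name `molecules_per_cell` and the numerical values are illustrative assumptions and do not come from any real data set; only the average packing index of 0.7 is taken from the text.

```python
def molecules_per_cell(v_cell, v_mol_isolated, packing_index=0.7):
    """Return V_cell / (V_mol / packing_index).

    v_cell          unit cell volume from indexing (cubic angstroms)
    v_mol_isolated  volume calculated for the isolated molecule (cubic angstroms)
    packing_index   average crystal packing index (0.7 is the value quoted in the text)
    """
    if v_cell <= 0 or v_mol_isolated <= 0:
        raise ValueError("volumes must be positive")
    if not 0 < packing_index <= 1:
        raise ValueError("packing index must lie in (0, 1]")
    # Volume that one molecule effectively occupies in the crystal.
    v_mol_in_crystal = v_mol_isolated / packing_index
    return v_cell / v_mol_in_crystal


# Hypothetical numbers, chosen only to illustrate the arithmetic.
ratio = molecules_per_cell(v_cell=1200.0, v_mol_isolated=210.0)
nearest_integer = round(ratio)
```

Whether the resulting ratio is "crystallographically sensible" still has to be judged against the multiplicities of the candidate space groups; the sketch only supplies the number to be judged.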
The 18 Å³ rule [5], where the estimated molecular volume of an organic material is obtained by counting all the non-hydrogen atoms in the structure and multiplying this number by 18 Å³, is commonly used on account of its speed and simplicity. The figure of 18 Å³ per atom is an average value derived from a survey of crystal structures in the Cambridge Structural Database. In 2002, Hofmann introduced a more specific formula [6], again based on observed crystal structures, but with individual volume terms for each atom type present in the structure. The formula is given in simplified form below:
$$V_{\mathrm{mol}} = \sum_{i=1}^{x} n_i v_i$$
where the summation is over the $x$ different atom types in the molecule, $n_i$ is the number of atoms of the $i$th type in the structure and $v_i$ is a volume contribution (in Å³, derived from the CSD) for the $i$th atom type. This method is very accurate and should be used in preference to the 18 Å³ rule.
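Both volume estimates described above reduce to simple sums over atom counts, as in the following minimal sketch. The function names are hypothetical, and the per-atom-type volumes in `PLACEHOLDER_VOLUMES` are round placeholder numbers included only so that the example runs: they are not the values tabulated in [6] and must be replaced with the published ones before any real use. The composition used is that of carbamazepine, C15H12N2O.

```python
def volume_18_rule(atom_counts):
    """18 cubic angstroms for every non-hydrogen atom."""
    non_h_atoms = sum(n for element, n in atom_counts.items() if element != "H")
    return 18.0 * non_h_atoms


def volume_by_atom_type(atom_counts, type_volumes):
    """Sum of n_i * v_i over the atom types present in the molecule."""
    missing = [el for el in atom_counts if el not in type_volumes]
    if missing:
        raise KeyError(f"no volume contribution supplied for: {missing}")
    return sum(n * type_volumes[element] for element, n in atom_counts.items())


# Carbamazepine, C15H12N2O.
carbamazepine = {"C": 15, "H": 12, "N": 2, "O": 1}

# PLACEHOLDER volume contributions (cubic angstroms), NOT the values from [6].
PLACEHOLDER_VOLUMES = {"C": 14.0, "H": 5.0, "N": 12.0, "O": 11.0}

v_rule = volume_18_rule(carbamazepine)
v_typed = volume_by_atom_type(carbamazepine, PLACEHOLDER_VOLUMES)
```

Because both estimates are derived from observed crystal structures, they already describe the volume a molecule occupies in the crystal, so no packing-index correction is applied here.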
4 Molecular Description
The 2D connectivity of a molecule has to be translated into a 3D description for use in global optimisation. This is generally a two-stage process: (1) perform the 2D to 3D conversion (if required), and (2) convert the 3D description into a format suitable for the optimisation program. The 2D to 3D conversion can be performed in a variety of ways but is generally done with a molecular modelling program. The specifics of such programs lie outside the scope of this summary, but regardless of the method used, the output structure should be checked carefully against the expectation values before use (see Sect. 5.6). The output structure will have atomic coordinates in a Cartesian frame and may or may not retain the explicit connectivity information.
Were this collection of N atoms to be optimised as freely moving objects, there would be 3N parameters. However, to do so would neglect the fact that we actually know a great deal about the molecular geometry [7]. Any two directly bonded atoms in a molecular structure sit at well-defined distances from one another, and these distances are not greatly influenced by the environment of the crystal structure. As such, a bond length may generally be considered to be a fixed entity and not one that needs to be optimised. Bond angles created by three connected atoms are similarly well-defined, mainly by the molecular environment as opposed to the crystallographic one. They are ‘softer’ than bond lengths (in that it takes considerably less energy to induce a deviation from the value found in an isolated molecule) but may generally be considered to be fixed entities that do not need to be optimised. Bond torsion angles (defined as the angle between two bonds A–B and C–D, viewed along a common bond B–C) are extremely ‘soft’ in comparison to bond lengths and angles, as changing the bond torsion changes only non-bonded contact distances. As such, they are considered to be flexible entities whose values cannot (in general) be assigned in advance and that must be parameters in the global optimisation.
Using an internal coordinate description of the molecule [7, 8] is a convenient way of encoding this prior molecular knowledge and serves to reduce considerably the number of parameters that need to be determined. A very simple example is shown in Fig. 5.2.
It is worth mentioning, however, that in some cases, it may be advantageous to allow some variation in bond lengths and angles during the optimisation process [9].
5 Other Sources of Prior Information
Although the values of flexible torsion angles in a structure cannot, in general, be specified in advance, that is not to say that one cannot infer probable values in advance of an optimisation. It should come as no surprise that molecular conformations within crystal structures populate low-lying areas of an energy surface and as such, it is possible to either (a) attempt to calculate in advance likely molecular conformations based on isolated molecule calculations (see Note 1), or (b) examine ensembles of existing crystal structures in order to determine energetically favourable conformations; see, for example, [10]. The latter is conveniently achieved through the ConQuest or Mogul [11] front ends to the Cambridge Structural Database, and Fig. 5.3 shows the result obtained for the facile example of the amide bond. Few distributions are as well defined as the one shown in Fig. 5.3, but it is still possible to use the distributions to influence the way in which torsional parameters in the optimisation are varied. For problems involving multiple torsion angles, significant reductions in the search space can be achieved, with consequent benefits for both speed of solution and frequency of success [12].
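One simple way in which a torsion distribution could influence how a torsional parameter is varied is to draw trial values with probability proportional to the observed histogram. The sketch below assumes this weighting scheme purely for illustration; it is not a description of how any specific program uses such distributions, and the histogram counts are invented numbers loosely shaped like a distribution peaked near ±180°, not data retrieved from the CSD.

```python
import random


def sample_torsion(bin_edges, counts, rng):
    """Draw a torsion angle (degrees) with probability proportional to bin counts."""
    if len(bin_edges) != len(counts) + 1:
        raise ValueError("need one more bin edge than counts")
    # Pick a bin according to its weight, then a uniform value inside that bin.
    index = rng.choices(range(len(counts)), weights=counts, k=1)[0]
    return rng.uniform(bin_edges[index], bin_edges[index + 1])


# Twelve 30-degree bins spanning -180 to 180 degrees.
edges = [-180.0 + 30.0 * i for i in range(13)]

# INVENTED counts for illustration only (not CSD data).
counts = [50, 2, 0, 0, 0, 3, 3, 0, 0, 0, 2, 50]

rng = random.Random(0)  # seeded so that the run is reproducible
trial_torsions = [sample_torsion(edges, counts, rng) for _ in range(1000)]
```

Bins with a count of zero are never sampled in this sketch. In practice a small non-zero floor on every bin may be preferable, so that a conformation that is rare in the database is not excluded from the search altogether.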
For problems involving more than one fragment in the asymmetric unit, information about non-bonded contacts can also be derived from the CSD using either the ConQuest or IsoStar front ends. Both distance and direction information can be obtained quickly and easily. Such information can be incorporated into the internal coordinate description of the fragments in order to reduce the degrees of freedom in the search. For the case shown in Fig. 5.4, the distance information can be included in the form of a virtual bond and the angle information (not shown) in the form of a virtual angle. Only one degree of freedom is saved (i.e. three positional for the Cl− ion are reduced to two internal) but the resultant internal degrees of freedom are very highly restrained, leading to a significant reduction in the search space [13].
6 Structure Checking
As mentioned earlier, when a model is constructed and ready for global optimisation against the measured diffraction data, it is wise to check to make sure that the model does not contain any bond lengths or angles that deviate too far from expected (previously observed) values. In this regard, the Mogul [14] front end to the CSD is an extremely valuable tool that makes this process straightforward. Figure 5.5 shows the results of a check performed on the molecular structure of carbamazepine as recorded in CSD refcode CBMZPN01. These results make it easy to identify any length, angle or torsion that deviates significantly from expectation values and that might need modification in the input model.
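The kind of comparison described above can be sketched as a z-score test of each model value against the mean and standard deviation of previously observed values. This is a generic statistical check written for illustration, not Mogul's interface or algorithm; the function name, the threshold of 3 and every number in `model_geometry` are hypothetical.

```python
def flag_unusual_geometry(entries, threshold=3.0):
    """Return (label, z-score) for entries deviating by more than `threshold` sigma.

    Each entry is (label, model_value, reference_mean, reference_std).
    """
    flagged = []
    for label, value, mean, std in entries:
        if std <= 0:
            raise ValueError(f"non-positive standard deviation for {label}")
        z = abs(value - mean) / std
        if z > threshold:
            flagged.append((label, z))
    return flagged


# HYPOTHETICAL model values and reference statistics (angstroms or degrees).
model_geometry = [
    ("bond C1-C2", 1.390, 1.385, 0.012),
    ("bond C7-O1", 1.310, 1.235, 0.012),
    ("angle C1-C2-C3", 120.4, 120.0, 1.5),
]

unusual = flag_unusual_geometry(model_geometry)
flagged_labels = [label for label, _ in unusual]
```

Any flagged value is a prompt to re-examine the input model before optimisation begins, rather than proof of an error, since genuinely unusual geometry does occur.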
7 Summary
There are many different ways of constructing molecular models ready for global optimisation and this chapter has concentrated on arriving at the most accurate internal coordinate description. As highly accurate computational methods (such as DFT) become more widely used on the desktop, they will form an extremely valuable addition to the range of tools that can be brought to bear on this problem.
Notes
1. Carrying such calculations to their logical periodic conclusion brings us to the domain of crystal structure prediction.
References
Mighell AD, Himes VL, Rodgers JR (1983) Space-group frequencies for organic-compounds. Acta Crystallogr A 39:737
Shankland K, David WIF (2002) Global optimisation. In: David WIF, Shankland K, McCusker LB, Baerlocher C (eds) Structure determination from powder diffraction data. Oxford University Press, Oxford
Allen FH (2002) The Cambridge structural database: a quarter of a million crystal structures and rising. Acta Crystallogr B 58:380–388
Kempster CJ, Lipson H (1972) Rapid method of assessing number of molecules in unit-cell of an organic crystal. Acta Crystallogr B 28:3674
Hofmann DWM (2002) Fast estimation of crystal densities. Acta Crystallogr B 58:489
Shankland K (2004) Whole-molecular constraints – the Z-matrix unravelled. IUCR commission on crystallographic computing newsletter no. 4. http://www.iucr.org/__data/assets/pdf_file/0003/6384/iucrcompcomm_aug2004.pdf
Leach AR (2001) Molecular modelling: principles and applications, 2nd edn. Prentice-Hall, Harlow
Favre-Nicolin V, Cerny R (2004) A better FOX: using flexible modelling and maximum likelihood to improve direct-space ab initio structure determination from powder diffraction. Z Kristall 219:847
Shankland N, Florence AJ, Cox PJ, Wilson CC, Shankland K (1998) Conformational analysis of Ibuprofen by crystallographic database searching and potential energy calculation. Int J Pharm 165:107
Bruno IJ, Cole JC, Lommerse JPM, Rowland RS, Taylor R, Verdonk ML (1997) Isostar: a library of information about non-bonded interactions. J Comput Aided Mol Des 11:525–537
Florence AJ, Shankland N, Shankland K, David WIF, Pidcock E, Xu XL, Johnston A, Kennedy AR, Cox PJ, Evans JSO, Steele G, Cosgrove SD, Frampton CS (2005) Solving molecular crystal structures from laboratory X-ray powder diffraction data with DASH: the state of the art and challenges. J Appl Crystallogr 38:249
Nowell H, Attfield JP, Cole JC, Cox PJ, Shankland K, Maginn SJ, Motherwell WDS (2002) Structure solution and refinement of tetracaine hydrochloride from X-ray powder diffraction data. New J Chem 26:469
Bruno IJ, Cole JC, Kessler M, Luo J, Motherwell WDS, Purkis LH, Smith BR, Taylor R, Cooper RI, Harris SE, Orpen AG (2004) Retrieval of crystallographically-derived molecular geometry information. J Chem Inf Comput Sci 44:2133–2144
Acknowledgements
I am especially grateful to the staff of the CCDC in Cambridge, with whom we have explored the applicability to powder diffraction of many of the tools mentioned here.
Copyright information
© 2012 Springer Science+Business Media Dordrecht
About this paper
Cite this paper
Shankland, K. (2012). Organic Compounds. In: Kolb, U., Shankland, K., Meshi, L., Avilov, A., David, W. (eds) Uniting Electron Crystallography and Powder Diffraction. NATO Science for Peace and Security Series B: Physics and Biophysics. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-5580-2_5
DOI: https://doi.org/10.1007/978-94-007-5580-2_5
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-5579-6
Online ISBN: 978-94-007-5580-2