# Statistical properties of spectra of chloronaphthalenes

## Abstract

We introduce spectral density distribution moments as molecular descriptors. We demonstrate that these descriptors correctly represent the molecular structure. To verify the correctness of the new descriptors, we consider the IR spectra of 76 chloronaphthalenes. We show that the consecutive substitutions of hydrogen atoms by chlorine atoms are reflected by a regular behavior of the descriptors and, consequently, that the descriptors may be used to construct classification diagrams. The diagrams allow for a clear distinction between compounds containing different numbers of substituents. In the diagrams, symbols corresponding to the isomers with a given number of chlorine atoms nearly overlap and form clusters. The clusters corresponding to different numbers of chlorine atoms are well separated and distributed in a regular way. We also use the same kind of analysis as a tool for checking the correctness of the calculated spectra.

### Keywords

Statistical spectroscopy, Descriptors, Molecular similarity, DFT methods

## 1 Introduction

Similarity of molecular structures is highly relevant to various realms of science such as toxicology, ecotoxicology, and pharmacology. The basic paradigm of quantitative structure-property/activity relationships (QSPR/QSAR) is that compounds with similar structures have similar properties. This implies a smooth relation between structure and property/activity, i.e., for any small change in the structure, the magnitude of a physico-chemical property or biological activity changes smoothly rather than in an abrupt, all-or-none way. Recently, much attention has been paid to so-called “activity cliffs” [1], which imply that this assumption of smooth behavior may not always be appropriate. However, we simply point out here that exactly such discontinuous behavior would be expected to result from descriptors that do not encode the relevant molecular properties adequately. This behavior is, for instance, well known when trying to define geometrical reaction coordinates for complex reactions.

The concept of molecular similarity is always complicated by ambiguities in the representation of molecular structure and in the definition of similarity. In the computation of molecular similarity, a large number of mathematical functions can be used to derive measures of similarity for a pair of molecules starting from the same set of structural descriptors [2, 3]. Quantitative molecular similarity analysis (QMSA) uses global molecular descriptors, such as 2D-fingerprints [4] or topological indices [5], to identify molecular similarity for property prediction and for environmental risk assessment. However, similarity measures that concentrate on the relevant regions of molecules, such as established 3D QSAR methods like comparative molecular field analysis (CoMFA) [6], are often more appropriate. These methods rely on spatially resolved, rather than global, molecular similarity. They are potentially extremely powerful, but suffer from the disadvantage that they depend on the molecular conformation used, so that extensive conformational searches are necessary. This problem has been treated using conformationally weighted descriptors [7], which implies a dynamic description of flexible molecules. Methods are currently being developed [8] to use 3D information efficiently in drug design. A promising class of spatially resolved methods in computational pharmacology is derived from quantum-mechanical descriptions of molecules and of their potential interactions with the environment [9, 10, 11, 12].

Recently, we have proposed intensity distribution moments as molecular descriptors [13, 14, 15]. They are related to the shapes of molecular spectra. We have shown that these descriptors make it possible to distinguish nitriles from amides [16].

In the present work, we consider another kind of distribution moments: spectral density distribution moments. We demonstrate the usefulness of these descriptors using the example of the infrared spectra of 76 chloronaphthalenes.

Chloronaphthalenes have been used in industry for many years. They are toxic and of considerable interest in environmental studies [17]. The IR spectrum of naphthalene, as well as the spectra of other polyaromatic hydrocarbons, is also of great interest in astrophysics [18, 19].

We show that the new descriptors correctly represent the molecular structure: they clearly identify the number of chlorine atoms in the compounds. Moreover, we use statistical analysis as a tool for checking the correctness of the calculated spectra. We have used similar methodologies in astrophysics [20, 21], where we have shown that the methods of statistical spectroscopy make it possible to check the correctness of the existing classification of stellar spectra. This methodology seems to be universal and can be applied to many kinds of problems involving different classification schemes, for example distinguishing between chaotic and periodic motions [22] or the classification of stars. We have also used this methodology to study the similarity of DNA sequences [23].

## 2 Theory

We show that spectral density distribution moments also provide information about the molecular structure. For this purpose we construct classification diagrams (see the subsequent section).
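To make the descriptors concrete, the following minimal sketch computes three moments of a frequency distribution. The definitions used here are an assumption, since this section does not restate them: each vibrational frequency is given equal weight, \(M_{\rho ,1}\) is taken as the mean, \(M^{\prime }_{\rho ,2}\) as the second moment about the mean, and \(M^{\prime \prime }_{\rho ,3}\) as the third centered moment scaled to unit variance. The function name and the sample frequencies are hypothetical.

```python
def density_descriptors(frequencies):
    """Moments of an equally weighted frequency distribution.

    Assumed definitions (not taken from the text):
      M1        = mean frequency
      M2_prime  = second moment about the mean (variance)
      M3_dprime = third centered moment divided by variance**1.5
    """
    nu = [float(v) for v in frequencies]
    n = len(nu)
    if n < 2:
        raise ValueError("need at least two frequencies")
    m1 = sum(nu) / n
    m2_prime = sum((v - m1) ** 2 for v in nu) / n
    if m2_prime == 0.0:
        raise ValueError("all frequencies are identical")
    m3_centered = sum((v - m1) ** 3 for v in nu) / n
    m3_dprime = m3_centered / m2_prime ** 1.5
    return {"M1": m1, "M2_prime": m2_prime, "M3_dprime": m3_dprime}


# Hypothetical frequencies in cm^-1, for illustration only.
symmetric = density_descriptors([100.0, 200.0, 300.0])
skewed = density_descriptors([100.0, 200.0, 600.0])
```

Under these assumed definitions, a frequency set that is symmetric about its mean gives a vanishing third scaled moment, whereas a set with a long high-frequency tail gives a positive one.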

## 3 Results and discussion

Compounds

| r | Compound | r | Compound |
|---|---|---|---|
| 0 | Naphthalene | 38 | 1,2,5,8-tetraCN |
| 1 | 1-CN | 39 | 1,2,6,7-tetraCN |
| 2 | 2-CN | 40 | 1,2,6,8-tetraCN |
| 3 | 1,2-diCN | 41 | 1,2,7,8-tetraCN |
| 4 | 1,3-diCN | 42 | 1,3,5,7-tetraCN |
| 5 | 1,4-diCN | 43 | 1,3,5,8-tetraCN |
| 6 | 1,5-diCN | 44 | 1,3,6,7-tetraCN |
| 7 | 1,6-diCN | 45 | 1,3,6,8-tetraCN |
| 8 | 1,7-diCN | 46 | 1,4,5,8-tetraCN |
| 9 | 1,8-diCN | 47 | 1,4,6,7-tetraCN |
| 10 | 2,3-diCN | 48 | 2,3,6,7-tetraCN |
| 11 | 2,6-diCN | 49 | 1,2,3,4,5-pentaCN |
| 12 | 2,7-diCN | 50 | 1,2,3,4,6-pentaCN |
| 13 | 1,2,3-triCN | 51 | 1,2,3,5,6-pentaCN |
| 14 | 1,2,4-triCN | 52 | 1,2,3,5,7-pentaCN |
| 15 | 1,2,5-triCN | 53 | 1,2,3,5,8-pentaCN |
| 16 | 1,2,6-triCN | 54 | 1,2,3,6,7-pentaCN |
| 17 | 1,2,7-triCN | 55 | 1,2,3,6,8-pentaCN |
| 18 | 1,2,8-triCN | 56 | 1,2,3,7,8-pentaCN |
| 19 | 1,3,5-triCN | 57 | 1,2,4,5,6-pentaCN |
| 20 | 1,3,6-triCN | 58 | 1,2,4,5,7-pentaCN |
| 21 | 1,3,7-triCN | 59 | 1,2,4,5,8-pentaCN |
| 22 | 1,3,8-triCN | 60 | 1,2,4,6,7-pentaCN |
| 23 | 1,4,5-triCN | 61 | 1,2,4,6,8-pentaCN |
| 24 | 1,4,6-triCN | 62 | 1,2,4,7,8-pentaCN |
| 25 | 1,6,7-triCN | 63 | 1,2,3,4,5,6-hexaCN |
| 26 | 2,3,6-triCN | 64 | 1,2,3,4,5,7-hexaCN |
| 27 | 1,2,3,4-tetraCN | 65 | 1,2,3,4,5,8-hexaCN |
| 28 | 1,2,3,5-tetraCN | 66 | 1,2,3,4,6,7-hexaCN |
| 29 | 1,2,3,6-tetraCN | 67 | 1,2,3,5,6,7-hexaCN |
| 30 | 1,2,3,7-tetraCN | 68 | 1,2,3,5,6,8-hexaCN |
| 31 | 1,2,3,8-tetraCN | 69 | 1,2,3,5,7,8-hexaCN |
| 32 | 1,2,4,5-tetraCN | 70 | 1,2,3,6,7,8-hexaCN |
| 33 | 1,2,4,6-tetraCN | 71 | 1,2,4,5,6,8-hexaCN |
| 34 | 1,2,4,7-tetraCN | 72 | 1,2,4,5,7,8-hexaCN |
| 35 | 1,2,4,8-tetraCN | 73 | 1,2,3,4,5,6,7-heptaCN |
| 36 | 1,2,5,6-tetraCN | 74 | 1,2,3,4,5,6,8-heptaCN |
| 37 | 1,2,5,7-tetraCN | 75 | OctaCN |

The vibrational spectra were obtained from systematic high-level DFT (density functional theory) calculations. The hybrid B3LYP functional and the 6-311++G** basis set were used as implemented in the Gaussian 03 code [24]. A careful geometry optimization was performed prior to the calculation of the vibrational spectrum of each compound. In some cases a planar initial guess for the chloronaphthalene geometry led to transition states or other saddle points; to avoid these artefacts, lower-symmetry, non-planar initial-guess geometries were used. Thus, none of the calculated minima had imaginary frequencies.
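The absence of imaginary frequencies can be checked automatically once the frequencies have been extracted from the output. The sketch below is only an illustration: it assumes that the output lists harmonic frequencies on lines beginning with `Frequencies --` and that imaginary modes are reported as negative numbers. The helper names and the sample text are hypothetical and are not taken from the calculations described above.

```python
def read_frequencies(log_text):
    """Collect the numbers that follow 'Frequencies --' on each line.

    The line layout is an assumption about the output format.
    """
    freqs = []
    for line in log_text.splitlines():
        if line.strip().startswith("Frequencies --"):
            values = line.split("--", 1)[1].split()
            freqs.extend(float(v) for v in values)
    return freqs


def imaginary_modes(freqs):
    """Return the modes reported with a negative (imaginary) frequency."""
    return [f for f in freqs if f < 0.0]


# Hypothetical output fragment with one imaginary mode.
SAMPLE_LOG = """
 Frequencies --   -45.2000   101.5000   230.0000
 Frequencies --   455.1000
"""

sample_freqs = read_frequencies(SAMPLE_LOG)
bad_modes = imaginary_modes(sample_freqs)
```

A structure for which `imaginary_modes` returns a non-empty list would be a saddle point rather than a minimum and would need to be re-optimized from a distorted starting geometry.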

Figure 4 shows the spectra of six compounds: (a) naphthalene, \(r=0\); (b) 1 chlorine atom, \(r=1\); (c) 2 chlorine atoms, \(r=3\); (d) 4 chlorine atoms, \(r=27\); (e) 7 chlorine atoms, \(r=73\); (f) 8 chlorine atoms, \(r=75\).

These two facts (the replacement of high-frequency C–H stretching modes by low-frequency C–Cl stretching modes and the replacement of out-of-plane (oop) C–H modes by oop C–Cl modes) are also related to the shift of \(M^{\prime }_{\rho ,2}\) to lower values when the number of chlorine atoms increases. The dispersion of the frequencies becomes smaller for the spectra of compounds with a larger number of chlorine atoms.

The unexpected order of the \(M^{\prime \prime }_{\rho ,3}\) descriptors [\(M^{\prime \prime }_{\rho ,3}\)(7 Cl) \(<\) \(M^{\prime \prime }_{\rho ,3}\)(6 Cl)] may be explained by the fact that for the 7 Cl compounds only two non-equivalent systems are expected, namely those in which the only hydrogen atom is located either in position 1 (alpha) or in position 2 (beta). Thus, the symmetry of this distribution is relatively high. On the other hand, for the 6 Cl derivatives there are 10 different arrangements of the 2 hydrogen atoms, and these spectra have a less symmetric distribution of frequencies.
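The isomer counts quoted above (two heptachloro and ten hexachloro isomers) can be checked by enumerating the substitution patterns that are not related by the symmetry of the naphthalene skeleton. The sketch below assumes the standard ring numbering 1 to 8 and uses the four position permutations induced by the two in-plane symmetry axes. The function names are hypothetical, and the labels produced are lexicographically smallest position sets, which need not coincide with the names used in the table of compounds.

```python
from itertools import combinations

POSITIONS = range(1, 9)

# Permutations of the substituent positions under the skeleton symmetry.
IDENTITY = {p: p for p in POSITIONS}
LONG_AXIS = {1: 4, 2: 3, 3: 2, 4: 1, 5: 8, 6: 7, 7: 6, 8: 5}
SHORT_AXIS = {1: 8, 2: 7, 3: 6, 4: 5, 5: 4, 6: 3, 7: 2, 8: 1}
ROTATION = {p: SHORT_AXIS[LONG_AXIS[p]] for p in POSITIONS}
SYMMETRY = (IDENTITY, LONG_AXIS, SHORT_AXIS, ROTATION)


def canonical(positions):
    """Smallest sorted image of a position set under the symmetry group."""
    return min(tuple(sorted(op[p] for p in positions)) for op in SYMMETRY)


def isomers(n_cl):
    """Distinct substitution patterns with n_cl chlorine atoms."""
    return sorted({canonical(c) for c in combinations(POSITIONS, n_cl)})


counts = [len(isomers(s)) for s in range(9)]
# Index of the first compound with s chlorine atoms in a list ordered by s.
first_index = [sum(counts[:s]) for s in range(9)]
```

The counts sum to 76, the number of compounds considered here, and `first_index[4]` equals 27, in agreement with the numbering in which the tetrachloro compounds start at \(r=27\).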

It appears that the values of the moments calculated for correct spectra, for a given number of chlorine atoms, do not depend on the distribution of these atoms up to several significant figures. However, this is not the case if the spectra contain errors. Though we still do not have a formal proof of this property, it may be used for the detection of errors in the calculated spectra. This may be seen by comparing Figs. 3 and 5. In Fig. 3 all the spectra are correct. In Fig. 5 the spectra are correct for \(s<4\), i.e., \(r=0,1,\ldots ,26\), and they contain errors for \(s\ge 4\), i.e., \(r=27,28,\ldots ,75\), where \(s\) is the number of chlorine atoms in the compounds. In particular, irregular oscillations may be seen for \(M_{\rho ,1}\) and for \(M^{\prime \prime }_{\rho ,3}\) in the top panels of Fig. 5.
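The consistency check described above can be sketched as a comparison of the spread of a descriptor within each group of isomers. The function below flags the groups in which the relative spread exceeds a tolerance. The tolerance, the function name, and the numerical values are hypothetical and serve only to illustrate the idea; they are not results of this work.

```python
from collections import defaultdict


def flag_irregular_groups(records, rel_tol=1e-2):
    """Return {s: relative spread} for groups whose spread exceeds rel_tol.

    records: iterable of (s, value) pairs, where s is the number of
    chlorine atoms and value is one descriptor of one isomer.
    rel_tol is an arbitrary illustrative threshold.
    """
    groups = defaultdict(list)
    for s, value in records:
        groups[s].append(value)
    flagged = {}
    for s, values in sorted(groups.items()):
        mean = sum(values) / len(values)
        if mean == 0.0:
            spread = float("inf")
        else:
            spread = (max(values) - min(values)) / abs(mean)
        if spread > rel_tol:
            flagged[s] = spread
    return flagged


# Hypothetical descriptor values: the s = 1 isomers agree closely,
# while the s = 4 isomers scatter, as erroneous spectra would.
demo_records = [(1, 1000.0), (1, 1001.0), (4, 800.0), (4, 950.0)]
demo_flagged = flag_irregular_groups(demo_records)
```

A group that is flagged in this way would be a candidate for recalculation of the spectra; since the property is empirical, a flag is a hint rather than proof of an error.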

Summarizing, spectral density distribution moments clearly identify the number of chlorine atoms in the molecules. They can be used as a tool for checking the correctness of the spectra which are used for their creation.

## Notes

### Acknowledgments

The contributions of DBW and of TP were supported, respectively, by the Alexander von Humboldt Foundation and by the Polish Ministry of Science and Higher Education (grant no. DS/8430-4-0171-1).

### References

1. M.T. Sisay, L. Peltason, J. Bajorath, J. Chem. Inf. Model. **49**, 2179 (2009)
2. *Advances in Molecular Similarity*, vol. 2, ed. by R. Carbo-Dorca, P.G. Mezey (JAI Press, Stamford, CN, 1998)
3. M. Johnson, G.M. Maggiora, *Concepts and Applications of Molecular Similarity* (Wiley, New York, 1990), p. 393
4. *Handbook of Chemoinformatics*, ed. by J. Gasteiger (Wiley-VCH, Weinheim, 2003)
5. *Topological Indices and Related Descriptors in QSAR and QSPR*, ed. by J. Devillers, A.T. Balaban (Gordon and Breach Science Publishers, The Netherlands, 1999)
6. R.D. Cramer, D.E. Patterson, J.D. Bunce, J. Am. Chem. Soc. **110**, 5959 (1988)
7. J.S. Duca, A.J. Hopfinger, J. Chem. Inf. Comput. Sci. **41**, 1367 (2001)
8. T. Clark, J. Comput.-Aided Mol. Design **24**, 605 (2010)
9. M. Brustle, B. Beck, T. Schindler, W. King, T. Mitchell, T. Clark, J. Med. Chem. **45**, 3345 (2002)
10. A.H. Göller, M. Hennemann, J. Keldenich, T. Clark, J. Chem. Inf. Model. **46**, 648 (2006)
11. T. Clark, K.G. Byler, M.J. de Groot, Proceedings of the International Beilstein Workshop, Bozen, Italy, May 15–19, 2006, in *Molecular Interactions - Bringing Chemistry to Life* (Logos Verlag, Berlin, 2008), pp. 129–146
12. T. Clark, M. Ford, J. Essex, W.G. Richards, D.W. Ritchie, Proceedings of EuroQSAR, 2004, *QSAR and Molecular Modelling in Rational Design of Bioactive Molecules*, ed. by E. Aki, I. Yalcin (CADDDS in Turkey, Ankara, 2006), pp. 536–537
13. D. Bielińska-Wąż, P. Wąż, S.C. Basak, Eur. Phys. J. B **50**, 333 (2006)
14. D. Bielińska-Wąż, P. Wąż, S.C. Basak, J. Math. Chem. **42**, 1003 (2007)
15. D. Bielińska-Wąż, P. Wąż, J. Math. Chem. **43**, 1287 (2008)
16. D. Bielińska-Wąż, W. Nowak, Ł. Pepłowski, P. Wąż, S.C. Basak, R. Natarajan, J. Math. Chem. **43**, 1560 (2008)
17. J. Falandysz, Food Addit. Contam. **20**, 995 (2003)
18. D.M. Hudgins, S.A. Sandford, L.J. Allamandola, *Astronomical Infrared Spectroscopy. Future Observational Directions*, vol. 41. Conference held at the University of Calgary, Alberta, June 16–19, 1992, ed. by S. Kwok (Astronomical Society of the Pacific, San Francisco, CA, 1993), p. 275
19. S.R. Langhoff, J. Phys. Chem. **100**, 2819 (1996)
20. P. Wąż, D. Bielińska-Wąż, A. Pleskacz, A. Strobel, Acta Phys. Pol. B **39**, 1993 (2008)
21. P. Wąż, D. Bielińska-Wąż, A. Strobel, A. Pleskacz, Acta Astron. **60**, 283 (2010)
22. P. Wąż, D. Bielińska-Wąż, Acta Phys. Pol. A **116**, 987 (2009)
23. D. Bielińska-Wąż, W. Nowak, P. Wąż, A. Nandy, T. Clark, Chem. Phys. Lett. **443**, 408 (2007)
24. M.J. Frisch, G.W. Trucks, H.B. Schlegel, G.E. Scuseria, M.A. Robb, J.R. Cheeseman, J.A. Montgomery Jr, T. Vreven, K.N. Kudin, J.C. Burant, J.M. Millam, S.S. Iyengar, J. Tomasi, V. Barone, B. Mennucci, M. Cossi, G. Scalmani, N. Rega, G.A. Petersson, H. Nakatsuji, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, M. Klene, X. Li, J.E. Knox, H.P. Hratchian, J.B. Cross, V. Bakken, C. Adamo, J. Jaramillo, R. Gomperts, R.E. Stratmann, O. Yazyev, A.J. Austin, R. Cammi, C. Pomelli, J.W. Ochterski, P.Y. Ayala, K. Morokuma, G.A. Voth, P. Salvador, J.J. Dannenberg, V.G. Zakrzewski, S. Dapprich, A.D. Daniels, M.C. Strain, O. Farkas, D.K. Malick, A.D. Rabuck, K. Raghavachari, J.B. Foresman, J.V. Ortiz, Q. Cui, A.G. Baboul, S. Clifford, J. Ciosłowski, B.B. Stefanov, G. Liu, A. Liashenko, P. Piskorz, I. Komaromi, R.L. Martin, D.J. Fox, T. Keith, M.A. Al-Laham, C.Y. Peng, A. Nanayakkara, M. Challacombe, P.M.W. Gill, B. Johnson, W. Chen, M.W. Wong, C. Gonzalez, J.A. Pople (Gaussian, Inc., Wallingford CT, 2004). http://www.gaussian.com/g_misc/g03/citation_g03.htm
25. S.N. Singh, H. Bhatti, R. Singh, Spectrochim. Acta Part A Mol. Spectrosc. **985**, 34 (1978)
26. W.X. Huang, J.M. White, J. Phys. Chem. B **108**, 5060 (2004)

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.