Open Access
Original Research

Structural Chemistry, Volume 24, Issue 4, pp 1171-1184

Avoiding pitfalls of a theoretical approach: the harmonic oscillator measure of aromaticity index from quantum chemistry calculations

Authors

  • Marcin Andrzejak
    • K. Gumiński Department of Theoretical Chemistry, Faculty of Chemistry, Jagiellonian University
  • Piotr Kubisiak
    • K. Gumiński Department of Theoretical Chemistry, Faculty of Chemistry, Jagiellonian University
  • Krzysztof K. Zborowski
    • Department of Chemical Physics, Faculty of Chemistry, Jagiellonian University

DOI: 10.1007/s11224-012-0148-2

Abstract

The concept of the harmonic oscillator measure of aromaticity (HOMA) is based on comparing the geometrical parameters of a studied molecule with the parameters for an ideal aromatic system derived from bond lengths of the reference molecules. Nowadays, HOMA is routinely computed by combining the geometries from quantum chemistry calculations with the experimentally based parameterization. The values of HOMA thus obtained, however, are bound to suffer from inaccuracies of the theoretical methods and strongly depend on computational details. This could be avoided by obtaining both the input geometries and the parameters with the same theoretical method, but the efficiency of the error compensation achieved in this way has not yet been probed. In our work, we have prepared a benchmark set of HOMA values for 25 cyclic hydrocarbons, based on the all core CCSD(T)/cc-pCVQ(T)Z geometries, and used it to investigate the impact of different choices of the exchange–correlation functionals and basis sets on HOMA, calculated against the experimentally based (HOMAEP) or the consistently calculated (HOMACCP) parameters. We show that using HOMAEP leads to large and unsystematic errors, and to strong sensitivity to the choice of XC functional, basis set, and the experimental data for the reference geometry. This sensitivity is largely, although not completely, attenuated in the consistent approach. We recommend the most suitable functionals for calculating HOMA in both approaches (HOMAEP and HOMACCP), and provide the HOMA parameters for 25 studied exchange–correlation functionals and two popular basis sets.

Keywords

Aromaticity, HOMA, Geometry optimization, DFT, Exchange–correlation functionals, Coupled clusters

Introduction

The concept of aromaticity, introduced in 1855 by Hoffman [1], has been one of the most momentous ideas in organic chemistry. Geometric indices quantify the aromaticity utilizing the fact that in non-aromatic systems, the single and double bonds are clearly defined and have distinctly different lengths, whereas in aromatic systems the lengths of the nominally single and double bonds are similar or even equal to one another. Probably the most popular of the geometric indices of aromaticity is the harmonic oscillator measure of aromaticity (HOMA) index, introduced and developed by Krygowski et al. [2–7]. The value of HOMA for an n-membered unsaturated ring is based on the lengths of individual bonds \( l_{i} \), according to the formula:
$$ \text{HOMA} = 1 - \frac{1}{n} \sum\limits_{i = 1}^{n} \alpha_{i} \left( l_{i} - l_{i,opt} \right)^{2}, $$
(1)
in which the proportionality constants \( \alpha_{i} \) and the optimum aromatic bond lengths \( l_{i,opt} \) are the parameters that have to be independently determined for each pair of atoms (e.g., CC, CN, CO, NO) that form the bonds within the ring. Thus, the HOMA parameterization is based on carefully selected reference systems [3]. The optimum bond length between a given pair of atoms was originally defined as \( l_{opt} = (2l_{2} + l_{1})/3 \), and the constant as \( \alpha = 2/\left[ (l_{1} - l_{opt})^{2} + (l_{2} - l_{opt})^{2} \right] \). Here \( l_{1} \) and \( l_{2} \) are the lengths of a nominally single and a nominally double bond, respectively, that are present in the reference molecule(s). The constant α is designed to give HOMA = 0 for the Kekulé structure of a typical aromatic system and HOMA = 1 for the system with all bond lengths equal to the optimum value \( l_{opt} \). The formula for the optimum bond length was first derived under the assumption that the force constant for \( l_{2} \) is twice the force constant for \( l_{1} \). This assumption, however, is satisfied only approximately, and the improved optimum bond length can be calculated from the formula \( l_{opt} = (\omega l_{2} + l_{1})/(1 + \omega) \), in which \( \omega = w_{2}/w_{1} \) denotes the ratio of force constants for the shorter and longer reference bonds, respectively [6]. The improved optimum bond lengths then lead to modified values of the constants α.
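The parameterization described above can be illustrated with a short numerical sketch. The function names (`homa_parameters`, `homa`) and the two six-membered test rings are our own illustrative choices, not part of the original work; the reference bond lengths are the "Exp. old" values for trans-1,3-butadiene listed in Table 1, and ω = 2 corresponds to the original definition of the optimum bond length.

```python
def homa_parameters(l1, l2, omega=2.0):
    """Return (l_opt, alpha) from the reference single (l1) and double (l2) bond lengths.

    omega is the force constant ratio w2/w1; omega = 2 reproduces the
    original definition l_opt = (2*l2 + l1)/3.
    """
    l_opt = (omega * l2 + l1) / (1.0 + omega)
    alpha = 2.0 / ((l1 - l_opt) ** 2 + (l2 - l_opt) ** 2)
    return l_opt, alpha


def homa(bond_lengths, l_opt, alpha):
    """HOMA of a ring built from one bond type (Eq. 1 with a single alpha and l_opt)."""
    n = len(bond_lengths)
    return 1.0 - alpha / n * sum((l - l_opt) ** 2 for l in bond_lengths)


# Reference CC bond lengths of trans-1,3-butadiene ("Exp. old" row of Table 1), in angstrom.
L1_SINGLE, L2_DOUBLE = 1.467, 1.349
l_opt, alpha = homa_parameters(L1_SINGLE, L2_DOUBLE, omega=2.0)

# Hypothetical six-membered rings, used only to check the two limiting cases.
kekule_ring = [L1_SINGLE, L2_DOUBLE] * 3  # strictly alternating single/double bonds
ideal_ring = [l_opt] * 6                  # all bonds at the optimum length

homa_kekule = homa(kekule_ring, l_opt, alpha)
homa_ideal = homa(ideal_ring, l_opt, alpha)

print(f"l_opt = {l_opt:.4f} A, alpha = {alpha:.1f}")
print(f"HOMA (Kekule ring) = {homa_kekule:.3f}")
print(f"HOMA (ideal ring)  = {homa_ideal:.3f}")
```

With these inputs the sketch returns the parameters of the "Exp. old" row of Table 1 and the two limiting values (0 and 1) that the constant α is designed to give.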
One of the main advantages of HOMA is that apart from using just the bond length differences present in the molecule of interest, it also accounts for the differences between the average bond length for this molecule, and the optimum bond length for the ideally aromatic system. This is best observed when the definition of HOMA is rewritten as [5, 8]:
$$ \begin{aligned} {\text{HOMA}} = & 1 - {\text{EN}} - {\text{GEO}} \\ {\text{EN}} = & \alpha \cdot \left( {l_{opt} - l_{ave} } \right)^{2} \\ {\text{GEO}} = & \frac{\alpha }{n}\sum\limits_{i = 1}^{n} {\left( {l_{i} - l_{ave} } \right)^{2} } \\ \end{aligned} $$
(2)

The GEO component reflects the impact of the bond length differences (BLD) within the ring on the aromaticity, whereas the EN component is sensitive to changes in the average bond length. Thus, HOMA correctly predicts the antiaromaticity of, e.g., cyclohexanehexone, whereas other popular geometry-based aromaticity descriptors like the Julg-François index [9] or the Bird index [10] fail spectacularly by classifying this system as a 100 % aromatic one. Note that the above formulas for HOMA are strictly equivalent to the original one (Eq. 1) only for hydrocarbons. For heterocyclic rings, the lengths of bonds involving atoms other than carbon have to be transformed to mimic the CC bond lengths of the same order [5]. This procedure, however, recovers the values of HOMA from the original formulation only for the force constant ratios ω identical for all pairs of atoms. When they are independently estimated for different pairs of atoms, the values of HOMA obtained from Eq. 2 are somewhat different from the original (Eq. 1) values. The discrepancies are nonetheless small and can usually be ignored, as decomposition (2) is needed mostly for specific interpretational purposes. It introduced, however, an intriguing novelty: the EN part is to be taken with the negative sign whenever the average bond length is shorter than \( l_{opt} \) [8]. This may lead to HOMA > 1 provided that the GEO part is small (e.g., for symmetry reasons). This behavior is rather counterintuitive, and we will discuss it briefly while commenting on our results.
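The equivalence of Eqs. 1 and 2 for all-carbon rings, and the effect of the sign convention for EN, can be checked numerically. The sketch below is illustrative only: the function names, the `signed_en` switch, and both test rings are hypothetical, and the parameters are the rounded "Exp. new" values from Table 1.

```python
def homa_direct(bonds, l_opt, alpha):
    """Eq. 1 for a ring with a single bond type."""
    n = len(bonds)
    return 1.0 - alpha / n * sum((l - l_opt) ** 2 for l in bonds)


def homa_en_geo(bonds, l_opt, alpha, signed_en=False):
    """Eq. 2: return (HOMA, EN, GEO).

    With signed_en=True the EN term is taken with a negative sign when the
    average bond length is shorter than l_opt (the convention of Ref. [8]).
    """
    n = len(bonds)
    l_ave = sum(bonds) / n
    en = alpha * (l_opt - l_ave) ** 2
    if signed_en and l_ave < l_opt:
        en = -en
    geo = alpha / n * sum((l - l_ave) ** 2 for l in bonds)
    return 1.0 - en - geo, en, geo


# CC parameters: rounded "Exp. new" row of Table 1 (l_opt in angstrom).
L_OPT, ALPHA = 1.381, 278.0

# Hypothetical ring with alternating bond lengths (angstrom).
alternating = [1.40, 1.37] * 3
direct = homa_direct(alternating, L_OPT, ALPHA)
decomposed, en, geo = homa_en_geo(alternating, L_OPT, ALPHA)

# Hypothetical ring with six equal bonds slightly shorter than l_opt.
short_ring = [1.375] * 6
plain, _, _ = homa_en_geo(short_ring, L_OPT, ALPHA)
signed, _, _ = homa_en_geo(short_ring, L_OPT, ALPHA, signed_en=True)

print(f"Eq. 1: {direct:.6f}   Eq. 2: {decomposed:.6f}  (EN = {en:.4f}, GEO = {geo:.4f})")
print(f"short ring: unsigned EN -> {plain:.4f}, signed EN -> {signed:.4f}")
```

Without the sign convention the two formulas agree to rounding error; with it, a symmetric ring whose bonds are shorter than the optimum length yields a value above one, which is the situation discussed in the text.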

Originally, the HOMA index was designed to estimate the aromaticity of molecules based on geometries taken usually from crystallographic experiments. The values of HOMA were thus directly linked to measurements. This, however, made them vulnerable to errors inherent to applied experimental techniques, and related to interactions with environment. Moreover, the errors for the studied systems were likely to be different from errors for the reference molecules. The natural question in this context is what would the values of HOMA be, were they free from the environmental and experimental bias. Besides, geometries of many systems cannot be determined experimentally, especially if one is interested not only in the ground state properties, but also e.g., in the reactivity of a molecule in its excited state. In such cases, one usually resorts to quantum chemical calculations, which nowadays have become a standard way to determine molecular properties, including equilibrium geometries for both the ground state and the excited states. The rapidly growing computational power and the advent of new efficient theories and algorithms, led by the methods based on the density functional theory (DFT), have allowed for studying large and complex systems containing as many as several hundreds of atoms. However, the necessarily simplified treatments of electron correlation as well as other approximations routinely used in quantum chemistry are bound to affect the theoretical results. In many situations (e.g., the energies of reactions, activation barriers, or excitation energies), the theoretical results are surprisingly accurate, because most of the errors fortuitously cancel out. However, when the calculated quantities (e.g., bond lengths) are mixed with the experimental ones, the shortcomings of quantum chemical treatment are bound to resurface. 
Unfortunately, HOMA is routinely calculated in just such a way: the theoretically obtained bond lengths for a studied system are combined with the parameterization based on experimental geometries of the reference molecules [7, 11–18]. One may have justified suspicions that HOMA computed in such a way would undergo strong changes with a change of the basis set, computational method, or even the exchange–correlation functional of DFT (a great variety of which have been recently developed and presented for general use). Such behavior of any quantitative descriptor of aromaticity is, of course, highly undesirable. One may expect, however, that this sensitivity to details of computational schemes would be reduced if a consistent theoretical treatment of both the studied system and the reference molecules is used. This approach offers a chance for systematic cancelation of errors of the quantum chemical calculations. The HOMA obtained in this way will be further referred to as HOMACCP (consistently calculated parameters) as opposed to HOMAEP, obtained with parameters based on the experimental geometries.

The sensitivity of HOMAEP to computational details of the geometry optimization can thus be anticipated, as well as its reduction for HOMACCP. The magnitude of these effects, however, cannot be easily predicted. In this paper, we would like to determine quantitatively the impact of different choices of computational methods of geometry optimizations on HOMA calculated in both outlined ways (HOMACCP and HOMAEP) for a group of compounds containing all-carbon unsaturated rings of varying sizes and degrees of aromaticity. The paper is divided into two main sections. First, we will present the benchmark HOMA values for the selected unsaturated hydrocarbons obtained by means of the CCSD(T) computational scheme, which was reported to provide accuracy comparable to that of the best experiments [19–21]. Subsequently, the benchmark will be used to test the performance of the DFT method, with a choice of 25 exchange–correlation functionals designed for various fields of chemistry, and two basis sets of different sizes. We will examine the consequences of using the original parameterization of HOMA, propose a new parameterization derived from the recent experimental geometry of trans-1,3-butadiene, and demonstrate the changes brought about by using the consistent approach. The paper will be concluded by recommending the best functionals for the purpose of studying aromaticity of organic systems based on their geometries, and by providing the list of HOMA parameters (for the CC bonds) for all the studied DFT functionals and basis sets.

Results and discussion

The CCSD(T) study of trans-1,3-butadiene and selected hydrocarbons

In this section, we will focus initially on trans-1,3-butadiene, which is the original source of HOMA parameters for the CC bonds [3], and for which the experimental equilibrium bond lengths are known to a very good accuracy [22]. We will investigate the performance of the CCSD(T) method in predicting equilibrium geometries for this molecule, comparing the quantum chemical results with the experimental data. Similar analysis for benzene will be performed in “DFT calculations” section, where the ab initio results will be directly compared with the outcome of the DFT calculations.

Judging from the studies concerning the accuracy of ab initio methods for prediction of molecular equilibrium structures [19–21, 23], the best method that is feasible for medium size molecules (up to 6-9 heavy atoms, depending on the symmetry of the system) appears to be the coupled-cluster singles and doubles method with perturbative inclusion of triples, CCSD(T), especially when combined with the cc-pVTZ or, preferably, with the cc-pVQZ basis sets. The mean error of this computational scheme (\( \bar{\Delta} \)), determined for a set of molecules containing first and second row atoms, is less than 1 pm, with the maximum absolute deviation \( \left| \Delta_{\max} \right| \) = 1.511 pm. When the core electrons are also correlated (all core CCSD(T)/cc-pCVQZ), the errors are further reduced (\( \bar{\Delta} \) = 0.026 pm, and \( \left| \Delta_{\max} \right| \) = 0.706 pm). The latter computational scheme, however, is much more costly than the standard, frozen-core one, owing to both the higher number of active orbitals and the enlarged basis set, containing additional tight polarization functions for more flexible description of the core electrons.

For our ab initio calculations, we have selected the Dunning cc-pVXZ (X = D, T, and Q) basis sets [24], the three consecutive members of the popular sequence of basis sets that allow for approaching the complete basis set limit by going to higher levels in the sequence. For the all core calculations, we have also used their dedicated counterparts: the cc-pCVXZ basis sets [24]. All the ab initio calculations have been carried out using the MOLPRO 2010.1 [25] and CFOUR [26] program packages.

Trans-1,3-butadiene and the HOMA parameters

1,3-butadiene was selected by Krygowski et al. [3] as the reference molecule to parameterize HOMA for the CC bonds. Since the trans isomer of the butadiene is more stable than the cis one, the experimental data refer to the former. One may argue that it is the cis isomer that should be used as the reference system for HOMA, as it more closely resembles a part of the benzene ring. The geometry of the cis isomer is difficult to determine experimentally, but as it is easily accessible theoretically, it could be used to parameterize HOMACCP. Such a parameterization, however, would lead to the values of HOMA that could not be directly compared with HOMAEP, the parameters for which are necessarily based on the geometry of the trans-1,3-butadiene. Since the main goal of this study is comparing the behaviors of HOMAEP and HOMACCP, we have decided to use the geometry of trans-1,3-butadiene throughout the whole study. It is interesting to study the changes of HOMA introduced by switching from the trans to the cis isomer of butadiene, but such an analysis is beyond the scope of the present paper, and will be addressed in the future.

The optimized CC bond lengths, the bond length difference (BLD) and the HOMA parameters \( l_{opt} \) and α are displayed in Fig. 1, and collected in Table 1. For every combination of the computational scheme and basis set, we have also estimated the force constant ratio ω in the way outlined by Cyrański et al. [6]. The energies were calculated for a series of geometries obtained from the optimized equilibrium structure by changing either \( l_{1} \) or \( l_{2} \) by ±0.005, ±0.010, and ±0.050 Å. Second-order polynomial fits provided the approximate relations \( E(l_{1}) \) or \( E(l_{2}) \) and yielded the force constants w for both types of bonds. Owing to cancelation of errors, this simple procedure can be expected to provide good estimates for the ratios \( w_{2}/w_{1} \). The values of ω thus obtained are also included in Table 1.
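The fitting step can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used to generate Table 1: the energies are synthetic (a purely harmonic model with hypothetical force constants `W1` and `W2` in arbitrary units), the equilibrium point is included in the fit, and the helper names `quadratic_fit` and `model_energy` are ours. For a fit E = a·Δl² + b·Δl + c, the force constant is w = 2a.

```python
def quadratic_fit(xs, ys):
    """Least-squares fit y = a*x**2 + b*x + c; returns (a, b, c)."""
    s = [sum(x ** k for x in xs) for k in range(5)]                   # s[k] = sum of x^k
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]   # t[k] = sum of y*x^k

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    # Normal equations of the least-squares problem, solved by Cramer's rule.
    normal = [[s[4], s[3], s[2]],
              [s[3], s[2], s[1]],
              [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]
    d = det3(normal)
    coeffs = []
    for col in range(3):
        m = [row[:] for row in normal]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(det3(m) / d)
    return tuple(coeffs)


def model_energy(dl, w):
    """Synthetic harmonic energy (arbitrary units) for a bond stretched by dl."""
    return 0.5 * w * dl ** 2


# Displacements from the text (angstrom); including 0.0 is our assumption.
displacements = [-0.050, -0.010, -0.005, 0.0, 0.005, 0.010, 0.050]

# Hypothetical force constants for the single (W1) and double (W2) bond.
W1, W2 = 1.0, 1.684

a1, _, _ = quadratic_fit(displacements, [model_energy(d, W1) for d in displacements])
a2, _, _ = quadratic_fit(displacements, [model_energy(d, W2) for d in displacements])

w1, w2 = 2.0 * a1, 2.0 * a2   # second derivative of the fitted parabola
omega = w2 / w1
print(f"omega = w2/w1 = {omega:.3f}")
```

In a real application the synthetic energies would be replaced by single-point energies computed at the displaced geometries; only the ratio of the two fitted curvatures enters the HOMA parameterization.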
https://static-content.springer.com/image/art%3A10.1007%2Fs11224-012-0148-2/MediaObjects/11224_2012_148_Fig1_HTML.gif
Fig. 1

The CC bond lengths in trans-1,3-butadiene optimized at the CCSD(T) level of theory using the cc-pVXZ and cc-pCVXZ basis sets (X = D, T, and Q), and the respective HOMA parameters. Triangles denote the frozen-core results (CCSD(T)/cc-pVXZ), and diamonds represent the correlated core ones (all core CCSD(T)/cc-pCVXZ). Dashed lines denote the experimental values based on Ref. [22], and the respective HOMA parameters. Dotted lines represent analogous data from Ref. [3]

Table 1

The HOMA parameters (l opt, α) derived from the lengths of the CC bonds (l 1, l 2) and the related force constant ratios (ω) for trans-1,3-butadiene, as calculated at the CCSD(T) level of theory

| Method | l 1 [Å] | l 2 [Å] | ω | l opt [Å] | Δl [Å] | α |
|---|---|---|---|---|---|---|
| CCSD(T)/cc-pVDZ | 1.4726 | 1.3585 | 1.660 | 1.4014 | 0.1141 | 289.5 |
| All core CCSD(T)/cc-pCVDZ | 1.4701 | 1.3558 | 1.659 | 1.3988 | 0.1142 | 288.8 |
| CCSD(T)/cc-pVTZ | 1.4610 | 1.3439 | 1.679 | 1.3876 | 0.1170 | 274.4 |
| All core CCSD(T)/cc-pCVTZ | 1.4581 | 1.3407 | 1.684 | 1.3844 | 0.1174 | 278.5 |
| CCSD(T)/cc-pVQZ | 1.4585 | 1.3412 | 1.682 | 1.3849 | 0.1173 | 273.0 |
| All core CCSD(T)/cc-pCVQZ | 1.4555 | 1.3381 | 1.684 | 1.3818 | 0.1174 | 272.8 |
| Exp. old (a) | 1.467 | 1.349 | 2.000 | 1.3883 | 0.118 | 258.5 |
| Exp. new (b) | 1.454 (1) | 1.338 (1) | 1.684 | 1.381 (1) | 0.116 (2) | 278 (8) |

(a) Ref. [3]

(b) Ref. [22]

The calculated bond lengths are compared with two sets of experimental ones. The first set comes from the electron diffraction experiment [27] and was selected by Krygowski [3] as the reference to parameterize HOMA for the CC bonds. The other set comes from a recent paper of Craig et al. [22], in which the authors reported the equilibrium bond lengths (\( r_{e} \)) with an accuracy of 0.001 Å. The equilibrium bond lengths were obtained from the measured rotational constants of various deuterated butadienes, corrected for the influence of zero point vibrations. These bond lengths are shorter by over 0.01 Å (\( l_{2} \) = 1.338 Å and \( l_{1} \) = 1.454 Å) than those used by Krygowski (\( l_{2} \) = 1.349 Å and \( l_{1} \) = 1.467 Å). We do not suggest that the new bond lengths should be used in the classical calculations of HOMA, in which the index is based on experimental bond lengths of the studied molecules (typically obtained in crystallographic studies). The original parameterization may in such cases give better results, owing to favorable compensation of experimental or environmental errors. Quantum chemical calculations, however, yield directly the equilibrium bond lengths, so the new experimental data should be more appropriate for assessing the quality of the theoretical results. Analysis of the calculated CC bond lengths shows that this is indeed the case. The “old” bond lengths are best reproduced in the least accurate calculations, which employ the cc-pVDZ basis set. The calculated bond lengths, however, decrease significantly when the basis set is improved, eventually converging to the “new” experimental values. The agreement is nearly perfect for the all core CCSD(T)/cc-pCVQZ results.

The CCSD(T)/cc-pVQZ calculations have yielded lengths of both kinds of CC bonds that are almost uniformly overestimated by approximately 0.004 Å, which results in a similarly overestimated \( l_{opt} \), but the BLD is still of the same quality as the all core CCSD(T)/cc-pCVQZ value. Since the error in the CC bond lengths seems to be nearly independent of the bond order, analogous behavior of CCSD(T)/cc-pVQZ (similar overestimation of the average CC bond length, good reproduction of the BLD) can be expected also for other unsaturated hydrocarbons. In such a case, almost complete cancelation of errors can be expected while calculating the HOMACCP values, and so they can be regarded as equivalents to HOMA based on the accurate experimental equilibrium bond lengths. The same argument holds for the CCSD(T)/cc-pVTZ and the all core CCSD(T)/cc-pCVTZ values: again, the errors for \( l_{1} \) and \( l_{2} \) are very similar (somewhat larger than for the QZ basis sets), and the BLD is almost as accurate as that obtained with the quadruple-ζ basis set.

Model compounds—benchmark results

In this section, we will analyze the values of HOMA computed for the test set of 25 cyclic hydrocarbons of varying aromaticity, displayed in Chart 1. The molecules were assumed to be planar by imposing the symmetry constraints (C2v or Cs), with obvious exceptions for compounds 1, 8, 24, and 25, for which planarization would be highly unfavorable energetically. The input bond lengths have been obtained with the CCSD(T) method in both the frozen-core and all core versions. The quadruple-ζ correlation consistent basis sets were used. They were replaced with their smaller, triple-ζ counterparts whenever the molecule proved too large. The accuracy of the cc-pVDZ results was deemed insufficient and they were excluded from further use in this study. The values of HOMACCP (obtained with the consistently calculated reference parameters) are listed in Table 2. A glance at the results allows one to conclude that the HOMACCP values are only weakly dependent on the choice of the basis set. In particular, the CCSD(T)/cc-pVXZ and the all core CCSD(T)/cc-pCVXZ results are very close to each other for both values of X (T or Q). The benchmark values will be chosen in the following fashion: the quadruple-ζ results will be preferred whenever available, and of those the potentially more accurate all core ones. Otherwise, we will select the all core CCSD(T)/cc-pCVTZ values. The benchmark set created in this way will be applied to assess the performance of the selected XC functionals used in DFT calculations.
https://static-content.springer.com/image/art%3A10.1007%2Fs11224-012-0148-2/MediaObjects/11224_2012_148_Sch1_HTML.gif
Chart 1

Labels for the test set of aromatic hydrocarbons

Table 2

HOMACCP for the selected unsaturated hydrocarbons (labeled according to Chart 1) obtained from geometries optimized at the CCSD(T) level of theory

| Molecule no. | Frozen-core cc-pVTZ | All core cc-pCVTZ | Frozen-core cc-pVQZ | All core cc-pCVQZ |
|---|---|---|---|---|
| 1 | −1.106 | −1.093 | −1.082 | −1.069 |
| 2 | 0.772 | 0.773 | 0.765 | 0.764 |
| 3 | 0.803 | 0.805 | 0.799 | 0.800 |
| 4 | 0.155 | 0.155 | 0.160 | 0.161 |
| 5 | 0.396 | 0.398 | 0.400 | 0.399 |
| 6 | −0.421 | −0.418 | −0.408 | −0.408 |
| 7 | −0.11 | −0.106 | | |
| 8 | 0.055 | 0.054 | | |
| 9 | −0.467 | −0.464 | | |
| 10 | −0.569 | −0.567 | | |
| 11 | −0.242 | −0.239 | | |
| 12 | 0.65 | 0.651 | | |
| 13 | 0.695 | 0.696 | | |
| 14 | −0.122 | −0.12 | −0.122 | |
| 15 | 0.091 | 0.093 | 0.090 | |
| 16 | 0.973 | 0.973 | 0.973 | 0.973 |
| 17 | 0.965 | 0.965 | 0.966 | 0.966 |
| 18 | 0.938 | 0.937 | | |
| 19 | 0.574 | 0.573 | | |
| 20 | 0.084 | 0.082 | 0.093 | |
| 21 | −0.234 | −0.238 | | |
| 22 | −0.082 | −0.087 | | |
| 23 | 0.013 | 0.010 | 0.012 | |
| 24 | −0.589 | −0.589 | | |
| 25 | −0.319 | −0.318 | −0.315 | |

The values of HOMA that were included in the benchmark set are marked by the bold print
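The selection rule for the benchmark values stated above (quadruple-ζ preferred, all core preferred among those, otherwise all core triple-ζ) can be written as a small function. This is a sketch under assumptions: the function name `select_benchmark` is ours, and for the rows of Table 2 with only three entries we assume that the third entry belongs to the frozen-core cc-pVQZ column.

```python
from typing import Optional


def select_benchmark(ac_tz: float,
                     fc_qz: Optional[float],
                     ac_qz: Optional[float]) -> float:
    """Pick the benchmark HOMACCP value for one molecule.

    ac_tz: all core CCSD(T)/cc-pCVTZ value (available for every molecule)
    fc_qz: frozen-core CCSD(T)/cc-pVQZ value, or None if not computed
    ac_qz: all core CCSD(T)/cc-pCVQZ value, or None if not computed
    """
    if ac_qz is not None:
        return ac_qz
    if fc_qz is not None:
        return fc_qz
    return ac_tz


# Three rows of Table 2: (all core TZ, frozen-core QZ, all core QZ).
# The column assignment of the third entry for molecule 14 is an assumption.
rows = {
    1: (-1.093, -1.082, -1.069),
    7: (-0.106, None, None),
    14: (-0.12, -0.122, None),
}

benchmark = {mol: select_benchmark(*values) for mol, values in rows.items()}
for mol, value in benchmark.items():
    print(f"molecule {mol}: benchmark HOMACCP = {value}")
```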

Note that the results seem to be well saturated with respect to the basis set already at the triple-ζ level, even though the bond lengths are not. It shows that substantial compensation of errors does take place while computing HOMACCP, as envisaged in the preceding section. It also indicates that the errors in the CC bond lengths calculated at the CCSD(T) level of theory are approximately transferable, regardless of the bond order and of the size of the molecule. That the largest differences occur for the least aromatic systems is rather easily understandable, as HOMA is based on the squared differences between bond lengths. The impact of the errors in bond lengths is thus the more severe, the farther a bond length deviates from the optimum value (\( l_{opt} \)), and the larger the BLDs are in a studied molecule.

DFT calculations

Choice of the exchange–correlation functionals

In the Kohn–Sham formulation of the density functional theory (KS-DFT) the computationally demanding direct solution of the electronic Schrödinger equation is replaced by solving a system of equations for non-interacting electrons defined to have the same one-electron density as the true system. Such calculations are much shorter than the traditional direct approach, and thus, the boundaries of applicability of (non-semiempirical) quantum chemical calculations have been moved from several tens to several hundreds of heavy atoms. KS-DFT provides a way to incorporate dynamic electron correlation into the one-electron model (or single-determinant wavefunction), previously characteristic for the Hartree–Fock scheme, in which all the Coulomb correlation of electrons was neglected. However, all the subtleties of the correlated motion of electrons have to be introduced in the KS-DFT through a complicated exchange–correlation (XC) functional, the exact form of which is unknown. Much of modern DFT research is therefore devoted to developing approximations to the XC functional, which are intended to give more and more accurate results. Unfortunately, no single systematic approach for developing the exact functional currently exists, and so hundreds of different functionals have been proposed, leaving the potential user at a loss as to which one would be most suitable for a particular task. An ideal functional would, of course, be well suited to all applications in chemistry and physics. Such functionals, however, are not likely to be discovered in the foreseeable future. Most of the existing functionals are more or less directed toward increased accuracy in a particular field (e.g., main group thermochemistry, barrier heights, or electronic spectroscopy) at the expense of deteriorated performance in calculating other properties.

For our study, we have selected functionals representing each of the levels of approximation (or rungs of the Jacob’s ladder [28]). The SVWN [29, 30] functional was chosen mostly for comparative purposes to emphasize the improvements introduced at the higher rungs. For GGA functionals (the second rung), we selected BLYP [31–33], PBE [34, 35], and HCTH [36–38]. From the third rung (the meta-GGA functionals), we include TPSS [39], τ-HCTH [40], and M06-L [41]. The fourth rung (the hybrid functionals) is most strongly represented, as the functionals here may contain different admixtures of non-local exchange, which significantly modifies their performance. Here, we have chosen the following functionals: TPSSh [39] (10 % of non-local exchange), B97-1 [38] (19 %), B3LYP [30, 32, 42] (21 %), PBE0 [43], PBEh [44], and ωPBEh [45] (25 % each), M06 [46] (27 %), BMK [47] (42 %), BHandHLYP [48] (50 %), M06-2X [46] (54 %), and M06-HF [46] (100 %). We also consider the recently developed range-corrected functionals, for which the admixture of non-local exchange varies with the interelectronic distance r. This feature is intended to improve the incorrect long-distance behavior of the approximate XC functionals. Thus, we have also CAM-B3LYP [49] (19–60 % of non-local exchange), LC-PBE [50] (0–100 %), LC-ωPBE [51] (25–100 %), ωB97 [52] (0–100 %), and ωB97X [52] (19–100 %). Finally, we include two double-hybrid functionals (the fifth rung): B2PLYP [53] and its improved version mPW2PLYP [54], which were reported to provide considerably higher accuracy with respect to BLYP, TPSS, and B3LYP, when tested on the extensive G3 set of molecules [54]. All the DFT calculations have been performed using the popular Dunning DZP basis set [55, 56], and the def2-TZVPP basis set of the Karlsruhe group [57]. The former provides a reasonable compromise between accuracy and computational cost, whereas the latter gives results that for DFT calculations can be regarded as close to the complete basis set limit. The DFT calculations were performed using GAUSSIAN’09 [58]. The selected functionals are listed in Table 3, together with the respective HOMA parameters for the CC bonds obtained with both basis sets chosen for the DFT calculations. These parameters have been used in this study to compute HOMACCP for the test set of molecules. The parameterizations derived in the simplified way (ω = 2) are available in the supplementary material.

Table 3

Selected exchange–correlation functionals and the respective HOMA parameters obtained for both basis sets used in our DFT calculations

| Functional | Rung | X [%] | ω (DZP) | l opt [Å] (DZP) | α (DZP) | ω (def2-TZVPP) | l opt [Å] (def2-TZVPP) | α (def2-TZVPP) |
|---|---|---|---|---|---|---|---|---|
| SVWN | 1 | 0 | 1.561 | 1.3857 | 409.9 | 1.608 | 1.3712 | 374.1 |
| BLYP | 2 | 0 | 1.635 | 1.4019 | 334.6 | 1.689 | 1.3873 | 304.6 |
| PBE | 2 | 0 | 1.608 | 1.3976 | 359.0 | 1.654 | 1.3846 | 329.4 |
| HCTH | 2 | 0 | 1.616 | 1.3906 | 357.9 | 1.663 | 1.3791 | 326.0 |
| TPSS | 3 | 0 | 1.641 | 1.3962 | 328.7 | 1.685 | 1.3835 | 305.9 |
| τ-HCTH | 3 | 0 | 1.625 | 1.3917 | 347.3 | 1.670 | 1.3791 | 320.8 |
| M06-L | 3 | 0 | 1.635 | 1.3855 | 330.8 | 1.677 | 1.3743 | 306.8 |
| TPSSh | 4 (HM) (a) | 10 | 1.666 | 1.3920 | 305.0 | 1.711 | 1.3797 | 284.5 |
| B97-1 | 4 (HG) (b) | 19 | 1.689 | 1.3936 | 281.8 | 1.733 | 1.3811 | 265.3 |
| B3LYP | 4 (HG) | 21 | 1.685 | 1.3914 | 287.5 | 1.737 | 1.3777 | 266.5 |
| PBE0 | 4 (HG) | 25 | 1.675 | 1.3870 | 292.4 | 1.720 | 1.3753 | 273.6 |
| PBEh | 4 (HG) | 25 | 1.675 | 1.3869 | 291.9 | 1.721 | 1.3749 | 273.1 |
| ωPBEh | 4 (HG) | 25 | 1.669 | 1.3871 | 296.9 | 1.713 | 1.3751 | 277.6 |
| M06 | 4 (HM) | 27 | 1.674 | 1.3860 | 288.9 | 1.727 | 1.3719 | 267.3 |
| BMK | 4 (HM) | 42 | 1.739 | 1.3937 | 229.7 | 1.763 | 1.3807 | 226.8 |
| M06-2X | 4 (HM) | 54 | 1.746 | 1.3867 | 240.3 | 1.793 | 1.3753 | 224.9 |
| BHandHLYP | 4 (HG) | 50 | 1.743 | 1.3806 | 240.4 | 1.794 | 1.3683 | 226.6 |
| CAM-B3LYP | 4 (RC) | 19–60 | 1.750 | 1.3860 | 241.3 | 1.804 | 1.3723 | 226.0 |
| ωB97 | 4 (RC) | 0–100 | 1.800 | 1.3886 | 210.3 | 1.846 | 1.3772 | 197.8 |
| ωB97X | 4 (RC) | 10–100 | 1.786 | 1.3868 | 221.6 | 1.837 | 1.3746 | 207.7 |
| LC-ωPBE | 4 (RC) | 25–100 | 1.801 | 1.3835 | 210.9 | 1.851 | 1.3712 | 199.8 |
| LC-PBE | 4 (RC) | 0–100 | 1.784 | 1.3755 | 216.9 | 1.832 | 1.3630 | 206.3 |
| M06-HF | 4 (HM) | 100 | 1.846 | 1.3872 | 186.7 | 1.880 | 1.3757 | 181.9 |
| B2PLYP | 5 | 53 | 1.655 | 1.3927 | 292.7 | 1.701 | 1.3797 | 276.1 |
| mPW2PLYP | 5 | 55 | 1.667 | 1.3904 | 284.2 | 1.714 | 1.3775 | 267.8 |
| HF | | 100 | 1.837 | 1.3788 | 182.7 | 1.893 | 1.3694 | 173.7 |
| MP2 | | 100 | 1.621 | 1.3960 | 305.8 | 1.656 | 1.3829 | 296.3 |

X denotes the content of non-local exchange in the functional

(a) Hyper-meta-GGA functional

(b) Hyper-GGA functional

HOMA for benzene

Before we embark on the statistical analysis of the performance of the DFT functionals for the model compounds, we would like to focus briefly on benzene, for which the high quality experimental equilibrium geometry is available [23]. The equilibrium CC bond length was established to be \( r_{e} \) = 1.391 ± 0.001 Å (the same accuracy as that for butadiene [22]). This value is in excellent agreement with the all core CCSD(T)/cc-pCVQZ value of 1.3918 Å, whereas the frozen-core version of CCSD(T) slightly overestimates the experimental bond length, yielding \( r_{e} \) = 1.3949 Å. Nonetheless, HOMACCP is the same for both versions of CCSD(T): 0.973, owing to the compensation of errors discussed above for butadiene. This value is also practically the same as the HOMA based solely on the experimental equilibrium bond lengths both for benzene and butadiene: 0.970 ± 0.012 (the uncertainty being estimated from the maximum experimental errors for the CC bond lengths in both molecules).
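The error compensation for benzene can be reproduced from the numbers quoted above and in Table 1. The sketch below is illustrative: the function name is ours, we assume that the frozen-core bond length of 1.3949 Å refers to the cc-pVQZ basis set, and the last ("mixed") value is our own arithmetic from rounded inputs rather than a number reported in the text.

```python
def homa_equal_bonds(r_cc, l_opt, alpha):
    """HOMA for a ring with all CC bonds equal: Eq. 1 reduces to 1 - alpha*(r - l_opt)**2."""
    return 1.0 - alpha * (r_cc - l_opt) ** 2


# Consistent approach (HOMACCP): benzene bond length and butadiene-derived
# parameters (l_opt, alpha from Table 1) come from the same level of theory.
homa_all_core = homa_equal_bonds(1.3918, l_opt=1.3818, alpha=272.8)     # all core CCSD(T)/cc-pCVQZ
homa_frozen_core = homa_equal_bonds(1.3949, l_opt=1.3849, alpha=273.0)  # frozen-core, cc-pVQZ assumed

# Mixed approach (HOMAEP style): frozen-core bond length combined with the
# rounded "Exp. new" parameters of Table 1. Illustrative only.
homa_mixed = homa_equal_bonds(1.3949, l_opt=1.381, alpha=278.0)

print(f"HOMACCP, all core:    {homa_all_core:.3f}")
print(f"HOMACCP, frozen-core: {homa_frozen_core:.3f}")
print(f"mixed parameters:     {homa_mixed:.3f}")
```

Both consistent values round to 0.973, as stated in the text, because the overestimation of the benzene bond length in the frozen-core calculation is matched by the same overestimation of the optimum bond length. Combining the frozen-core geometry with experimentally based parameters removes this compensation and lowers the index.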

The accuracy of DFT is considerably lower. The errors due to approximations in the XC functionals and limited basis sets are especially noticeable when the HOMAs are calculated using the experimental parameterizations (HOMAEP). Figure 2, panel a, shows HOMAEP for benzene computed with three sets of the experimentally based parameters: the original one taken from Krygowski [3], and two sets of parameters obtained from the equilibrium bond lengths of butadiene [22] using either the force constant ratio ω = 2 or ω = 1.684 (the all core CCSD(T)/cc-pCVQZ value). A glance at the values of HOMAEP thus obtained makes it obvious that they strongly depend both on the computational parameters (XC functional, basis set) selected for geometry optimization of benzene and on the choice of experimental geometry of the reference molecule. They are also sensitive to whether the simple (ω = 2) or improved (ω = 1.684) parameterization was used. The values of HOMAEP based on the original parameters proposed by Krygowski et al. [3] are generally too high with respect to the HOMA based solely on the experimental equilibrium bond lengths, as well as to the HOMACCP from the CCSD(T) calculations. When computed from geometries optimized with the def2-TZVPP basis set, they closely approach or even exceed unity, going as high as 1.012 for BHLYP, and 1.049 for LC-PBE. The values based on bond lengths optimized with the DZP basis set are slightly lower, varying between 0.827 (BLYP) and 0.998 (LC-PBE). On the other hand, the combination of the DZP geometries and the parameterization based on the equilibrium bond lengths of butadiene [22] leads to extremely low values of HOMA (going down to 0.619 for BLYP). Analogous results based on the def2-TZVPP geometries are much more reasonable, especially when the improved parameterization (derived with ω = 1.682) is used, which also helps to somewhat reduce the dependence of HOMA on the choice of the XC functional. The HOMAs calculated in this way vary between 0.89 (BLYP) and 1.011 (LC-PBE).
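The way a parameterization enters HOMA can be illustrated with a few lines of code. The following is a minimal sketch, assuming the usual HOMA definition, HOMA = 1 − (α/n) Σ (R_opt − R_i)², with R_opt taken as the force-constant-weighted mean of the reference single and double bond lengths and α fixed so that a ring of alternating reference bonds gives HOMA = 0. These formulas are not restated in this section, so they are an assumption here. The function names are hypothetical, and the reference bond lengths of 1.467 and 1.349 Å are the values commonly quoted for the original parameterization, not numbers taken from the text above.

```python
from typing import Sequence, Tuple


def homa_parameters(r_single: float, r_double: float,
                    omega: float = 2.0) -> Tuple[float, float]:
    """Return (r_opt, alpha) from reference bond lengths given in angstroms.

    omega is the double-to-single force constant ratio (assumed definition).
    """
    # Optimum bond length: force-constant-weighted mean of the reference bonds.
    r_opt = (r_single + omega * r_double) / (1.0 + omega)
    # alpha is chosen so that alternating reference bonds give HOMA = 0.
    alpha = 2.0 / ((r_single - r_opt) ** 2 + (r_double - r_opt) ** 2)
    return r_opt, alpha


def homa(bond_lengths: Sequence[float], r_opt: float, alpha: float) -> float:
    """HOMA for one ring from its bond lengths (angstroms)."""
    n = len(bond_lengths)
    return 1.0 - alpha / n * sum((r_opt - r) ** 2 for r in bond_lengths)


# Assumed reference bond lengths (commonly quoted for the original
# parameterization; not taken from this section).
R_SINGLE, R_DOUBLE = 1.467, 1.349

r_opt, alpha = homa_parameters(R_SINGLE, R_DOUBLE, omega=2.0)

# Benzene with the experimental equilibrium CC bond length quoted in the text.
benzene = [1.391] * 6
print(f"r_opt = {r_opt:.4f} A, alpha = {alpha:.1f}, "
      f"HOMA = {homa(benzene, r_opt, alpha):.3f}")
```

Passing a different `omega` or a different pair of reference bond lengths to `homa_parameters` changes both `r_opt` and `alpha`, which is how the choice of parameterization propagates into the HOMAEP values discussed above.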
Fig. 2

HOMA based on DFT bond lengths for benzene as the input data, as calculated with parameters either derived from experimental bond lengths for trans-1,3-butadiene (HOMAEP) or from bond lengths optimized with the same XC functional and basis set as the geometry of benzene (HOMACCP)

The real stabilization of the results, however, is achieved by switching to the parameterizations based on the consistently optimized CC bond lengths of butadiene (the HOMACCP, as displayed in Fig. 2 panel b). First of all, variations of HOMA with respect to the choice of the XC functional are further attenuated. Here, the importance of using the calculated values of the force constant ratios ω must be emphasized, as they show a surprisingly strong dependence on the functional, ranging from 1.608 (SVWN) to 1.880 (M06-HF). Using these values has reduced the HOMA dependence on the functional almost threefold with respect to HOMA calculated in the simplified approach (ω = 2). The sensitivity of the HOMACCP to the size of the basis set is also very small: going from the moderate DZP basis set to the large def2-TZVPP basis set brings about a uniform (for all functionals) lowering of HOMA by less than 0.01. Moreover, the values of HOMACCP calculated in this way are quite accurate regardless of the choice of the functional, the errors being attenuated to within the margin of 0.04 with respect to the experimentally based value of 0.970.

It is also worthwhile to look more closely at the cases of HOMAEP > 1. For benzene, it means that the length of the CC bonds in the ring is smaller than the optimum length of the CC bond for the aromatic system (l_opt). In energy terms, the systems with shorter bonds can be expected to be more stable, or more aromatic. Therefore, for such cases, HOMA was defined to be greater than one [8]. Such a situation was difficult, however, to understand on physical grounds. HOMA > 1 was therefore rather attributed to an imperfect choice of the reference systems, or to inaccuracies of the experimental bond lengths for the studied systems and the reference molecules. In our study, it may also stem from incompatibility of the quantum chemistry results and the experimental data used to obtain the HOMA parameters: when the original parameterization was used, HOMAEP > 1 was observed for six XC functionals (SVWN, M06, BHandHLYP, CAM-B3LYP, LC-PBE, LC-ωPBE), and for HF, combined with the def2-TZVPP basis set. The values of HOMAEP computed with the new experimental parameterization (based on the equilibrium bond lengths) have exceeded unity only for the LC-PBE/def2-TZVPP method. HOMACCP, on the other hand, has not exceeded unity for any of the XC functionals or either of the basis sets used in our study. We may thus conclude that while butadiene itself seems to be an appropriate choice for the source of the HOMA parameters, the peculiarities of HOMAEP being larger than one are brought about by combining the experimental bond lengths for the reference system with the theoretically obtained bond lengths for the studied molecules.

Performance of DFT functionals—statistical analysis

Figures 3, 4 and 5 contain the statistical data for the selected DFT functionals. We have also included the Hartree–Fock results in the analysis, as this method is both computationally inexpensive and provides a reference point for the functionals with a high content of non-local exchange (even though in DFT functionals the non-local exchange is computed using the Kohn–Sham orbitals). Another non-DFT method that we included is MP2, because it offers a considerable increase in accuracy with respect to the HF scheme at a reasonable cost. In fact, it is somewhat less computationally demanding than the double-hybrid functionals (B2PLYP and mPW2PLYP). It is thus prudent to compare the accuracy of MP2 with that of DFT. The mean signed errors (MSE), the mean absolute errors (MAE), and the maximum absolute errors (MaxAE) have been calculated with respect to the benchmark values of HOMA, collected in Table 2. Three sets of data have been analyzed, corresponding to three choices of parameterization: (a) the original parameterization of Krygowski et al.; (b) the parameterization based on the new experimental equilibrium bond lengths of butadiene and the force constant ratio ω = 1.682, as calculated at the all core CCSD(T)/cc-pCVQZ level of theory; (c) the parameters derived from the consistently calculated bond lengths and force constant ratios ω for trans-1,3-butadiene (Table 3). The results obtained using the original parameters seem to confirm the conclusions made in the case of benzene. The values of HOMA are overestimated for most of the functionals, and the errors are significantly larger for the larger basis set. These findings are not surprising, as the original parameterization is based on the bond lengths from the electron diffraction experiment (r_a), and not on the equilibrium geometry of butadiene (r_e). It appears that quantum chemical results should not be used in combination with these parameters. Using the experimental equilibrium bond lengths [22] as the source for the HOMA parameters has led to a reduction of the errors of HOMAEP, which are no longer systematically too high. They are generally a little too low if based on the DZP geometries. For HOMAEP calculated using the def2-TZVPP geometries, however, no systematic errors can be observed: the MSE of HOMAEP are randomly positive or negative, while the absolute deviations for most functionals are noticeably smaller in comparison with the DZP results.
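The three error measures used in this analysis are simple to compute once the HOMA values from a given method and the benchmark values are available as lists. The sketch below is only an illustration: the function name is hypothetical and the numbers are made-up placeholders, not entries from Table 2.

```python
from typing import Dict, Sequence


def error_statistics(computed: Sequence[float],
                     benchmark: Sequence[float]) -> Dict[str, float]:
    """Return MSE, MAE and MaxAE of computed values against a benchmark."""
    if len(computed) != len(benchmark) or len(computed) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    # Signed error: positive means the method overestimates HOMA.
    errors = [c - b for c, b in zip(computed, benchmark)]
    n = len(errors)
    return {
        "MSE": sum(errors) / n,                     # mean signed error
        "MAE": sum(abs(e) for e in errors) / n,     # mean absolute error
        "MaxAE": max(abs(e) for e in errors),       # maximum absolute error
    }


# Hypothetical HOMA values for three molecules, for illustration only.
benchmark_homa = [0.97, 0.50, -0.10]
computed_homa = [0.99, 0.44, -0.07]

stats = error_statistics(computed_homa, benchmark_homa)
for name, value in stats.items():
    print(f"{name}: {value:+.3f}")
```

A near-zero MSE together with a sizeable MAE indicates errors of both signs that cancel on average, which is the situation described above for HOMAEP based on the def2-TZVPP geometries.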
Fig. 3

HOMAEP based on the original parameterization from Ref. [3]. Statistical analysis of DFT performance for the selected XC functionals and basis sets

Fig. 4

HOMAEP based on the new experimental parameterization (equilibrium bond lengths from Ref. [22]). Statistical analysis of DFT performance for the selected XC functionals and basis sets

Fig. 5

HOMACCP (parameters based on the optimized geometries of the reference molecule). Statistical analysis of DFT performance for the selected XC functionals and basis sets

The HF values of HOMA are substantially underestimated (too low aromaticities), which is in accordance with the well-known tendency of the HF method to over-localize the π-electrons and thus yield too high bond length differences (BLD) and too low polarizabilities [59–61]. On the other hand, the MP2 values of HOMA are too high, which again corresponds to the overestimated delocalization of the π-electrons frequently observed for this method, resulting in too low BLDs and overestimated polarizabilities, especially in the extended π-conjugated systems (e.g., oligoenes, oligothiophenes) [59–61]. Out of the DFT functionals, only the local functional (SVWN) and M06-HF yielded worse results than HF. All the other functionals have outperformed HF by far, and are also better than, or at least comparable with, MP2.

The best functionals are TPSSh, B3LYP, BHandHLYP, CAM-B3LYP, and the two fifth rung functionals (B2PLYP and mPW2PLYP). For all of them, the maximum errors do not exceed 0.15, the mean absolute errors are less than 0.05, and the mean signed errors are in the range of –0.035 to 0.035. It appears that for good performance in geometry optimization the functional has to contain a moderate to medium content of non-local exchange.

Interestingly, out of the functionals from the first three rungs, the best performance (MSE ≈0, relatively low values of MAE and MaxAE) has been observed for PBE and TPSS, the two functionals that were created using the exact constraint satisfaction method, without any empirical fitting procedure [62].

Further improvement of the DFT results has been achieved using the consistently calculated parameters (listed in Table 3), which leads to the HOMACCP values. A distinct trend can be observed here, much as in the case of benzene. The non-hybrid (local, GGA, and meta-GGA) functionals yield the lowest values of HOMA and generally underestimate the aromaticity (MSE <0). This systematic error is reduced when some admixture of the non-local exchange appears in the functional. The best performance is observed for the PBE hybrids and the M06 functional (25 and 27 % of the non-local exchange, respectively). Further increase of the non-local contribution to exchange brings about an increase of HOMA, leading to positive values of MSE, and to elevated values of MAE. This trend does not hold for the double-hybrid functionals, however, owing to the presence of the non-local correlation component, which reduces the errors associated with the high content (over 50 %) of non-local exchange. Nonetheless, for all of the DFT functionals studied here, and the two ab initio methods included in the analysis, the values of MAE are lower than 0.13. Note that for HOMAEP (obtained using the new experimental parameters), the MAE of 0.13 was exceeded for six DFT functionals, as well as for HF and MP2. The effect of favorable compensation of errors is thus evident.

This is also the reason for the much reduced sensitivity of the HOMACCP to the size of the basis set. The results obtained from geometries optimized with the moderate DZP basis set are identical, to within a few percent, to the results based on geometries optimized with the far better, larger, and more computationally demanding def2-TZVPP basis set. The changes are moderate even for the double-hybrid functionals, which are potentially the most sensitive to the size of the basis set, as they not only require the occupied orbitals (in the exchange part), but also make use of all the virtual orbitals (in the correlation part).

Detailed analysis of the errors for HOMACCP is not as straightforward as for HOMAEP, since they originate from differences in the performance of a given theoretical method for the studied molecules and the reference ones. In the ideal case, in which the errors in bond lengths are independent of the bond order and the size of the molecule (which would result in exact BLDs, even if l_ave were imperfect), the values of HOMACCP would be completely free from the errors of the quantum chemical treatment. The CCSD(T)/cc-pCVQ(T)Z results are close to fitting that picture. DFT geometries, however, satisfy neither of the above conditions: the BLD is usually underestimated, and the deviations depend on the size of the π-conjugated system. As a result, the compensation of errors in HOMACCP based on DFT geometries is incomplete and functional dependent.
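The ideal case described above can be checked numerically. The sketch below assumes the usual HOMA definition (R_opt as the force-constant-weighted mean of the reference bond lengths, α normalized so that alternating reference bonds give HOMA = 0), which is not restated in this section. All bond lengths, the value of ω, and the function name are hypothetical and serve only to illustrate the argument.

```python
from typing import Sequence


def homa_ccp(ring: Sequence[float], ref_single: float,
             ref_double: float, omega: float) -> float:
    """HOMA of a ring with parameters derived from reference bond lengths."""
    r_opt = (ref_single + omega * ref_double) / (1.0 + omega)
    alpha = 2.0 / ((ref_single - r_opt) ** 2 + (ref_double - r_opt) ** 2)
    return 1.0 - alpha / len(ring) * sum((r_opt - r) ** 2 for r in ring)


# Hypothetical "exact" geometries in angstroms, for illustration only.
ring = [1.36, 1.43, 1.36, 1.43, 1.36, 1.43]
ref_single, ref_double, omega = 1.46, 1.34, 1.7

exact = homa_ccp(ring, ref_single, ref_double, omega)

# Case 1: the method makes the same error in every bond of both molecules.
delta = 0.01
uniform = homa_ccp([r + delta for r in ring],
                   ref_single + delta, ref_double + delta, omega)

# Case 2: the error depends on the bond order and on the molecule; here the
# bond length difference is underestimated in the reference molecule only.
nonuniform = homa_ccp(ring, ref_single - 0.005, ref_double + 0.005, omega)

print(f"exact:      {exact:.4f}")
print(f"uniform:    {uniform:.4f}")
print(f"nonuniform: {nonuniform:.4f}")
```

A uniform shift leaves every difference R_opt − R_i and the normalization α unchanged, so the result coincides with the exact one. Once the error differs between the reference molecule and the studied ring, the cancellation is no longer complete.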

Conclusions

In our study, we have investigated the sensitivity of HOMA to the choice of computational methods used for optimizing molecular geometries. The values of HOMA have been computed using either the experimentally based parameterizations (HOMAEP)—the original one of Krygowski et al. [3] and a new one based on the recently reported equilibrium geometry of the trans-1,3-butadiene [22]—or using the parameters derived from geometry of the trans-1,3-butadiene optimized in the same way as the studied molecules (HOMACCP).

We have found that the consistent approach strongly reduces the dependence of HOMA on the choice of computational method and basis set used for geometry optimization. The compensation of errors has been particularly good for the CCSD(T) method, for which the values of HOMACCP can be regarded as nearly error-free. For DFT calculations, the error cancelation is not so perfect, as the errors in the CC bond lengths of the unsaturated hydrocarbons (and especially the bond length difference between the nominally single and double bonds) depend on the size of the π-conjugated system in a way that is unique for every XC functional. Consequently, the errors of HOMACCP are still functional dependent. In particular, the MSE changes from negative to positive values proportionally to the content of the non-local exchange in the hybrid functionals. The absolute errors are nevertheless small: even though PBE0 has been found to perform better than other functionals in computing HOMACCP (MAE < 0.04, MaxAE < 0.1), the MAE is below 0.05 for a wide range of the XC functionals with small to medium admixture of non-local exchange (from TPSSh to CAM-B3LYP). This observation is of practical importance as it facilitates direct comparisons of HOMACCP obtained with different XC functionals that belong to this group. Moreover, since the errors seem to depend mostly on the content of exact exchange, one may speculate that any hybrid functional containing between 20 and 50 % of exact exchange should yield rather accurate values of HOMACCP. Another advantage of the consistent approach is a very strong reduction of HOMA sensitivity to the choice of the basis set. The values of HOMACCP obtained using the DZP basis set are comparable in accuracy to their counterparts based on much more computationally demanding calculations with the def2-TZVPP basis set.

Using the experimentally based parameters has resulted in considerable variations of the HOMAEP values, depending strongly on the choice of both the XC functional and the basis set used for geometry optimizations of the studied molecules. In addition, the HOMAEP are necessarily dependent on the selection of experimental data concerning the geometry of 1,3-butadiene. We have shown that the original parameterization, successfully used for computing HOMA based on the crystallographic data, is rather ill suited for use in combination with the quantum chemical results. Not only are the HOMAEP values obtained with this parameterization considerably overestimated for nearly all of the studied XC functionals, but the errors are also larger for the larger basis set (def2-TZVPP). These systematic, positive errors of HOMAEP have been eliminated using the parameterization based on the experimental equilibrium geometry of the reference system, which is by definition directly comparable with the quantum chemistry results. Using the new parameterization brought about a considerable reduction of the errors of HOMAEP, especially for the results obtained with the def2-TZVPP geometries. Several hybrid functionals (TPSSh, B3LYP, BHandHLYP, CAM-B3LYP) and both double-hybrid ones (B2PLYP, mPW2PLYP) have yielded MAE below 0.05. The errors, however, have increased twofold or more when the triple-ζ basis set has been replaced by the DZP one. From among the GGA (and meta-GGA) functionals, PBE and TPSS showed the best performance, with errors only slightly exceeding those for the hybrid and double-hybrid functionals.

In view of the above findings, we suggest using the HOMACCP when the input geometries are to be obtained by means of quantum chemistry calculations. We are aware that aromaticity is not a simple, rigorously quantifiable property. On the other hand, the increased consistency and comparability of results within the framework of one aromaticity index, achieved through using the HOMACCP, is a desirable quality. For convenience, we have included the ready-to-use parameters for the CC bonds for all the studied XC functionals and the two basis sets. In the following paper, the analogous sets of parameters will be given for other bonds frequently encountered in organic systems (CN, CO, NN, CP, CS, NO).

Acknowledgments

This research was supported in part by PL-Grid Infrastructure. The calculations were performed on Zeus: HP Cluster Platform of the Academic Computer Centre CYFRONET and on Supernova Cluster of the Wroclaw Centre for Networking and Supercomputing.

Open Access

This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

Supplementary material

Supplementary material 1 (DOC 85 kb)

Copyright information

© The Author(s) 2012