Introduction

Over the past 30 years, NDDO-type [1, 2] semiempirical methods have evolved steadily. The earliest of these methods was MNDO [3, 4], which itself was a major advance over even earlier non-NDDO methods such as MINDO/3 [5]. The main advantage of MNDO over earlier methods was that the values of the parameters were optimized to reproduce molecular rather than atomic properties. When it first appeared, MNDO was immediately popular because of its increased accuracy, but, with the passage of time, various limitations were found, among the most important of which was the almost total absence of a hydrogen bond. As hydrogen bonding is essential to life, this particular fault essentially precluded MNDO being used in modeling biochemistry.

In 1985 an attempt, AM1 [6], was made to improve MNDO by adding a stabilizing Gaussian function to the core-core interaction to represent the hydrogen bond. Despite the fact that this was an over-simplification of a very complicated phenomenon, the overall effect was similar, and for the first time NDDO methods gave a good, albeit limited, model of hydrogen bonding.

In the course of the next several years, improvements were made to the method of parameter optimization. The result of this was the PM3 method [7–10], which culminated in the parameterization of all the elements in the main group in 2004 [11]. At the same time, various changes to the original set of approximations used in MNDO were proposed, the most important of which were the addition of d-orbitals to main-group elements [12, 13] and the introduction of diatomic parameters. Work started on the transition metals, and parameters for some of these have been reported [14, 15]. More recently, parameter sets tailored to reproduce specific phenomena such as the binding energy of nucleic acid base pairs [16], iron complex catalyzed hydrogen abstraction [17], phosphatase-catalyzed reaction barriers [18], and the redox properties of iron containing proteins [19] have been developed.

Because of the way advances in NDDO developments occurred, in terms of the modifications of the approximations and the extensions to specific elements or groups of elements, there has been an inevitable lack of consistency. The aim of the current work was three-fold: to investigate the incorporation of some of the reported modifications to the core-core approximations into the NDDO methodology; to carry out a systematic global parameter optimization of all the main group elements, with emphasis on compounds of interest in biochemistry; and to extend the methodology by performing a restricted optimization of parameters for the transition metals. This resulted in the development of a new method, consisting of the final set of approximations used and the optimized parameters. This method will be referred to as parametric method number 6, or PM6. The name PM6 was chosen to avoid any confusion with two other unpublished methods, PM4 and PM5.

Theory

Despite the apparent complexity of semiempirical methods, there are only three possible sources of error: reference data may be inaccurate or inadequate, the set of approximations may include unrealistic assumptions or be too inflexible, and the parameter optimization process may be incomplete. In order for a method to be accurate, all three potential sources of error must be carefully examined, and, where faults are found, appropriate corrective action taken.

Reference data

In contrast to earlier methods, in which reference data were assembled by painstakingly searching the original literature, the current work relies heavily on the large compendia of data that have been developed in recent years. The most important of these are the WebBook [20], for thermochemistry, and the Cambridge Structural Database [21] (CSD), for molecular geometries.

During the early stages of the current work, consistency checks were performed to ensure that erroneous data were not used. These checks revealed many cases in which the calculated heats of formation were inconsistent with the reference heats of formation reported in the NIST database. On further checking, many of these reference data were also found [22, 23] to be inconsistent with other data in the WebBook. In those cases where there was strong evidence of error in the reference data, the offending data were deleted, and the WebBook updated [24].

For molecular geometries, gas phase reference data are preferred, but in many instances such data were unavailable, and recourse was made to condensed-phase data. Provided that care was taken to exclude those species whose geometries were likely to be significantly distorted by crystal forces, or which carried a large formal charge, condensed-phase data of the type found in the CSD were regarded as being suitable as reference data.

Because earlier methods used only a limited number of reference data, most of the cases where the method gave bad results were not discovered until after the method was published. In an attempt to minimize the occurrence of such unpleasant surprises, the set of reference data used was made as large as practical. To this end, where there was a dearth or even a complete absence of experimental reference data, recourse was made to high level calculations. Thus, for the Group VIII elements, there are relatively few stable compounds, and the main phenomena of interest involve rare gas atoms colliding with other atoms or molecules, so reference data representing the mechanics of rare gas atoms colliding with other atoms was generated from the results of ab-initio calculations. Additionally, there is an almost complete lack of thermochemical data for many types of complexes involving transition metals, so augmenting what little data there was with the results of ab-initio calculations was essential.

Use of Ab-Initio results

Ab-initio calculations provide a convenient source of reference data; for this work, extensive use has been made of results of Hartree-Fock and B3LYP density functional [25, 26] methods (DFT), both with the 6-31G(d) basis set for elements in the periodic table up to argon. For systems involving heavier elements, the B88-PW91 functional [27, 28] was used with the DZVP basis set. Within the spectrum of ab-initio methods these methods are not particularly accurate; many methods with larger basis sets and with post-Hartree-Fock corrections are more accurate. However, the methods used in this work were chosen because they were regarded as robust, practical methods, allowing many systems to be modeled in a reasonable amount of time, a condition that could not be achieved with the more sophisticated ab-initio methods.

Procedure used in deriving ΔHf

Reference heats of formation, ΔHf, for compounds and ions of elements for which there was a paucity of data were derived from DFT total energies in two stages. In the first stage, a basic set of ∼1,400 well-behaved compounds, for which reliable reference values of experimental ΔHf were available, was assembled. Only compounds containing one or more of the elements H, C, N, O, F, P, S, Cl, Br, and I were used. For this set, a root-mean-square fit was made to the reference ΔHf using the calculated total energies, \(E_{\text{Tot}}\), and the atom counts. Thus, the error function, S, in Eq. (1) was minimized.

$$ S = \sum\limits_j \left( \Delta H_{j}(\text{Ref.}) - 627.51\left( E_{\text{Tot}} + \sum\limits_i C_{i} n_{i} \right) \right)^{2}_{j} $$
(1)

In this expression, the \(C_i\) are constants for each atom of type i, and the \(n_i\) are the number of atoms of that type.
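Because Eq. (1) is linear in the \(C_i\), the first-stage fit can be sketched as an ordinary linear least-squares problem. The following is a minimal illustration only, not the code used in this work; the function names and the array layout are assumptions made for the example.

```python
import numpy as np

HARTREE_TO_KCAL = 627.51  # conversion factor appearing in Eq. (1)


def fit_atom_constants(dhf_ref, e_tot, atom_counts):
    """Least-squares estimate of the per-element constants C_i of Eq. (1).

    dhf_ref     : (n_compounds,) reference heats of formation, kcal/mol
    e_tot       : (n_compounds,) calculated total energies, hartree
    atom_counts : (n_compounds, n_elements) atom counts n_i per compound
    """
    dhf_ref = np.asarray(dhf_ref, dtype=float)
    e_tot = np.asarray(e_tot, dtype=float)
    counts = np.asarray(atom_counts, dtype=float)
    # A zero residual in Eq. (1) requires
    #   sum_i C_i n_i = dhf_ref / 627.51 - e_tot
    target = dhf_ref / HARTREE_TO_KCAL - e_tot
    constants, *_ = np.linalg.lstsq(counts, target, rcond=None)
    return constants


def stage_one_error(dhf_ref, e_tot, atom_counts, constants):
    """Value of S in Eq. (1) for a given set of constants."""
    counts = np.asarray(atom_counts, dtype=float)
    predicted = HARTREE_TO_KCAL * (np.asarray(e_tot, dtype=float) + counts @ constants)
    return float(np.sum((np.asarray(dhf_ref, dtype=float) - predicted) ** 2))
```

Dividing the reference values by 627.51 before fitting only rescales S by a constant, so the minimizing constants are unchanged.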

In the second stage, the contribution to the total energy of compounds containing element X arising from the elements in the first stage was removed using the coefficients from Eq. (1). A second RMS fit was then performed. In this, the function minimized, S, was the RMS difference between the reference ΔHf of the compounds containing X and the values predicted from the DFT energy, Eq. (2).

$$ S = \sum\limits_j \left( \Delta H_{j}(\text{Ref.}) - 627.51\left( E_{\text{Tot}} + \sum\limits_i C_{i} n_{i} + C_{x} n_{x} \right) \right)^{2}_{j} $$
(2)

In this expression, the only unknown is the multiplier coefficient \(C_x\). After solving for \(C_x\), the ΔHf of any compound of X could then be predicted as soon as its DFT total energy was evaluated.
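With a single unknown, the minimum of Eq. (2) has a closed form. The sketch below shows one way this could be evaluated and then used for prediction; the function names and argument layout are hypothetical, chosen only for illustration.

```python
HARTREE_TO_KCAL = 627.51  # conversion factor appearing in Eq. (2)


def fit_element_constant(dhf_ref, e_tot, known_terms, n_x):
    """Closed-form least-squares value of C_x in Eq. (2).

    known_terms[j] is sum_i C_i n_i for compound j, evaluated with the
    stage-one constants; n_x[j] is the number of atoms of element X.
    """
    numerator = 0.0
    denominator = 0.0
    for dh, e, known, n in zip(dhf_ref, e_tot, known_terms, n_x):
        # Part of the energy (hartree) that C_x * n_x must account for
        remainder = dh / HARTREE_TO_KCAL - e - known
        numerator += n * remainder
        denominator += n * n
    if denominator == 0.0:
        raise ValueError("no compound in the set contains element X")
    return numerator / denominator


def predict_dhf(e_tot, known_term, n_x, c_x):
    """Predicted heat of formation (kcal/mol) from a DFT total energy."""
    return HARTREE_TO_KCAL * (e_tot + known_term + c_x * n_x)
```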

Training set reference data

The training set of reference data used was considerably larger than that used in parameterizing PM3 [7, 8], where approximately 800 discrete species were used. In optimizing the parameters for PM6, somewhat over 9,000 separate species were used, of which about 7,500 were well-behaved stable molecules. The remainder consisted of reference data that were tailored to help define the values of individual parameters or sets of parameters.

Use of rules in parameter optimization

Most reference data can be expressed as simple facts. Indeed, all the earlier NDDO methods were parameterized using precisely four types of reference data: ΔHf, molecular geometries, dipole moments, and ionization potentials. During the development of PM6, however, the use of other types of reference data was found to be necessary. Because of their behavior, these new data are best described as “rules.” In this context, a rule can therefore be regarded as a reference datum that is a function of one or more other data. To illustrate the use of a rule, consider the binding energy of a hydrogen bond in the water dimer. By default, the weighting factor for ΔHf for normal compounds is 1.0 kcal mol−1. With this weighting factor, average unsigned errors in the predicted ΔHf of the order of 3–5 kcal mol−1 would be acceptable, particularly as the spectrum of values of ΔHf spans several hundreds of kilocalories per mole. However, the binding energy of a hydrogen bond in a water dimer is only 5 kcal mol−1. To have an average unsigned error (AUE) of 4 kcal mol−1 in the prediction of hydrogen bond energies would render such a method almost useless for modeling such phenomena.

One way to increase the importance of the hydrogen bond in water would be to increase the weight for the ΔHf of the water molecule, −57.8 kcal mol−1, and the water dimer system, ca. −120.6 kcal mol−1. While this would have the intended effect of increasing the weight of the hydrogen bond energy, it would also have the undesired effect of increasing the weight of the ΔHf of water.

An alternative would be to express the ΔHf of the water dimer in terms of the ΔHf of two individual water molecules. The difference between the two ΔHf, that of water dimer and that of two isolated water molecules, would be the energy of the hydrogen bond. If the weight assigned to this quantity were then increased, it would increase the weight for the hydrogen bond energy without also increasing the weight for the ΔHf of water. Such a reference datum is referred to here as a rule. That is, rules relate the ΔHf of a moiety to that of one or more other moieties. Thus, in the above example, the simple reference datum H, representing the ΔHf of an isolated water molecule, could be expressed as:

$$ H = -57.8 $$

Using a rule-based reference datum to represent the strength of the hydrogen bond, and giving a weight of 10 to the hydrogen bond energy, the ΔHf of the water dimer would then be defined as

$$ H = 10\left( -5 + H_{\text{H}_2\text{O}} + H_{\text{H}_2\text{O}} \right) $$

In this expression, \(H_{\text{H}_2\text{O}}\) was the calculated ΔHf, in kcal mol−1, of an isolated water molecule. This rule could be interpreted as “The calculated strength of the hydrogen bond formed when two water molecules form the dimer should be 5 kcal mol−1, and the importance should be 10 times that of ordinary heats of formation.”
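One possible reading of this rule, sketched in code, is a weighted residual whose target is built from the calculated monomer value rather than from a fixed number. The function name and the interpretation of the weight as a multiplier on the residual are assumptions made for illustration.

```python
def water_dimer_rule_residual(dhf_dimer_calc, dhf_water_calc,
                              bond_energy=5.0, weight=10.0):
    """Weighted residual for the hydrogen-bond rule (values in kcal/mol).

    The target for the dimer is defined relative to the *calculated*
    monomer, so an error in the monomer heat of formation is not
    penalized by this rule; only the bond energy is.
    """
    target = -bond_energy + 2.0 * dhf_water_calc
    return weight * (dhf_dimer_calc - target)
```

For example, if a trial parameter set gave −55.0 for water and −113.0 for the dimer, the calculated bond energy would be 3 kcal mol−1 and the residual would be 20, regardless of the 2.8 kcal mol−1 error in the monomer.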

Rules are very useful in defining the parameter hypersurface. Examples of such tailoring are as follows:

Correcting qualitatively incorrect predictions

During the parameterization of transition metals, some systems were predicted to have qualitatively the wrong structure. For example, [CuIICl4]2− was initially predicted to have a tetrahedral structure, instead of the D2d geometry observed. To induce the parameters to change so as to make the D2d geometry more stable than the Td geometry, a rule was added to the set of reference data for copper compounds. This rule was constructed using the results of B3LYP calculations on [CuIICl4]2−. First, the total energies of the optimized B3LYP structure and that of the structure resulting from the semiempirical calculation were evaluated. The difference between these energies was then used in constructing the rule. In this case, the rule was that “The ΔHf of the geometry predicted by the faulty semiempirical method should be n.n kcal mol−1 more than that of the B3LYP geometry.” When such a rule was included in the parameter optimization, with an appropriate large weight, any tendency of the parameters to predict the incorrect geometry resulted in a large contribution to the error function. That is, with the new rule in place, there was a strong disincentive to prediction of the incorrect structure. Usually one rule was sufficient to correct most qualitative errors, but for a few complicated structures more than one rule was needed. The commonest need for multiple rules occurred when, initially, one rule was used to correct a faulty prediction and, after re-optimizing the parameters, the geometry optimized to a new structure that was distinctly different from either the correct structure or the incorrect structure covered by the rule. When that happened, the procedure just described was repeated, and a new rule added to the set of reference data to address the new incorrect structure. In extreme cases, several such rules might be needed, each one defining a geometry that was incorrect and should therefore be avoided.

Rare gas atoms at sub-equilibrium distances

For some elements, specifically those of Group VIII, there is an understandable shortage of useful experimental reference data. In addition, most simulations involving these elements are likely to involve a rare-gas atom dynamically interacting with another atom or with a molecule at distances significantly less than the equilibrium distance. This makes determining the potential energy surface at sub-equilibrium distances important. As with hydrogen bond energies, the energies involved in this domain are likely to be in the order of a few kcal mol−1. The shape of the potential energy surface (PES) can readily be mapped using DFT methods. By selecting two or three representative points on this PES, reference data rules can be constructed that describe the mechanical properties of the interactions. As with hydrogen bonding, a large weight can be assigned to these rules.

Use of rules to restrain parameter values

In general, uncharged atoms that are separated by a distance sufficiently large that all overlaps between orbitals on the two atoms are vanishingly small will not interact significantly, and what interaction energy exists would arise from van der Waals terms: of their nature, these are mildly stabilizing. Although statements of this type are obviously true, when they are expressed as rules and added to the training set of reference data they can help define the parameter values. For a pair of atoms, A and B, a simple diatomic system would be constructed in which the interatomic separation was the minimum distance at which any overlaps of the atomic orbitals would still be insignificant. The electronic state of such a system would then be the sum of the states of the two isolated atoms. Thus, if both A and B were silicon, then, since the ground state of an isolated silicon atom is a triplet, the combined state would be a quintet. Because the two atoms do not interact significantly, a rule could then be constructed that said “The energy of the diatomic system is equal to the sum of the energies of the two individual systems.” By giving this rule a large weight, any tendency of the method to generate a spurious attraction or repulsion between the atoms would be prevented.

Atomic energy levels

In keeping with the philosophy that a large amount of reference data should be used in the parameter optimization, spin-free atomic energy levels were used for most elements. The exceptions were carbon, nitrogen, and oxygen, where there were enough conventional reference data that the addition of atomic energy levels would not significantly improve the definition of the parameter surface.

NDDO approximations do not allow for spin-orbit coupling. Therefore, spin-free levels were needed. For a few elements, there were insufficient spin states to allow the spin-free energy levels to be calculated. For all the remaining elements, spin-free energy levels were calculated.

In Moore’s compendia [29–31] of atomic energy levels, observed emission spectra were used in determining the energy levels of the various states of neutral and ionized atoms. Most of these energy levels were characterized by three quantum numbers: the spin and orbital angular momenta, and the “J” or spin-orbit quantum number. The starting point for determining the spin-free atomic energy levels for a given element consisted of identifying each complete manifold of atomic energy levels for that element, that is, each set of levels split by spin-orbit coupling. If all members of the set were present, i.e., all energy levels from L+S to |L−S|, then the weighted barycenter of energy could be calculated. The spin-free energy level, E, was derived from the spin-split levels E(S,L,J) using Eq. (3).

$$ E = \frac{1} {{{\left( {2S + 1} \right)}{\left( {2L + 1} \right)}}}{\sum\limits_{J = {\left| {L - S} \right|}}^{L + S} {{\left( {2J + 1} \right)}E{\left( {S,L,J} \right)}} } $$
(3)
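Eq. (3) is a degeneracy-weighted average over the J levels of one manifold. A minimal sketch follows; the function name and the use of a dictionary keyed by J are assumptions for the example, and the function refuses to average an incomplete manifold, as required above.

```python
from fractions import Fraction


def spin_free_level(S, L, levels):
    """Barycenter of a spin-orbit-split manifold, Eq. (3).

    S, L   : total spin and orbital angular momentum quantum numbers
    levels : dict mapping J to the energy E(S, L, J)
    """
    S, L = Fraction(S), Fraction(L)
    j = abs(L - S)
    weighted_sum = 0.0
    while j <= L + S:
        if j not in levels:
            raise KeyError(f"manifold incomplete: no level for J = {j}")
        # Each J level is (2J + 1)-fold degenerate
        weighted_sum += float(2 * j + 1) * levels[j]
        j += 1
    return weighted_sum / float((2 * S + 1) * (2 * L + 1))
```

With made-up, evenly spaced energies 0, 1, 2, 3, 4 for J = 0 to 4 of a quintet D manifold (S = 2, L = 2), the weighted sum is 70 and the total degeneracy is 25, giving a barycenter of 2.8 rather than the unweighted mean of 2.0.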

In those cases where the ground state of an atom was itself a member of a spin-split manifold, the barycenter of the ground state manifold was calculated and used in re-defining the spin-free ground state. For all elements except tungsten, this change in definition was benign. There is a \(^7S_3\) level present in tungsten that is located only 8.4 kcal mol−1 above the ground state. This puts it inside the \(^5D_J\) manifold, which has a barycenter at 12.7 kcal mol−1. The effect of this was that, on going from a spin-split to a spin-free ground state, the ground state changed from \(6s^2 5d^4\) or \(^5D\) to \(6s^1 5d^5\) or \(^7S\), and the \(^5D\) state now became an excited state with an energy of 4.3 kcal mol−1. To allow for this, a corresponding change was made to the ground state configuration in the PM6 definition of tungsten.

Where there were relatively few other reference data, the singly-ionized, and, in rare cases, the doubly-ionized, spin-free states were also evaluated and used as reference data.

Each energy level contributed one reference datum to the training set. Most atoms have a large number of atomic energy levels, so in order to minimize the probability that a level might be incorrectly assigned, each level was labeled with three quantum numbers: the total spin momentum, the total angular momentum, and the principal quantum number for these two quantum numbers. These were compared with the corresponding values calculated from the state functions. Since each set of three quantum numbers is unique, the potential for misassignment was minimized. In rare cases, particularly during the early stages of parameter optimization, two states with the same total spin and angular quantum numbers would be interchanged, with the result that the calculated principal quantum number would also be interchanged. All such cases involved the ground state, and were quickly identified and corrected.

Approximations

Most of the approximations used in PM6 are identical to those in AM1 and PM3. The differences are:

Core-core interactions

In the original MNDO set of approximations, two changes were made to the simple point-charge expression for the core-core repulsion term. Beyond about five Ångstroms, there should be no significant interaction of two neutral atoms. However, in MNDO, the two-electron, two-center \(\langle s_{A} s_{A} | s_{B} s_{B} \rangle\) integrals and the electron-core interactions do not converge to the exact point charge expression; instead, they are always slightly smaller. To prevent there being a small net repulsion between two uncharged atoms, the core-core expression is modified by the exact \(1/R_{AB}\) term being replaced by the term used in the \(\langle s_{A} s_{A} | s_{B} s_{B} \rangle\) integrals. An additional term is needed to represent the increased core-core repulsion at small distances due to the unpolarizable core. These two changes can be expressed as the MNDO core-core repulsion term as shown in Eq. (4).

$$E_{n} {\left( {A,B} \right)} = Z_{A} Z_{B} \left\langle {\left. {s_{A} s_{A} } \right|\left. {s_{B} s_{B} } \right\rangle } \right.{\left( {1 + e^{{ - \alpha _{A} R_{{AB}} }} + e^{{ - \alpha _{B} R_{{AB}} }} } \right)}$$
(4)

This approximation works well for most main-group elements, but when molybdenum was being parameterized, Voityuk [14] found that the errors in heats of formation and geometries were unacceptably large, and good results were achieved only when a diatomic term was added to the core-core approximation, as shown in Eq. (5).

$$E_{n} {\left( {A,B} \right)} = Z_{A} Z_{B} \left\langle {\left. {s_{A} s_{A} } \right|\left. {s_{B} s_{B} } \right\rangle } \right.{\left( {1 + x_{{AB}} e^{{ - \alpha _{{AB}} R_{{AB}} }} } \right)}$$
(5)

When PM3 parameters for elements of Group IA were being optimized, the MNDO approximation to the core-core expression was found to be unsuitable. In these elements there is only one valence electron so the core charge is the same as that of hydrogen. A consequence of this was that the apparent size of these elements was also approximately that of a hydrogen atom, in marked contrast with observation. For these elements, diatomic core-core parameters were also found to be essential.

Further examination showed that when diatomic parameters were used, there was always an increase in accuracy; therefore, in the current work, Eq. (4) was replaced systematically by Eq. (5).

As the interatomic separation increased, Voityuk’s equation converged to the exact point-charge interaction, as expected. However, for rare gas interactions, an increase in accuracy was found when the rate of convergence was increased by the addition of a small perturbation. Subsequently, the perturbed function was found to be generally beneficial. Because of this, the general form of the core-core interaction used in PM6 is that given in Eq. (6).

$$E_{n} {\left( {A,B} \right)} = Z_{A} Z_{B} \left\langle {\left. {s_{A} s_{A} } \right|\left. {s_{B} s_{B} } \right\rangle } \right.{\left( {1 + x_{{AB}} e^{{ - \alpha _{{AB}} {\left( {R_{{AB}} + 0.0003R^{6}_{{AB}} } \right)}}} } \right)}$$
(6)

At normal chemical bonding distances, Eqs. (5) and (6) have essentially similar behavior, but at distances of greater than about 3 Å the effect of the perturbation is to make the PM6 function significantly smaller than the Voityuk approximation.
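The effect of the perturbation can be seen by evaluating the bracketed scaling factors of Eqs. (5) and (6) side by side. In this sketch the function names are hypothetical, distances are assumed to be in Ångstroms, and any values passed for \(x_{AB}\) and \(\alpha_{AB}\) are placeholders rather than published parameters.

```python
import math


def voityuk_factor(r_ab, x_ab, alpha_ab):
    """Bracketed factor of Eq. (5): 1 + x_AB exp(-alpha_AB R_AB)."""
    return 1.0 + x_ab * math.exp(-alpha_ab * r_ab)


def pm6_factor(r_ab, x_ab, alpha_ab):
    """Bracketed factor of Eq. (6), with the 0.0003 R^6 perturbation."""
    return 1.0 + x_ab * math.exp(-alpha_ab * (r_ab + 0.0003 * r_ab ** 6))
```

At 1 Å the perturbation adds only 0.0003 to the exponent argument, whereas at 4 Å it adds 0.0003 × 4096 ≈ 1.23, so the exponential term in Eq. (6) decays much faster than that in Eq. (5) at long range.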

d-orbitals on main-group elements

Thiel and Voityuk have shown [13] that a large increase in accuracy results when d-orbitals are added to main-group elements that have the potential to be hypervalent. During preliminary stages of this work, d-orbitals were excluded from main-group elements, and the parameters were optimized. This work was then repeated but with d-orbitals on various main-group elements. The results were in accordance with Thiel’s observation: the accuracy of the method increased significantly. Because of this, d-orbitals were added to several main-group elements: the value of the increased accuracy far outweighs the extra computational cost.

The effect of the addition of d-orbitals was fundamentally different between main-group elements and transition metals. For main-group elements, the effect of d-orbitals is merely a perturbation: to a large degree the chemistry of these elements is determined by the s and p atomic orbitals. This is not the case with transition metals, where the d-orbitals are of paramount importance and the s and p orbitals are of only very minor significance. In recognition of the importance of the s and p shells in main-group chemistry, specific parameters are used for the five one-center two-electron integrals. Conversely, for the transition metals, the values of these integrals are derived directly from the internal orbital exponents.

Unpolarizable core

As noted earlier, the NDDO core-core interaction is a function of the number of valence electrons. For elements on the left of the periodic table these numbers are small and can cause the elements to appear to be too small. This was part of the rationale behind the adoption of Voityuk’s diatomic core-core parameters. However, even the Voityuk approximation failed during parameter optimization when, in rare cases, a pair of atoms would approach each other very closely. Examination of these catastrophes indicated that the cause was the complete neglect of the unpolarizable core of the atoms involved. To allow for its presence, the core-core interaction for all element pairs was modified by the addition of a simple function, \(f_{AB}\), based on the first term of the Lennard-Jones potential [32]. A candidate function was constructed, Eq. (7), using the fact that, to a first approximation, the size of an atom increases as the one-third power of its atomic number.

$$ f_{AB} = c\left( \frac{Z_{A}^{1/3} + Z_{B}^{1/3}}{R_{AB}} \right)^{12} $$
(7)

The value of c was set to \(10^{-8}\), this being the best compromise between two requirements: that at normal chemical distances the function should have a vanishingly small value, i.e., under normal conditions its value should be negligible, and that at small interatomic separations the function should be highly repulsive, i.e., it should represent the unpolarizable core.
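The steepness of Eq. (7) is easiest to appreciate numerically. The sketch below simply evaluates the function; the function name is an assumption, and the units of distance and energy are those of the surrounding core-core expression, which this section does not state explicitly.

```python
def unpolarizable_core_term(z_a, z_b, r_ab, c=1.0e-8):
    """Evaluate f_AB of Eq. (7) for atomic numbers z_a, z_b at separation r_ab."""
    if r_ab <= 0.0:
        raise ValueError("interatomic separation must be positive")
    size_sum = z_a ** (1.0 / 3.0) + z_b ** (1.0 / 3.0)
    return c * (size_sum / r_ab) ** 12
```

For two atoms with Z = 8, the cube roots sum to 4, so at a separation of 2 the term is 10−8 × 2^12 ≈ 4 × 10−5, while at a separation of 0.5 it is 10−8 × 8^12 ≈ 687: a fourfold reduction in distance raises the term by a factor of 4^12.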

Individual core-core corrections

For a small number of diatomic interactions, the general expression for the core-core interaction was modified in order to correct a specific fault. Because it is desirable to keep the methodology as simple as possible, modifications of the approximations were made only after determining that the existing approximations were inadequate. The diatomic specific modifications were:

O–H and N–H

In the original MNDO formalism, the general core-core interaction, Eq. (4), was replaced in the cases of O–H and N–H pairs with Eq. (8).

$$E_{n} {\left( {A,B} \right)} = Z_{A} Z_{B} \left\langle {\left. {s_{A} s_{A} } \right|} \right.\left. {s_{B} s_{B} } \right\rangle {\left( {1 + R_{{AB}} e^{{ - \alpha _{A} R_{{AB}} }} + R_{{AB}} e^{{ - \alpha _{B} R_{{AB}} }} } \right)}$$
(8)

An unintended effect of this change was that at distances where hydrogen-bonding interactions are important, the diatomic contribution to the ΔHf is greater than if the general approximation, Eq. (4), had been used. This contributed to a reduced hydrogen-bonding interaction in MNDO, and was a contributor to the need for modified core-core interactions in AM1 and PM3.

In PM6, the MNDO core-core approximation is replaced by Voityuk’s diatomic expression, but even with that modification, the resulting hydrogen bond interaction energy was too small. In an attempt to increase it, the Voityuk approximation was replaced by Eq. (9).

$$E_{n} {\left( {A,B} \right)} = Z_{A} Z_{B} \left\langle {\left. {s_{A} s_{A} } \right|\left. {s_{B} s_{B} } \right\rangle } \right.{\left( {1 + x_{{AB}} e^{{ - \alpha _{{AB}} R^{2}_{{AB}} }} } \right)}$$
(9)

At normal O–H and N–H separations, approximately 1 Å, Eqs. (5) and (9) have similar values, but at hydrogen bonding distances, ∼2 Å, the contribution arising from the exponential term is significantly reduced, resulting in a corresponding increased hydrogen bond interaction energy.
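As a worked comparison of the exponential terms in Eqs. (5) and (9), assume for illustration that the same values of \(x_{AB}\) and \(\alpha_{AB}\) are used in both forms (in practice they would be optimized separately for each) and that \(R_{AB}\) is expressed in Ångstroms. The ratio of the two terms is then

$$ \frac{x_{AB} e^{-\alpha_{AB} R_{AB}^{2}}}{x_{AB} e^{-\alpha_{AB} R_{AB}}} = e^{-\alpha_{AB} R_{AB}\left( R_{AB} - 1 \right)} $$

This ratio is exactly 1 at \(R_{AB} = 1\) Å and falls to \(e^{-2\alpha_{AB}}\) at \(R_{AB} = 2\) Å, which is why the two forms agree at covalent O–H and N–H distances but the repulsive contribution of Eq. (9) is much smaller at hydrogen bonding distances.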

C–C

After optimizing all parameters, it was found that compounds containing yne groups, -C≡C-, were predicted to be too stable by about 10 kcal mol−1 per yne group. This error was unique to compounds with extremely short C–C distances, and in light of the increased emphasis on accurately reproducing the properties of organic compounds, the C–C core-core term was perturbed by the addition of a repulsive term. This term was optimized to correct the error in the yne groups and to have a negligible effect on all other C–C interactions. The optimized form of the C–C core-core interaction is given in Eq. (10).

$$E_{n} {\left( {A,B} \right)} = Z_{A} Z_{B} \left\langle {\left. {s_{A} s_{A} } \right|} \right.\left. {s_{B} s_{B} } \right\rangle {\left( {1 + x_{{AB}} e^{{ - \alpha _{{AB}} {\left( {R_{{AB}} + 0.0003R^{6}_{{AB}} } \right)}}} + 9.28e^{{ - 5.98R_{{AB}} }} } \right)}$$
(10)
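The range of the added repulsive term in Eq. (10) can be checked by evaluating it at representative C–C distances. In this sketch the function name is hypothetical, and the distances of 1.20 Å and 1.54 Å are typical textbook values for triple and single bonds, used here only as illustrative inputs.

```python
import math


def cc_extra_term(r_ab):
    """Additional term inside the bracket of Eq. (10): 9.28 exp(-5.98 R_AB)."""
    return 9.28 * math.exp(-5.98 * r_ab)


triple_bond = cc_extra_term(1.20)  # assumed typical C≡C distance, Å
single_bond = cc_extra_term(1.54)  # assumed typical C–C distance, Å
```

Because the decay constant is large, lengthening the bond by 0.34 Å reduces the term by a factor of exp(5.98 × 0.34), i.e., between seven- and eightfold.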

Si–O

During testing of PM6, neutral silicate layers of the type found in talc, H2Mg3Si4O12, were found to be slightly repulsive instead of being slightly bound. An attempt was made to correct for this error by adding a weak perturbation to the Si–O interaction, illustrated by Eq. (11).

$$E_{n} {\left( {A,B} \right)} = Z_{A} Z_{B} \left\langle {\left. {s_{A} s_{A} } \right|} \right.\left. {s_{B} s_{B} } \right\rangle {\left( {1 + x_{{AB}} e^{{ - \alpha _{{AB}} {\left( {R_{{AB}} + 0.0003R^{6}_{{AB}} } \right)}}} - 0.0007e^{{ - {\left( {R_{{AB}} - 2.9} \right)}^{2} }} } \right)}$$
(11)

Nitrogen sp2 pyramidalization

Although PM6 predicted the degree of pyramidalization of primary amines correctly, it overestimated the pyramidalization of secondary and tertiary amines. The degree of pyramidalization of these amines was decreased by adding a function to make the calculated ΔHf more negative as the nitrogen became more planar, as shown in Eq. (12).

$$ \Delta H'_{f} = \Delta H_{f} - 0.5e^{-10\phi} $$
(12)

In this equation, the angle ϕ is a measure of the non-planarity of the nitrogen environment, and is given by 2π minus the sum of the three contained angles about the nitrogen atom. For planar sp2 secondary and tertiary amines, this correction amounted to 0.5 kcal mol−1 per nitrogen atom.
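A short sketch of the correction in Eq. (12) is given below. The function name and the choice of degrees for the input are assumptions; ϕ itself is taken in radians, as implied by its definition in terms of 2π.

```python
import math


def nitrogen_planarity_correction(angles_deg):
    """Correction to the heat of formation (kcal/mol) from Eq. (12).

    angles_deg : the three bond angles about the nitrogen atom, in degrees
    """
    if len(angles_deg) != 3:
        raise ValueError("exactly three contained angles are required")
    # phi = 2*pi minus the sum of the contained angles, in radians
    phi = 2.0 * math.pi - sum(math.radians(a) for a in angles_deg)
    return -0.5 * math.exp(-10.0 * phi)
```

For a planar nitrogen (three angles of 120°) ϕ is zero and the correction is −0.5 kcal mol−1; for a strongly pyramidal nitrogen with three angles of 107°, ϕ is about 0.68 rad and the correction is below 0.001 kcal mol−1 in magnitude.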

More elements

The NDDO basis sets of many of the elements parameterized in PM6 have not previously been described. For all elements except hydrogen, which has only an s orbital, the basis set consists of an s orbital, three p orbitals, and, for most elements, a set of five d orbitals. Slater atomic orbitals are used exclusively; these are of form:

$$ \varphi = \frac{\left( 2\xi \right)^{n + 1/2}}{\left( \left( 2n \right)! \right)^{1/2}} r^{n - 1} e^{-\xi r} Y^{m}_{l}\left( \theta ,\phi \right) $$

where ξ is the orbital exponent, n is the principal quantum number (PQN), and the \(Y^{m}_{l}(\theta, \phi)\) are the normalized real spherical harmonics. The PQN are those of the valence shell, i.e., the set of atomic orbitals most important in forming chemical bonds. For PM6, the PQN used are shown in Table 1. For most main-group elements, the s and p PQN are the same, and, when d orbitals are present, all three PQN are the same: that is, the PQN are (ns, np, nd). For transition metals, the d PQN is one less than that of the s and p shells, i.e., (ns, np, (n–1)d). An exception to this generalization occurs in the elements of Group VIII. Here, the valence shell is completely filled, so in all chemical interactions that could occur between an atom of a Group VIII element and any other atom, electron density could only migrate from the Group VIII element to the other atom. That is, when a rare gas element forms any type of chemical bond it would necessarily become slightly positive. This is an unrealistic result. In order to allow rare gas atoms to have the potential of being slightly negative, the set of valence orbitals was changed from (ns, np) to (np, (n+1)s), for the elements Ne, Ar, Kr, and Xe. Helium is the only exception to this change, because it does not have a “1p” valence shell. For helium, the valence shell used was (1s, 2p), this being considered the best compromise.

Table 1 Principal quantum numbers for atomic orbitals

Parameter optimization

Background

The objective of parameter optimization is to modify the values of the parameters so as to minimize the error function, S, Eq. (13), representing the square of the differences between the values of reference data, Q_ref(i), and the values calculated using the semiempirical method, Q_calc(i), with appropriate weighting factors, g_i.

$$S = {\sum\limits_i {{\left( {g_{i} {\left( {Q_{{calc}} {\left( i \right)} - Q_{{ref}} {\left( i \right)}} \right)}} \right)}^{2} } }$$
(13)
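Eq. (13) can be written as a short function. This is a minimal sketch; the function and argument names are illustrative and do not come from any PM6 code.

```python
from typing import Sequence


def error_function(q_calc: Sequence[float], q_ref: Sequence[float],
                   weights: Sequence[float]) -> float:
    """Eq. (13): S = sum_i (g_i * (Q_calc(i) - Q_ref(i)))**2."""
    if not (len(q_calc) == len(q_ref) == len(weights)):
        raise ValueError("all three sequences must have the same length")
    return sum((g * (c - r)) ** 2 for g, c, r in zip(weights, q_calc, q_ref))
```

Because each weighted difference is squared, a datum with twice the weight contributes four times as much to S for the same absolute error.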

This process is initiated by rendering the reference data in the training set dimensionless. The default conversion factors are given in Table 2, with weighting factors for reference data represented by rules being much larger, typically in the order of 5–20 kcal mol−1.

Table 2 Default weighting factors for reference data

The elements were divided into four sets: core elements (H, C, N, and O), other elements important in organic chemistry (F, Na, P, S, Cl, K, Br, I), the rest of the main group, and the transition metals. Elements were assigned to the different sets based on their presumed degree of importance in biochemistry, and this importance was converted into a weighting factor to be used in the parameter optimization procedure. Reference data representing species consisting only of core elements were given their default weight. When other elements were present, the weight was set to the default weight times the smallest multiplier shown in Table 2. Thus the default weight for a reference datum involving tetramethyllead, Pb(CH3)4, would be multiplied by 0.8, reflecting the fact that this species contains an element in the main group set.
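The weighting rule just described (default weight times the smallest multiplier among the elements present) can be sketched as follows. Only the 0.8 multiplier for the main-group set is quoted in the text; the multipliers for the organic and transition-metal sets, the short lists of example elements, and all names are hypothetical placeholders, not values from Table 2.

```python
from typing import Iterable

CORE = {"H", "C", "N", "O"}
ORGANIC = {"F", "Na", "P", "S", "Cl", "K", "Br", "I"}
# Only a few examples of the remaining two sets are listed, for brevity.
MAIN_GROUP_EXAMPLES = {"Pb", "Si", "Al"}
TRANSITION_EXAMPLES = {"Fe", "Ti", "Mo"}

# 0.8 for "main_group" is from the text; 0.9 and 0.7 are HYPOTHETICAL.
MULTIPLIER = {"core": 1.0, "organic": 0.9, "main_group": 0.8, "transition": 0.7}


def element_set(symbol: str) -> str:
    """Return the name of the set an element belongs to."""
    if symbol in CORE:
        return "core"
    if symbol in ORGANIC:
        return "organic"
    if symbol in MAIN_GROUP_EXAMPLES:
        return "main_group"
    if symbol in TRANSITION_EXAMPLES:
        return "transition"
    raise KeyError(f"element {symbol!r} is not in this illustrative table")


def datum_weight(default_weight: float, elements: Iterable[str]) -> float:
    """Default weight scaled by the smallest multiplier among the elements."""
    return default_weight * min(MULTIPLIER[element_set(e)] for e in elements)
```

With these placeholders, a datum for Pb(CH3)4 with a default weight of 1.0 receives a weight of 0.8, as in the example in the text.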

For a given set of parameters, P, optimization proceeds by calculating the values of all the Q_calc(i), their first derivatives with respect to each parameter, P(j), and the second derivatives with respect to every pair of parameters. Evaluating these quantities is time-consuming, and considerable effort was expended in minimizing the need for explicit evaluation of these functions. The most efficient strategy developed [7] involved assuming that, in the region of parameter space near to the current values of the parameters, the values of the first derivatives of the Q_calc(i) with respect to P were, at least to a first approximation, constant. By making this assumption the values of the parameters could then be updated using perturbation theory. Because the assumption is only valid in the region of the starting point in parameter space, periodically the focus was moved to the new point in parameter space and a complete explicit re-evaluation of all the functions was performed. The parameter optimization process terminated when the scalar of the first derivatives dropped below a preset limit. This process was fully automated, and for given sets of reference data and parameters, parameter optimization could be performed rapidly, easily, and reliably.
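To illustrate what treating the first derivatives as constant implies, the sketch below computes a single linearized least-squares (Gauss-Newton) parameter shift for the error function S. This is a stand-in chosen for illustration, not the perturbation-theory update of ref. [7], and all names are assumptions.

```python
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - s) / m[row][row]
    return x


def linearized_update(residuals: Sequence[float],
                      jacobian: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> List[float]:
    """Parameter shift minimizing S when dQ_calc(i)/dP(j) is held constant.

    residuals[i] = Q_calc(i) - Q_ref(i); jacobian[i][j] = dQ_calc(i)/dP(j).
    Solves (J^T W J) delta = -J^T W r with W = diag(g_i**2).
    """
    n_data = len(residuals)
    n_par = len(jacobian[0])
    w = [g * g for g in weights]
    normal = [[sum(w[i] * jacobian[i][a] * jacobian[i][b] for i in range(n_data))
               for b in range(n_par)] for a in range(n_par)]
    rhs = [-sum(w[i] * jacobian[i][a] * residuals[i] for i in range(n_data))
           for a in range(n_par)]
    return solve_linear(normal, rhs)
```

If the calculated quantities really were linear in the parameters, one such step would reach the minimum of S. In practice the derivatives drift, which is why the text describes periodically moving to the new point and re-evaluating everything explicitly.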

Sequence of optimization of parameters

Notwithstanding the reliability of the parameter optimization procedure, a simple global optimization of all the parameters for all 70 elements involving over 9,000 discrete species was found to be impractical because of the large number of derivatives involved. Such an optimization would involve over 2,000 parameters and over 10,000 reference data. The set of second derivatives alone would consist of 2×10^10 terms. With more powerful computers, evaluating such large sets of derivatives might be practical some day, but even then, one faulty reference datum or one faulty initial parameter value would ruin an optimization run. The strategy of parameter optimization was approached with great caution, and the procedure finally adopted was as follows:

Because the elements H, C, N, and O are of paramount importance in biochemistry, and because large amounts of reference data are available, the starting point for parameter optimization involved the simultaneous optimization of parameters for these four elements. For the purposes of discussion, this set of four elements will be called the “core elements”.

Once stable parameters had been obtained, parameters for other elements important in organic chemistry were optimized in two stages. First, the parameters for the core elements were held constant, and parameters for the elements F, P, S, Cl, Br, and I were optimized one at a time. Then all parameters for all ten elements were simultaneously optimized. This set (the organic elements) was then used as the starting point for parameterizing the rest of the main group.

The same sequence was followed for the rest of the main-group elements. That is, parameters for each element were optimized while freezing the parameters for the organic elements. Then, once all the elements had been processed, all parameters for all of the 39 main-group elements, plus zinc, cadmium, and mercury, were optimized simultaneously.

When parameters for the transition metals were being optimized, all parameters for the main group elements were held constant. There were several reasons for this. Most importantly, the reference data for the transition metals, particularly the thermochemical data, was of lower quality, so one consideration was to prevent the transition metals from having a deleterious effect on the main-group elements. Another important consideration was that most compounds involving transition metals also involved only elements of the organic set. Since parameters for these elements had been optimized using a training set consisting of all the main-group elements, the values of the optimized parameters would likely be relatively insensitive to the influence of the small number of additional reference data involving transition metals.
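The freeze-then-release pattern used in the stages above can be written down as a small data structure. This is a sketch of the bookkeeping only, with hypothetical names; it does not perform any optimization.

```python
from typing import List, NamedTuple, Tuple


class Stage(NamedTuple):
    free: Tuple[str, ...]    # elements whose parameters are optimized
    frozen: Tuple[str, ...]  # elements whose parameters are held constant


def staged_schedule(base: Tuple[str, ...], new: Tuple[str, ...]) -> List[Stage]:
    """One stage per new element with `base` frozen, then one joint stage."""
    stages = [Stage(free=(element,), frozen=base) for element in new]
    stages.append(Stage(free=base + new, frozen=()))
    return stages


core = ("H", "C", "N", "O")
organic_stages = staged_schedule(core, ("F", "P", "S", "Cl", "Br", "I"))
```

Here `organic_stages` holds six single-element stages followed by one stage in which all ten elements are free, mirroring the second step of the procedure. The transition-metal step would differ in that it has no final joint stage, since the main-group parameters stayed frozen.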

In general, all parameters for a given element were optimized simultaneously; this was both efficient and convenient. In some optimizations, specifically those involving a new element, only sub-sets of parameters were used. Three main sub-sets were used:

Parameters that determine atomic electronic properties

For most elements, atomic energy levels are determined by six parameters: the one-electron one-center integrals Uss, Upp, Udd, and the internal orbital exponents ζsn, ζpn and ζdn. If the heat of ionization and sufficient atomic energy level data were available, these quantities could be uniquely defined; there would be no need for the use of molecular reference data. These parameters were the first to be optimized whenever an optimization was started for an element that had not previously been parameterized.

Parameters that determine molecular electronic properties

Two of the more important electronic molecular properties are the dipole moment, which indicates the degree of polarization within a molecule, and the ionization potential. These properties are determined primarily by 12 parameters: the six parameters that determine atomic electronic properties and six additional parameters: βs, βp, and βd and the Slater orbital exponents ζs, ζp, and ζd. In the second stage of parameter optimization, the first six parameters were held constant at the values defined using atomic data and the second set optimized. During this operation, all geometries were fixed at their reference values.

Parameters that determine geometries

As soon as an initial optimized set of electronic parameters was available, the diatomic and other core-core parameters could be optimized. The most efficient process was to optimize these parameters initially without allowing the electronic parameters or the molecular geometries to optimize. If geometries were allowed to optimize, optimization of the core-core parameters would be slowed considerably, because of the tight dependency of the optimized geometries on the values of the core-core parameters, and vice versa.

As soon as all parameters had been optimized using fixed geometries, the geometries were allowed to relax and the parameters that determine geometry re-optimized. After that there would be three sets of incompletely optimized parameters: the six atomic electronic parameters, the six molecular electronic parameters and the core-core parameters. The only remaining operation was the simultaneous optimization of all the parameters. If the training set of reference data was insufficient to unambiguously define the values of all the parameters, then, at that stage, the potential existed for the parameters to become ill-defined. An example of this would be where there were too few atomic energy levels to allow all six parameters in the first set to be defined. To allow for this, a penalty function was added to each parameter. If the value of a parameter exceeded pre-defined limits, the error function S was incremented by a constant times the square of the excess. No penalty was applied if the value of a parameter was between the pre-defined limits; that is, no bias was applied to the numerical value of a parameter. During the early stages of simultaneous optimization of all the parameters for a given element the penalty function was used frequently. In the later stages the penalty function was invoked rarely, and then only when there was a distinct shortage of reference data.
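The penalty described above is a flat-bottomed quadratic. A minimal sketch follows, with hypothetical names and with the limits and the scaling constant left to the caller, since the text does not give their values.

```python
def parameter_penalty(value: float, lower: float, upper: float, scale: float) -> float:
    """Amount added to S: scale * excess**2 outside [lower, upper], zero inside."""
    if value > upper:
        excess = value - upper
    elif value < lower:
        excess = lower - value
    else:
        excess = 0.0
    return scale * excess ** 2
```

Inside the limits the function and its derivative are both zero, so the penalty exerts no pull on a well-behaved parameter. It only acts once a parameter starts to wander outside the allowed range.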

Results

Parameters for PM6

PM6 atomic and diatomic parameters for the 70 elements are presented in Tables 3 and 4, respectively. Not all elements have all parameters: where monatomic parameters are missing, the associated approximations were not used. For diatomic parameters, where an atom-pair is missing, no representatives of that type of bond were used.

Table 3 PM6 parameters for 70 elements
Table 4 Diatomic core−core parameters

Accuracy

Comparison with other semiempirical methods

Using the program MOPAC2007 [33], an extensive comparison was made between the results obtained using PM6 and those from PM5, PM3, and AM1. This comparison was started by generating tables of reference data (that is, ΔHf, geometries, ionization potentials (I.P.s), and dipole moments) and differences between the calculated and reference values, using each of the four methods presented here. Because of their size they are provided in the supplementary material. To simplify navigating within the tables, all species are listed in the order of their empirical formula.

Average unsigned errors (AUE) for ΔHf for each element parameterized at the PM6 level are shown in Table 5, together with AUE for PM5, PM3, and AM1. The number of data used in each average varies depending on the elements available in each method. AM1 boron [34] uses a different core-core interaction expression from the other elements and was not used. AUE for bond-lengths are shown in Table 6. In those cases where a calculated bond-length was very large, indicating that the bond had broken, the bond-length was not used in the analysis. If such data had been used, the resulting statistics would have been misleading. AUE for angles are shown in Table 7. Errors in angles for many elements that form very ionic, i.e., labile, bonds are of less importance than errors involving elements that form strong covalent bonds. The angles subtended by such bonds are often determined largely by the electronic structure of the atom. Information on the accuracy of prediction of molecular electronic structure can also be inferred from the AUE of dipole moments, Table 8, and ionization potentials, Table 9.
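For reference, an average unsigned error with an optional cutoff for discarded data (such as bond lengths from broken bonds) could be computed as in the sketch below. The cutoff argument and the function name are assumptions; the text does not state what threshold was used to decide that a bond had broken.

```python
from typing import Optional, Sequence


def average_unsigned_error(calc: Sequence[float], ref: Sequence[float],
                           max_abs_error: Optional[float] = None) -> float:
    """Mean of |calc - ref|; entries with error above max_abs_error are skipped."""
    errors = [abs(c - r) for c, r in zip(calc, ref)]
    if max_abs_error is not None:
        errors = [e for e in errors if e <= max_abs_error]
    if not errors:
        raise ValueError("no data left to average")
    return sum(errors) / len(errors)
```

A single broken bond can dominate the average: with errors of 0.01, 0.04, and 2.40 Å the unfiltered AUE is about 0.82 Å, whereas dropping the broken bond gives 0.025 Å.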

Table 5 Average unsigned errors in calculated heats of formation (kcal mol−1)
Table 6 Average unsigned errors in bond lengths (Å)
Table 7 Average unsigned errors in bond angles (Degrees)
Table 8 Average unsigned errors in dipole moments (D)
Table 9 Average unsigned errors in ionization potential (eV)

Comparison of the accuracy of PM6 with the other NDDO methods PM5, PM3, and AM1, was made more complicated by the fact that different sets of elements were available in each method. To allow a simple comparison, therefore, average unsigned errors (AUE) for the four common properties for various subsets are presented in Tables 10, 11, 12, 13 and 14. To ensure a valid comparison the same number of data were used in each method, except for AM1 in “whole of main group”, where data for cadmium and boron were not used.

Table 10 Average unsigned errors in ΔHf for various sets of elements (kcal mol–1)
Table 11 Average unsigned errors in bond lengths for various sets of elements (Å)
Table 12 Average unsigned errors in angles for various sets of elements (Degrees)
Table 13 Average unsigned errors in dipole moments for various sets of elements (D)
Table 14 Average unsigned errors in I.P.s for various sets of elements (eV)

Comparison with AM1*

Winget et al. [15] developed AM1* parameters for P, S, and Cl, in which Voityuk’s diatomic parameters were used for all atom-pairs involving P, S, and Cl with H, C, N, O, F, P, S, Cl and Mo. In the AM1* method, all parameters for elements other than the ones being optimized are held constant at the AM1 values. As such, AM1* could be regarded as a hybrid method: parameters for a few individual elements are re-optimized, in this case with some changes in the set of approximations, while holding the parameters for the other elements constant at their AM1 values. Tables comparing individual P, S, and Cl species calculated with AM1* and PM6 are given in the supplementary material. A summary of the statistical analysis is given in Table 15. Winget et al. also reported AM1* parameters for titanium and zirconium [15]. These parameters were not used in the comparison given here because the set of approximations used was incompatible with the set used in PM6.

Table 15 Average unsigned errors in phosphorus, sulfur, and chlorine

Comparison with RM1

In 2006, ten elements, H, C, N, O, F, P, S, Cl, Br, and I, that had been parameterized at the AM1 level were re-parameterized [35]; the result was a new method, RM1. No changes were made to the set of approximations used, so that, for example, P, S, Cl, Br, and I used only the s-p basis set. That is, RM1 was functionally identical to AM1. A statistical analysis showed that RM1 was more accurate than any of the other NDDO methods, and therefore was the method of choice for modeling organic compounds. An indication of the effect of the current changes to the set of approximations can be obtained by comparing the AUE for PM6 and RM1 in Tables 10, 11, 12, 13 and 14.

Voityuk reported the parameterization of molybdenum [14] at the AM1* level. These parameters were added to the standard AM1 parameters and were used in the analysis.

Comparison with high-level methods

A comparison of PM6, HF 6–31G(d) and B3LYP 6–31G(d) errors in predicted ΔHf for 1373 compounds is given in the supplementary material. Only compounds containing the elements H, C, N, O, F, P, S, Cl, and Br were considered, these being the more important elements in biochemistry. Ab-initio ΔHf were obtained from the calculated total energies by the addition of a simple atomic correction and conversion from atomic units to kcal mol−1. No allowance was made for thermal population effects, zero point energies, etc., the assumption being made that such effects could be absorbed into the atomic corrections.
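The conversion from an ab initio total energy to an estimated ΔHf can be sketched as below. The additive per-atom form of the correction is an assumption made for illustration, since the text says only that a simple atomic correction was added; the function name and the numbers in the usage note are hypothetical. The hartree to kcal mol−1 factor is the standard one.

```python
from typing import Mapping

HARTREE_TO_KCAL_PER_MOL = 627.5095  # 1 hartree expressed in kcal/mol


def heat_of_formation(total_energy_hartree: float,
                      composition: Mapping[str, int],
                      atomic_correction: Mapping[str, float]) -> float:
    """Estimate a heat of formation (kcal/mol) from an ab initio total energy.

    atomic_correction[element] is a per-atom term in kcal/mol (ASSUMED form).
    """
    correction = sum(count * atomic_correction[element]
                     for element, count in composition.items())
    return total_energy_hartree * HARTREE_TO_KCAL_PER_MOL + correction
```

As a toy example, a total energy of −1.0 hartree for a species with two H atoms and a made-up correction of 300.0 kcal mol−1 per H gives −27.5095 kcal mol−1. In a scheme of this kind the per-element corrections would be fitted to reference data, which is how effects such as zero point energy can be absorbed into them.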

A statistical analysis of errors in thermochemical predictions for the three methods is given in Table 16. A check was also done to verify that the error distribution was approximately Gaussian. The resulting histogram, shown in Fig. 1, shows that the distribution is indeed Gaussian.

Table 16 Statistical analysis of errors in predicted ΔHf for various methods (kcal mol−1)
Fig. 1 Histogram of errors in calculated ΔHf

Hydrogen bonding

One of the commonest forms of hydrogen bonding involves a hydrogen atom attached to an oxygen atom and forming a weak bond to a distant oxygen atom. The simplest, well-characterized case is that of the water dimer. In an exhaustive analysis, Tschumper et al. [36] characterized this system using CCSD(T) and a large basis set. They identified and characterized ten stationary points on the 12-dimensional potential energy surface of the dimer and determined that the lowest energy conformer of the water dimer was 5.00 kcal mol−1 more stable than two isolated water molecules. A comparison of the relative heats of formation of these points calculated using NDDO methods is shown in Table 17. The AUE for the various methods are as follows: PM6: 1.35 kcal mol−1, PM5: 3.35, PM3: 2.16, and AM1: 1.67.

Table 17 Relative energies of conformers of water dimer

The energies of various different types of hydrogen bonds were estimated from the energy released when the two small molecules involved associate to form a hydrogen-bonded system. Table 18 lists the values predicted using B3LYP and the NDDO methods.
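The estimate described here is a simple difference of heats of formation. A minimal sketch follows; the function name is an assumption and the numbers in the usage note are illustrative inputs only.

```python
def association_energy(hf_complex: float, hf_a: float, hf_b: float) -> float:
    """Energy change (kcal/mol) on forming a complex from fragments A and B.

    A negative value means energy is released on association.
    """
    return hf_complex - (hf_a + hf_b)
```

For instance, if each isolated water molecule had ΔHf = −57.8 kcal mol−1 and the dimer −120.6 kcal mol−1, the association energy would be −5.0 kcal mol−1, matching the size of the reference stabilization quoted for the water dimer.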

Table 18 Comparison of B3LYP and PM6 hydrogen bond energies (kcal mol−1)

Nitrogen pyramidalization

A well-documented fault in PM3 nitrogen was its exaggerated degree of pyramidalization when in the sp2 configuration. This is dramatically evident in N-methylacetamide, where the H-N-C-C torsion angle should be 180°, but is predicted by PM3 to be 136°. That is, the nitrogen, instead of being in a planar environment, is predicted to be highly pyramidal. The results of a survey of 19 molecules that contain sp2 nitrogen are presented in Table 19.
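A torsion angle such as H-N-C-C can be computed from four sets of Cartesian coordinates. The sketch below uses the standard atan2 formulation; the function names are assumptions, and the sign convention of the returned angle is a choice that does not affect the deviation from planarity.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def torsion_angle(p1: Vec, p2: Vec, p3: Vec, p4: Vec) -> float:
    """Signed torsion angle p1-p2-p3-p4 in degrees."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal to the plane p1, p2, p3
    n2 = _cross(b2, b3)  # normal to the plane p2, p3, p4
    b2_len = math.sqrt(_dot(b2, b2))
    m1 = _cross(n1, (b2[0] / b2_len, b2[1] / b2_len, b2[2] / b2_len))
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))


def deviation_from_planarity(torsion_deg: float) -> float:
    """How far a torsion angle is from the planar (trans) value of 180 degrees."""
    return 180.0 - abs(torsion_deg)
```

Applied to the PM3 value quoted above, `deviation_from_planarity(136.0)` gives 44.0, i.e. the nitrogen is 44° away from the planar arrangement.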

Table 19 Average errors in pyramidalization of nitrogen (Torsion angle about nitrogen, in degrees)

Transition metals

Optimizing parameters for transition metals was not as straightforward as for the main group elements. As with the main group compounds, there is a wealth of structural reference data on transition metal complexes. However, unlike main group compounds, there is a distinct shortage of reliable thermochemical data. To alleviate this shortage, the thermochemical data that were available were augmented by the results of DFT calculations. It was recognized, however, that these derived reference data were likely to be of a lower accuracy than the experimental data. Many transition metal complexes are also highly labile; a consequence of this was that some moieties that are known to exist in the solid phase were predicted to be unstable in the gas phase, at least at the PM6 level of calculation. In most cases, such moieties had a high formal charge; therefore, without any countercharge, their instability in isolation is understandable. When an intrinsically unstable ion was identified, it was removed from further consideration.

Most transition metal compounds also have extensive UV-visible properties, arising from d-d transitions and from charge-transfer excitations, the presence of these absorption bands being indicative of the existence of low-lying electronic excited states. The self-consistent field (SCF) equations frequently did not converge unless special techniques were used. One of these, using the direct inversion of the iterative sub-space [37], or DIIS, would frequently yield an SCF when other methods failed. However, as a result of the way it works, the DIIS converged the wavefunction to the nearest stationary point, not necessarily to the lowest energy point. Because of the potential existence of multiple low-lying excited states, special care had to be taken when the DIIS technique was used. Conversely, the tendency to converge to the nearest stationary point was an advantage when electronic states of transition metal atoms were being optimized. In several instances, the lowest energy wavefunction corresponded to a hybrid of s, p and d atomic orbitals that did not transform as any irreducible representation of the group of the sphere. In those cases, the wavefunction could be induced to converge to the correct spherical harmonic solution by using the DIIS procedure.

Sets of transition metals

For the purpose of discussion, the set of 30 transition metals can be partitioned into eight of the groups of the Periodic Table, with each group containing one or more triads of elements. A detailed discussion of each element is impractical because of the wide range of compounds in transition metal chemistry. The following section, therefore, will be limited to systems where PM6 does not work well, and to systems illustrative of the structural chemistry of specific elements.

Group IIIA: Scandium, Yttrium, Lanthanum, and Lutetium

Possibly because of its scarcity, only a few experimental thermochemical reference data for scandium compounds were available for use in the parameterization. What reference data existed were augmented by the results of DFT calculations and with a large number of atomic energy levels for the neutral and ionized atom. Only the chemistry of ScIII was studied. Most bond lengths involving scandium were reproduced with good accuracy (for example tri(η5−cyclopentadienyl)-scandium, Fig. 2), the exception being the coordination complex [Sc(H2O)9]3+ which PM6 predicts to decompose to [Sc(H2O)7]3+ plus two water molecules.

Fig. 2 Tri(η5−cyclopentadienyl)-scandium. Reference value in parentheses

As with scandium, very few thermochemical reference data were found for yttrium or lanthanum. To compensate for this, extensive use was made of the CSD. The chemistry of lutetium is similar to that of lanthanum, with the principal difference being that whereas LaIII has an empty 4f shell, in LuIII that shell is completely filled. Since the 4f shell is, at least chemically, virtually inert, lutetium could be regarded as a conventional transition metal, and was therefore included in this work.

Group IVA: Titanium, Zirconium, and Hafnium

In contrast to all the elements of Group IIIA, titanium is plentiful, and an abundance of reference data on TiIII and the more common TiIV is available. These data include many tetrahedral and octahedral inorganic complexes as well as organotitanium compounds. Most bond lengths are reproduced with good accuracy, the exceptions being the Ti-H bond in TiH4, where the predicted value, 1.36 Å, is 0.37 Å shorter than the reference, and coordination complexes which involve oxygen forming a purely dative bond to titanium. In this latter case, the Ti-O bond is typically too long by 0.1 to 0.3 Å.

The behavior of zirconium and hafnium is similar to that of titanium.

Group VA: Vanadium, Niobium, and Tantalum

Most of the structural chemistry of vanadium in its five common oxidation states, 0, II, III, IV, and V, is reproduced with good accuracy. The common VO5 structure which occurs in bis(acetylacetonato)-oxo-vanadium(IV), where vanadium forms a double bond to one oxygen atom and single bonds to the other four, is reproduced accurately, the V=O distance being 1.58 Å (reference, 1.56), the V-O distance 2.03 Å (1.97), and the O-V=O angle 104.5° (105.9).

Not all systems were reproduced with such accuracy. When there are several ligands around a vanadium atom, the effects of steric crowding are over-emphasized, and PM6 incorrectly predicts that one of the metal-oxygen bonds would break. An example is bis(bis(μ2-trifluoroacetato-O,O′)-(η5-cyclopentadienyl)-vanadium), where each vanadium atom extends bonds to four oxygen atoms and one cyclopentadienyl. In this system, PM6 predicts that one of the V-O bonds would break.

In the heavier elements there is an increased tendency to form highly symmetric polynuclear complexes. An example is the tantalum dication, [Ta6Cl12]2+. This is predicted to have an octahedral structure in modest agreement with the DFT result (Fig. 3).

Fig. 3 Calculated structure of the complex ion [Ta6Cl12]2+. Reference value in parentheses

Transition metal complexes usually have one or more unpaired electrons; such systems can only be modeled using an open shell method such as unrestricted Hartree Fock (UHF) or restricted Hartree Fock followed by a configuration interaction (RHF-CI) correction. The UHF method is faster and more reliable, and is the method of choice when only simple properties such as heats of formation or geometries are of interest. For [M6X12]2+, M = Nb or Ta, X = Cl or Br, UHF predicts an almost octahedral complex, a very slight distortion lowering the symmetry to D4h. This distortion is also reflected in the asymmetric charge distribution. When RHF-CI is used, the geometry converges on the exact Oh structure.

Group VIA: Chromium, Molybdenum, and Tungsten

Most Cr–O and Cr–N bonds are reproduced well, as illustrated by [CrIII(EDTA)]− in Fig. 4. The organometallic bond in chromium hexacarbonyl is 1.90 Å, which is in good agreement with the crystal structure, 1.92 Å, found in FOHCOU01 [21].

Fig. 4 Chromium ethylenediaminetetraacetate anion, [CrIII(EDTA)]−

The octacyano-molybdate(IV) moiety, [MoIV(CN)8]4−, is a stable eight-coordinate organometallic molybdenum complex ion whose geometry in the crystal is that of a slightly distorted square antiprism. Rather unexpectedly, this structure was reproduced by PM6, the expectation being that in the absence of crystal field forces the structure would have optimized to a geometry which has a higher symmetry, i.e., converged to the exact D4d geometry. The predicted Mo-C distance was 2.22 versus 2.16 Å, again in unexpectedly good agreement for an ion with such a large formal charge.

Molybdenum forms the cluster anion [Mo6(η3-Cl8)Cl6]2− in which the six molybdenum atoms form a regular octahedron. PM6 successfully reproduces this structure, and predicts the following distances (reference values in parentheses): Mo-Mo: 2.30 Å (2.63), Mo-η3Cl: 2.75 Å (2.56), and Mo-Cl: 2.50 Å (2.43).

The trioxide of molybdenum can form polyoxometalates, a typical example of which is the α-Keggin heteropolyoxyanion [SiO4@MoVI12O36]4−. In this structure, shown in Fig. 5, each Mo forms a double bond with one oxygen, single bonds to four other oxygen atoms, and what can only be described as a third of a bond to a sixth oxygen that is part of the SiO4 unit. Despite the apparently high symmetry, Td, this system has only a center of inversion. This low symmetry is reproduced by PM6.

Fig. 5 α-Keggin structure of tetraconta-oxo-silicon-dodeca-molybdenum, [SiO4@Mo12O36]4−. Crossed-eyes stereo; Mo=O: 1.77 Å (1.69), Mo-O: 2.00 (1.85), Si–O: 1.52 (1.64) (reference values in parentheses)

PM6 predicts the structures of all three hexacarbonyls with good accuracy, but gives qualitatively the wrong structures for the dinuclear decacarbonyls. This failure to qualitatively predict the structure of the polynuclear carbonyls occurred frequently during the survey of the transition metals.

Group VIIA: Manganese, Technetium, and Rhenium

Like many other transition metals, manganese can form sepulchrates, closo polyhedral complexes of general structure 3, 6, 10, 13, 16, 19-hexaaza-bicyclo(6.6.6)icosane. In contrast to the more common open hexadentate chelates of manganese, e.g. [MnII(EDTA)]2−, the metal atom in a sepulchrate is extremely tightly bound, and cannot be removed without destroying the organic framework. A simple sepulchrate is shown in Fig. 6. PM6 predicts the Mn-N distance with good accuracy but gets the twist angle incorrect. A DFT calculation reproduced the twist angle found in the crystal, which suggests that the error in the twist angle cannot be attributed to the neglect of crystal packing forces.

Fig. 6 [Sepulchrate-manganese(III)]3+, (3,6,10,13,16,19-hexaaza-bicyclo(6.6.6)icosane)-manganese(III). CSD entry: HAFBUL

Although there is a large amount of structural information on technetium compounds, there is a distinct shortage of thermochemical data. To make up for this, almost all the reference heats of formation of representative technetium compounds were derived from DFT calculations. Only one heat of formation was used in this derivation, that of the isolated technetium atom; therefore the reference values used almost certainly include a systematic error that may amount to many kilocalories per mole. Consequently, the reference heats of formation and the errors in PM6 predicted heats of formation of technetium compounds should be taken cum grano salis. However, this should not be construed as implying that they are meaningless: because reactions are balanced, when heats of reaction are evaluated, any systematic errors in the heats of formation are cancelled out.
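The cancellation argument can be checked numerically. The sketch below assumes that the systematic error is a constant offset per technetium atom, which is one plausible reading of an error inherited from the atomic reference value. The species labels, heats of formation, and function names are all hypothetical.

```python
from typing import Dict, Mapping


def heat_of_reaction(stoich: Mapping[str, int], hf: Mapping[str, float]) -> float:
    """Sum of nu_i * Hf(i); products have positive nu, reactants negative."""
    return sum(nu * hf[species] for species, nu in stoich.items())


def add_systematic_error(hf: Mapping[str, float], tc_count: Mapping[str, int],
                         offset: float) -> Dict[str, float]:
    """Shift each heat of formation by `offset` per technetium atom (ASSUMED form)."""
    return {species: value + offset * tc_count[species]
            for species, value in hf.items()}


# Hypothetical balanced reaction: TcA + L -> TcB, one Tc atom on each side.
hf_true = {"TcA": -50.0, "L": -20.0, "TcB": -85.0}
tc_atoms = {"TcA": 1, "L": 0, "TcB": 1}
reaction = {"TcA": -1, "L": -1, "TcB": 1}
hf_shifted = add_systematic_error(hf_true, tc_atoms, offset=15.0)
```

Both `heat_of_reaction(reaction, hf_true)` and `heat_of_reaction(reaction, hf_shifted)` give −15.0 kcal mol−1, because the number of technetium atoms is the same on both sides of a balanced reaction.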

One of the more important technetium species is the pertechnetate ion, [TcO4]−, used in nuclear medicine. In this ion, PM6 predicts the Tc-O distance to be 1.73 Å, in good agreement with the DFT value of 1.76 Å.

Group VIIIA: Iron, Cobalt, Nickel, Ruthenium, Rhodium, Palladium, Osmium, Iridium, and Platinum

The geometries of most compounds of this large group were reproduced with modest to good accuracy, including the iron-porphyrin complex, Fig. 7, of the type found in heme. The main exception is iron pentacarbonyl, Fe(CO)5, which in its equilibrium geometry is known unambiguously to be of point-group D3h, and which PM6 predicts to be equally unambiguously C4v. When this error was discovered, attempts were made to correct the fault by adding a rule to the training set for iron. This rule stated that “The C4v geometry was 28.7 kcal mol−1 higher in energy than the D3h geometry,” 28.7 kcal mol−1 being the difference between the energies of the two structures calculated using DFT. However, even when a very large weighting factor, 20.0, was used, the C4v structure remained more stable than the D3h, although the error in the relative energies was decreased. During this optimization errors in all other iron compounds increased significantly. Rather than accept a general deterioration in the predicted properties of iron compounds, the rule was removed from the training set.

Fig. 7 trans-7,8-Dihydro-2,3,7,8,12,13,17,18-octaethylporphyrinato-iron(II). Reference value (CSD entry BUYKUB) in parentheses

The well-known red complex nickel dimethylglyoxime is normally encountered in the quantitative analysis of inorganic nickel in solution. At the center of the molecule is the planar NiN4 structure, which is frequently found in nickel compounds in biochemical systems. PM6 predicts this with good accuracy (Fig. 8).

Fig. 8 Nickel dimethylglyoxime. Reference value (CSD entry NIMGLO10) in parentheses

One of the first polyhapto organometallic complexes discovered was Zeise’s salt. In the anion, [PtCl3(η2-C2H4)]−, platinum forms a synergic bond with an ethylene molecule. The calculated and X-ray structures of this complex are shown in Fig. 9.

Fig. 9 Zeise’s salt, trichloro-(η2-ethene)-platinate. Reference value (CSD entry XIVSAK) in parentheses

Group IB: Copper, Silver, and Gold

Copper phthalocyanine is an extremely stable blue dyestuff. As with nickel dimethylglyoxime, the planar CuN4 moiety at the center of the porphyrin ring is typical of many copper species of importance in biochemistry. PM6 reproduces it with very good accuracy (Fig. 10).

Fig. 10 Copper phthalocyanine. Reference value (CSD entry CUPOCY16) in parentheses

Dimethyl gold cyanide tetramer provides a good example of a square-planar AuIII complex. In this system, each gold atom forms covalent single bonds of length 1.99 Å (2.01) to the carbons of the methyl groups, a weaker, longer bond of length 2.12 Å (2.23) to the carbon of the cyanide group, and a still longer bond, 2.27 Å (2.23), to the nitrogen atom.

Gold also forms small planar clusters. PM6 predicts that neutral clusters of up to about nine gold atoms should be planar, an example being the D6h Au7 cluster, in which the Au-Au distance is predicted to be 2.71 Å (2.01). Clusters of up to 12 gold atoms are also predicted to be stable, provided the cluster has a single negative charge.

Group IIB: Zinc, Cadmium, and Mercury

These elements have completely filled d shells; therefore the valence shell can be limited to the s and p orbitals. As such, they behave like main-group elements.

Discussion

Methodological changes

During the development of PM6, only very minor changes were made to the set of approximations. The main change was in the construction of the training set used for parameter optimization. One of the most important changes was the use of rules in the training set to define chemical information that was not a function of any single molecule. In earlier methods the training set had included only standard reference data. Of their nature, such data could not allow for chemical facts that were independent of any one moiety. For example, the strength of a hydrogen bond is of great importance in biochemistry, but it could not be expressed in terms of a single species. By use of rules, the value of some chemical quantity could be related to that of another. In the case of hydrogen bonding, the heat of formation of the water dimer was made a function of the heat of formation of two separated water molecules.

Rules were particularly useful when elements of the three transition metal series were being optimized. Many complexes of these elements are highly labile, and, in the early stages of parameter optimization, there was a strong tendency for the optimized geometry of such complexes to be qualitatively incorrect. Faults of this kind could not be corrected by simply increasing the weight assigned to the correct geometry, so rules were developed to indicate that the faulty geometries were indeed incorrect. Specific points on the potential energy surface were selected, and from single-point high level calculations, the relative energy of these points above the minimum was evaluated. The points selected were precisely those qualitatively incorrect geometries resulting from the use of the then-current parameters. The fact that the incorrect geometry was predicted by high level methods to be of higher energy than the correct geometry was then added to the set of rules. A good example of such a rule was the rule concerning Fe(CO)5 mentioned above, in which the only datum that was defined referred to the relative energies of the compound in two different symmetries. No reference was made to the bond lengths, or bond angles. With such a rule in place, the parameters could be re-optimized to minimize the error arising from the rule, with the effect that the energy of the incorrect symmetry increased relative to that of the correct symmetry. In the majority of cases, one rule of this type was sufficient; less frequently, two rules were used, and, in rare cases, even more rules were necessary.
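A relative-energy rule of this kind might enter the error function as sketched here. The weighted squared form, the function name, and the energies are assumptions made for illustration; they are not the actual Fe(CO)5 rule or its values.

```python
def relative_energy_rule_error(e_correct, e_incorrect, ref_gap, weight=1.0):
    """Weighted squared error for one relative-energy rule (kcal/mol).

    ref_gap is the high-level estimate of how far the qualitatively
    incorrect geometry lies above the correct one. Only the energy gap
    is used; bond lengths and angles do not appear.
    """
    calc_gap = e_incorrect - e_correct
    return (weight * (calc_gap - ref_gap)) ** 2


# Hypothetical energies: with the old parameters the wrong symmetry is lower.
before = relative_energy_rule_error(e_correct=-10.0, e_incorrect=-12.0, ref_gap=3.0)
# After re-optimization the wrong symmetry lies above the correct one.
after = relative_energy_rule_error(e_correct=-10.0, e_incorrect=-7.0, ref_gap=3.0)
print(before, after)
```

Minimizing a term of this form raises the energy of the incorrect symmetry relative to the correct one, which is the effect described above.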

Another change was the use of very large reference data training sets. In earlier parameterizations, the training set used was deliberately made as small as possible. Only when the resulting method was used in a survey of species not used in the training set could the predictive power of the method be determined. The training set used in the development of PM6 was designed to be considerably larger than the survey set. The rationale for this was that, by including in the training set reference data for unconventional species, e.g., non-equilibrium and hypothetical species, a greater region of the error-function surface could be defined. This would, in turn, result in a better definition of the values of the parameters. That this is useful can be evidenced by the recent work in parameterizing chlorine at the AM1* level, where the compound 1,1′,2-trichloro-1,2,2′-trifluoroethane, C2Cl3F3, has a reported ΔHf of –173.7 kcal mol−1, but the value predicted using AM1* was –273.9 kcal mol−1. That is, the AM1* value was in error by over 100 kcal mol−1. If this compound had been included in the training set, it is highly likely that the error would have been significantly reduced.

Although over 10,000 reference data were used in the PM6 training set, there are several indications that even this large number is still inadequate for the definition of the values of the parameters, and that an even larger training set would be highly desirable. In light of this, work has begun on identifying species to be added to the training set. During the testing of PM6, several faults were found in the method. Some of these were quickly traced to specific core-core parameters. For example, one of the hydrogen atoms in the complex [ScIII(H2O)7]3+ was predicted to move readily toward the central atom, with the result that a Sc-H bond was formed. Such faults could easily be corrected by the addition to the training set of appropriate reference data from high-level calculations. This was done in several instances, and the specific error was corrected, but this action then also required all the testing to be re-started. Because this was a time-consuming process, when faults were found near the end of the testing phase, the decision was taken to note the fault, as in the Sc-H error mentioned here, and to take no further action at that time.

A different type of error, found only near the end of testing, was the unrealistically large p electron population of some transition metals. The values of the parameters that determine the p population are defined using two very different groups of reference data: atomic energy levels and conventional properties of polyatomics. If atomic energy levels were excluded from the parameter optimization, then the p population would become very small; but if atomic energy levels were excluded, then the resulting method would not be suitable for reproducing such levels. The decision to use all available atomic energy levels in the training set was a value judgement. In the next training set, it is likely that the result of this decision-making process will be different.

Detecting faults in semiempirical methods is difficult, and rather than wait until all errors of this type were found and fixed, a process that could potentially take several more years, the decision was made to freeze the parameters at their current value. Obviously, PM6 still has many errors; some have already been described. Work has already started in an attempt to correct them.

Elimination of computational artifacts

Earlier NDDO methods, particularly PM3 and AM1, produced artifacts in potential energy surfaces as a result of unrealistic terms in the core-core approximation, specifically in the set of Gaussian functions used. In PM6, only one Gaussian-type correction to the core-core potential is allowed, and, consequently, the potential for these artifacts has been reduced. On the other hand, because PM6 uses diatomic parameters, the likelihood of readily-characterized errors involving specific pairs of atoms, e.g. Sc and H, as mentioned earlier, is increased. Errors of this type can be easily eliminated by a re-parameterization of the faulty diatomic.

There are over 450 sets of diatomic interactions parameterized in PM6, covering most of the common types of chemical bonds. But the number of potential bonds is much larger: given 70 elements, there are almost 2500 diatomic sets. If a molecule contains two elements for which the diatomic interaction parameters are missing, then, provided the elements are well separated, say by more than 4 Ångstroms, the absence of the parameters will not be important. If the two elements were near to each other, then the diatomic core-core parameters would be needed. This would involve generating a small training set of reference data that included a few examples of the type of interaction involved, and optimizing the two terms in the diatomic interaction.
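The count of possible diatomic sets, and the check on whether a missing set matters for a given geometry, can be sketched as follows. The count assumes that like-atom pairs such as C-C are included, which reproduces the figure of almost 2500; the helper names, the example coordinates, and the example parameter table are hypothetical.

```python
from math import dist


def count_diatomic_sets(n_elements):
    """Number of unordered element pairs, including like-atom pairs."""
    return n_elements * (n_elements + 1) // 2


def missing_pairs_that_matter(atoms, parameterized, cutoff=4.0):
    """Return element pairs that lack diatomic parameters and are close.

    atoms: list of (symbol, (x, y, z)) with coordinates in Angstroms.
    parameterized: set of frozensets naming the available element pairs.
    cutoff: separation beyond which a missing set is taken to be harmless.
    """
    problems = []
    for i, (sym_a, xyz_a) in enumerate(atoms):
        for sym_b, xyz_b in atoms[i + 1:]:
            pair = frozenset((sym_a, sym_b))
            if pair not in parameterized and dist(xyz_a, xyz_b) <= cutoff:
                problems.append((sym_a, sym_b))
    return problems


print(count_diatomic_sets(70))

# Hypothetical geometry and parameter table, for illustration only.
atoms = [("Sc", (0.0, 0.0, 0.0)), ("H", (0.0, 0.0, 2.0)), ("Au", (0.0, 0.0, 10.0))]
parameterized = {frozenset(("H", "Au"))}
print(missing_pairs_that_matter(atoms, parameterized))
```

Only the Sc-H pair is reported in this example: the Sc-Au set is also missing, but the two atoms are well beyond the 4 Å separation.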

This ability to add diatomic parameter sets to PM6 without modifying the underlying parameterization has the advantage that more and more types of interaction can be added without changing the essential nature of the method.

Accuracy

PM6, being the most recent member of the NDDO family of approximate semiempirical methods, is understandably the most accurate. The development of each new method has been guided by the knowledge of the documented faults found in the earlier methods. This is reflected in the steady decrease in AUE of simple organic compounds, from 12.0 kcal mol−1 for AM1 to 4.9 kcal mol−1 for PM6.
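For reference, the average unsigned error (AUE) is the mean of the absolute differences between calculated and reference values. A minimal sketch follows; the function name and the three-compound data set are hypothetical and serve only to show the arithmetic.

```python
def average_unsigned_error(calculated, reference):
    """Mean absolute difference between calculated and reference values."""
    if len(calculated) != len(reference):
        raise ValueError("calculated and reference must have the same length")
    if not calculated:
        raise ValueError("at least one value is required")
    total = sum(abs(c - r) for c, r in zip(calculated, reference))
    return total / len(calculated)


# Hypothetical heats of formation in kcal/mol, for illustration only.
calculated = [-10.0, 4.0, 20.0]
reference = [-12.0, 5.0, 26.0]
print(average_unsigned_error(calculated, reference))
```

Because the errors are taken without sign, positive and negative deviations cannot cancel, so a low AUE cannot result from compensating errors.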

Several low-energy phenomena are predicted more accurately by PM6, with the most important of these being the prediction of the energies and geometries involved in hydrogen bonding. One consequence of this increased accuracy is that the lowest energy conformer of acetylacetone is now correctly predicted to be the ene-ol structure, and not the twisted di-one configuration.

Despite the improvement in hydrogen bonding, a significant error was found in the balance of energies involved in forming zwitterions of hydroxyl and amine groups. This is best illustrated by the dimer of 2-aminophenol, where PM6 predicts that the zwitterion should be 3.6 kcal mol−1 more stable than the neutral form, but higher level calculations indicate that the neutral form should be 17.7 kcal mol−1 more stable than the zwitterion. In the solid state (CSD entries AMPHOM01–AMPHOM10 [21]), 2-aminophenol exists as the neutral species.
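The size of this error is easier to see once both results are put on a common sign convention. The convention and variable names in the sketch below are assumptions made for illustration; the two energies are those quoted above.

```python
# Convention assumed for this sketch: delta = E(zwitterion) - E(neutral),
# so a negative value means that the zwitterion is the more stable form.
pm6_delta = -3.6        # PM6: zwitterion more stable by 3.6 kcal/mol
reference_delta = 17.7  # higher level: neutral more stable by 17.7 kcal/mol

# Signed error in the relative energy of the two forms.
signed_error = pm6_delta - reference_delta
print(f"{signed_error:.1f} kcal/mol")
```

On this convention the relative energy of the two forms is in error by about 21.3 kcal mol−1, considerably more than either quoted value alone would suggest.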

In general, however, average unsigned errors in ΔHf have steadily decreased as semiempirical methods have evolved. Earlier NDDO methods such as PM3 and AM1 had AUE significantly larger than the 6-31G* Hartree-Fock method. With the advent of PM5 and RM1, errors were intermediate between HF and B3LYP. In the current work, AUE in ΔHf are lower than those of both B3LYP and HF 6-31G*. This increase in accuracy of prediction of ΔHf relative to higher level methods should not be construed as disparaging those methods: semiempirical methods in general, and PM6 in particular, were parameterized to reproduce ΔHf. The performance of these methods when applied to non-equilibrium systems, in particular transition states, is likely to be markedly inferior to that of B3LYP or HF 6-31G*.

As a result of the current work, there is a clear strategy for further improving the accuracy of semiempirical methods. All three potential sources of error need to be addressed. Regarding reference data, considerably more data are needed than were used here. This would likely come from increased use of high-level theoretical methods: methods significantly more accurate than those used here would obviously be needed in any future work. Parameter optimization can be performed with confidence and reliability, particularly when well-behaved systems are used. In all cases examined where problems were encountered in parameter optimization, problems also occurred in the normal SCF calculation in MOPAC2007. This implies that as faults in the SCF procedure are corrected, faults in parameter optimization would also be removed.

Permanent errors

Notwithstanding the optimism just expressed, not all errors can be eliminated by better data and better optimizations. Despite strenuous efforts, some calculated quantities persistently failed to agree with the reference values. Many potential causes for these failures were investigated. In each case the weight for the offending quantity was increased considerably and the parameter optimization re-run. When that was done, the specific error decreased, but errors elsewhere increased disproportionately. Since the final gradient of the error function was acceptably small, it followed that the parameter optimization was not in error. The reference data were checked to ensure that they were in fact trustworthy. Because two of the three possible origins of error had been eliminated, the inescapable conclusion was that there is a fault in the set of approximations. The most serious of these faults was the qualitatively incorrect prediction of the geometry of the exceedingly simple system, iron pentacarbonyl.

Conclusions

The NDDO method has been modified by the adoption of Voityuk’s core-core diatomic interaction parameters. This has resulted in a significant reduction in error for compounds of main-group elements, and, together with Thiel’s d-orbital approximation, allows extension of the NDDO method to the whole of the transition metal block.

The accuracy of PM6 in predicting heats of formation for compounds of interest in biochemistry is somewhat better than that of Hartree-Fock or B3LYP DFT methods using the 6-31G(d) basis set. For a representative set of compounds, PM6 gave an average unsigned error of 4.4 kcal mol−1; for the same set, HF and B3LYP had AUE of 7.4 and 5.2 kcal mol−1, respectively.

The potential exists for further large increases in accuracy. This would likely result from the increased use of accurate reference data derived from high-level methods, and from the development of better tools for detecting errors at an early stage of method development.