1 Introduction

The use of linear free energy relationships to study or predict chemical properties in different solvents is well established. Thus, one writes:

$$ Y = Y^{0} + \sum {s_{i} P_{i} } $$
(1)

where \(Y\) represents some chemical property, \(Y^{0}\) is the value of \(Y\) in a hypothetical reference state where all \(s_{i}P_{i}\) terms are zero, the \(P_{i}\) represent solute or solvent properties, and the \(s_{i}\) are the responses of \(Y\) to those properties. In essence, each \(s_{i}P_{i}\) term represents a different type of solute–solvent interaction contributing to \(Y\); so, for example, if \(P\) is a measure of solvent hydrogen bond donor acidity, then \(s\) should reflect the hydrogen bond acceptor basicity of the solute and \(sP\) the contribution of this interaction to \(Y\). In applying Eq. 1, one can begin with parameters, \(P_{i}\), for a series of solvents and recover the responses, \(s_{i}\), of a solute by regression of \(Y\) or, equally, begin with \(P_{i}\) values for a series of solutes and regress these to determine the responses, \(s_{i}\), of a solvent. Both approaches have been adopted.
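As a minimal sketch of the regression step, the responses in Eq. 1 can be recovered by ordinary least squares once Y is known for several solvents (or solutes) with known parameters. The function name `fit_lfer` and the synthetic numbers are illustrative assumptions, not data or code from the cited studies.

```python
import numpy as np

def fit_lfer(P, Y):
    """Fit Y = Y0 + sum_i s_i * P_i by ordinary least squares.

    P : (n_samples, n_params) array of solvent (or solute) parameters
    Y : (n_samples,) array of measured property values
    Returns (Y0, s), where s has shape (n_params,).
    """
    P = np.asarray(P, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # A leading column of ones lets the intercept Y0 be fitted with the s_i.
    design = np.column_stack([np.ones(len(Y)), P])
    coeffs, *_ = np.linalg.lstsq(design, Y, rcond=None)
    return coeffs[0], coeffs[1:]

# Synthetic, noise-free illustration (hypothetical values only).
P = np.array([[0.0, 0.1], [0.5, 0.3], [1.0, 0.2], [0.2, 0.9], [0.8, 0.7]])
Y = 1.5 + P @ np.array([2.0, -0.5])
Y0, s = fit_lfer(P, Y)
```

With noise-free input the fit returns the intercept and responses used to generate Y; with real data the residuals carry the information about how well the chosen parameters describe the property.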

A number of experimental parameters representing different molecular properties have been developed, including Kamlet and Taft’s α and β [1, 2], Kamlet, Taft and Abboud’s π* [3], Abraham’s A, B and S [4,5,6], Reichardt and Dimroth’s ET(30) [7], Catalan’s SB, SA, SP and SdP [8,9,10,11,12,13], and Gutmann’s donor and acceptor numbers [14], among others.

These experimental LSER descriptors are intended as measures of the ability of a solute or solvent to participate in particular interactions: hydrogen bonding, Lewis acid base interactions, and non-specific interactions, generally lumped together as polarity/polarizability. However, each experimental LSER descriptor is captive to the experimental procedure used to determine it.

This raises several questions about the experimental parameters, which have been examined in a series of papers [15,16,17,18,19] in which the experimental parameters are correlated against molecular properties derived from computational chemistry. By examining several descriptors that represent the solvent or solute contribution to the same interaction, one can try to separate the molecular properties that are common to the descriptors from those that are incidental consequences of the experimental method.

An example of this is the analysis of Kamlet, Taft, and Abboud’s π* [3] and Reichardt and Dimroth’s ET(30) [7] parameters, both nominally measures of solvent polarity/polarizability. These are found to share three contributions: both increase with increasing molecular dipole moment and quadrupolar amplitude but decrease with increasing molecular polarizability [16,17,18] (see Table 5 and Sect. 3.3). There are also contributions that are not shared: (i) a strong dependence of ET(30) on the charge of the most positive hydrogen atom of the solvent molecule, reflecting the well-known sensitivity of the probe molecule to hydrogen bonding at the pendant oxygen [2, 17, 18] and (ii) for π*, a dependence on the energy of the electron donor orbital. It is reasonable to consider the last two dependences to be incidental, resulting from the choices of probe molecules.

This leaves the common contributions: (i) from the solvent molecular dipole moment and quadrupolar amplitude, which could reasonably be attributed to the solvent polarity and (ii) the negative contribution from the solvent polarizability. The differences in the signs for the polarity and polarizability contributions simply reflect the fact that increased polarity stabilizes the form of the probe with the greater charge separation while increased polarizability stabilizes the less charged form. However, it also points to a problem with the use of these descriptors, in that polarity and polarizability affect both π* and ET(30) in opposite directions.

The approach adopted in these studies is straightforward. The gas phase structure of each solute or solvent molecule is optimized and seven calculated properties are recovered; these are the partial charges on the most negative atom and on the most positive hydrogen atom, the energies of electron donor and acceptor orbitals and the molecular dipole moment, quadrupolar amplitude and polarizability. The solvent or solute parameters are regressed against measures of the molecular properties and the relative contributions of the different properties to the experimental parameter are recovered.

The present study concerns Abraham’s A and S parameters, which are measures of the hydrogen bond acidity and polarity/polarizability of molecules as solutes rather than as solvents.

2 Computational Details

The procedure used has been described in detail previously and is only outlined here.

Simply, a linear relationship between the experimental parameter, P, and the molecular properties is assumed, giving Eq. 2,

$$ P = P^{0} + \sum {a_{i} Q_{i} } $$
(2)

where \(P^{0}\) is the value of the parameter when all of the \(a_{i}Q_{i}\) terms are zero, the \(Q_{i}\) are normalized descriptors providing measures of the individual molecular properties, and the \(a_{i}\) reflect the responses of \(P\) to those properties.

The molecular properties considered are the partial charges on the most negative atom and the most positive hydrogen atom of the molecule, the energies of electron donor and acceptor orbitals, and the molecular polarizability, dipole moment, and quadrupolar amplitude.

Calculated molecular properties depend both on the calculation method and on the basis set used; thus, the properties were calculated by both ab initio (Hartree–Fock) and density functional theory (using the B3LYP functional) methods. Given the form of Eq. 2, it is the variation in the calculated molecular properties, rather than their actual values, that is important. For the molecular polarizabilities, dipole moments and quadrupolar amplitudes, plots of values calculated using the two methods against each other are linear, so that the choice of method is immaterial. This is not true for the orbital energies, where there is considerable scatter in these plots [15].

The situation with regard to the partial charges on atoms is more complicated, since these are not quantum mechanical observables and have to be estimated using a model that assigns electron charge density to individual atoms. This was considered in some detail previously [15, 16, 19]. In earlier work [16], charges based on Hirshfeld’s model [20], the CM5 model [21] and the natural bond orbital, NBO, model [22] were considered. It was found that the Hirshfeld and NBO models gave very similar results. In considering the Abraham hydrogen bond acceptor parameter, B [19], it became clear that the CM5 model overestimated the partial charges on nitrogen atoms, for amides at least, and so, in this paper, we consider only charges recovered using Hirshfeld’s model. It should be pointed out that in previous studies, the analyses based on charges from the three different models gave similar results [16].

Calculations were carried out using both the Hartree–Fock and density functional (B3LYP functional) methods and the 6-311+G(3df,2p) basis set. The Gaussian 09 software suite was used [23]. The full list of calculated properties is provided in the Supplementary Material.

3 Analysis and Discussion

3.1 Procedure

All data for A and S were taken from [4].

To allow comparison of the contributions from different molecular properties, it is convenient to have a normalized set of descriptors and, as before [15], the molecular descriptors were calculated as

$$ Q_{X} = \frac{{\left( {X_{{{\text{max}}}} - X} \right)}}{{\left( {X_{{{\text{max}}}} - X_{{{\text{min}}}} } \right)}} $$
(3)

where \(X\) represents the molecular property and the subscripts max and min refer to the maximum and minimum calculated values of \(X\) (note that, for the negative charge, for example, \(X_{\text{max}}\) is the largest negative charge). For all properties except the orbital energies, the \(X_{\text{min}}\) values were set to zero rather than to the lowest recovered value.

Normalizing the solute descriptors provides two advantages: firstly, the coefficients, \(a_{i}\), of Eq. 2 indicate the relative contributions from the different molecular properties and, secondly, the \(Q_{X}\) values are dimensionless and so independent of the units of the calculated molecular properties.
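The unit independence can be checked with a short sketch that applies Eq. 3 exactly as written above. The function name `normalize`, the dipole moments and the rounded conversion factor are assumptions introduced for illustration only.

```python
def normalize(values, x_min=None):
    """Return Q_X = (X_max - X) / (X_max - X_min) for each X (Eq. 3 as written).

    x_min defaults to the smallest value in `values`; pass x_min=0.0 to mimic
    the choice made for all properties except the orbital energies.
    """
    x_max = max(values)
    if x_min is None:
        x_min = min(values)
    span = x_max - x_min
    if span == 0:
        raise ValueError("all values identical; Q_X is undefined")
    return [(x_max - x) / span for x in values]

# Hypothetical dipole moments, first in debye and then converted to C m.
values_debye = [0.0, 1.7, 2.9, 3.9]
DEBYE_TO_CM = 3.33564e-30  # approximate conversion factor
q_debye = normalize(values_debye, x_min=0.0)
q_si = normalize([v * DEBYE_TO_CM for v in values_debye], x_min=0.0)
```

Both lists agree to within rounding, because the conversion factor cancels between numerator and denominator. With Eq. 3 in this form, the largest value of the property maps to 0 and a value equal to the chosen minimum maps to 1.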

Equation 2 is a simple multivariable regression. However, the statistical tools normally used to judge the quality of the fit are not applicable, because they assume a normal distribution of deviations, which is not the case here: both the calculated properties and the experimental parameters are relatively constant for compounds with the same functional group, so that the deviations are non-statistical. The quality of each fit was therefore assessed solely in terms of the standard deviation between the calculated and experimental data.
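The exact formula used for the standard deviation is not given here, so the following is only a sketch under an assumed definition: the root-mean-square deviation between calculated and experimental values, with an optional correction for the number of fitted coefficients. The function names and the data are hypothetical.

```python
import math

def fit_sigma(calculated, experimental, n_fitted=0):
    """Standard deviation of (calculated - experimental).

    n_fitted is the number of fitted coefficients subtracted from the
    denominator; 0 gives the plain root-mean-square deviation.
    """
    if len(calculated) != len(experimental):
        raise ValueError("length mismatch")
    dof = len(calculated) - n_fitted
    if dof <= 0:
        raise ValueError("not enough data points")
    ss = sum((c - e) ** 2 for c, e in zip(calculated, experimental))
    return math.sqrt(ss / dof)

def outside_channel(calculated, experimental, sigma):
    """Indices of points whose deviation lies outside the +/- sigma channel."""
    return [i for i, (c, e) in enumerate(zip(calculated, experimental))
            if abs(c - e) > sigma]

calc = [0.10, 0.35, 0.60, 0.20]  # hypothetical calculated values
expt = [0.10, 0.30, 0.60, 0.50]  # hypothetical experimental values
sigma = fit_sigma(calc, expt)
outliers = outside_channel(calc, expt, sigma)
```

Points flagged by `outside_channel` correspond to those lying outside the dashed ± σ lines in plots of calculated against experimental values.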

3.2 Abraham’s A Parameter, a Measure of Hydrogen Bond Donor Acidity

The Abraham hydrogen bond acidity and basicity parameters were originally determined from the log10 K values for the hydrogen bond formation [46]:

$$ {\text{AH }} + {\text{ B}} \rightleftharpoons {\text{AH}}\cdot\cdot\cdot\cdot{\text{B}} $$
(4)

where AH and B represent an acid and a base and the reaction is carried out in CCl4, which was taken to be an inert solvent in the sense that it does not hydrogen bond strongly with AH or B. The log10 K values for a series of acids against different reference bases were used to determine solute hydrogen bond acidities, A, from the linear plots of log10 K values for different reference bases against each other [4, 5].

Analysis of the A values was straightforward and showed that the only molecular property that correlated with A was the charge on the most positive hydrogen atom in the molecule. The results of the regression are listed in Table 1 for charges calculated using Hirshfeld’s model for molecules optimized by density functional and Hartree–Fock methods. Also shown in Table 1 are the regression results for Kamlet and Taft’s α, which is also a measure of hydrogen bond acidity. Again, only the charge on the most positively charged hydrogen atom was found to correlate with α. In carrying out the correlations, only substances having non-zero A or α values were included in the regression.

Table 1 Values of the coefficients of Eq. 2 and standard deviation between calculated and experimental Abraham A and Kamlet–Taft α values

Figure 1 shows plots of the calculated values of A and α against the experimental values.

Fig. 1 Plot of (a) Abraham A values calculated using Eq. 2 and the coefficients listed in Table 1 for Hartree–Fock calculations. Symbols: orange circles, alkanes; orange squares, alkenes; orange triangles, alkynes; orange diamonds, halo alkanes; blue triangles, aldehydes; blue squares, ketones; blue circles, cyano compounds; blue diamonds, nitro compounds; brown circles, carboxylic acids; brown triangles, esters; green triangles, alcohols; green circles, aromatic alcohols; green diamonds, thiols; purple circles, primary amines; purple squares, secondary and tertiary amines; purple triangles, amides; red circles, ethers; light blue circles, aromatics; light blue triangles, chloro benzenes; light blue diamonds, bromo benzenes; yellow squares, dialkyl sulfides. The dashed lines are ± σ (Color figure online)

A and α differ from other experimental parameters, such as those for hydrogen bond acceptor basicity or polarity/polarizability, in that the majority of substances have zero values. In effect, there must be a minimum partial charge on the most positive hydrogen atom before A and, by extension, the K of reaction 4 become non-zero. This is clear from plots of A or α against q+, the partial charge on the most positive hydrogen atom (see Fig. S1 in the Supplementary Material).

Both A and α are strongly influenced by steric effects. Thus, for example, the four aromatic alcohol A values that lie to the left of the error channel (calculated A value greater than the experimental value) are, from left to right, those for 2-nitro-, 2-methoxy-, 2-chloro- and 2-bromophenol. Those for the 3- and 4-substituted methoxy-, chloro- and bromophenols lie within the error channel, as do those for 2-, 3- and 4-fluorophenol, while those for 3- and 4-nitrophenol lie marginally to the right of the channel. The effect of 2-substitution on the phenol might reflect simple crowding in the vicinity of the hydroxyl group or, possibly, intramolecular hydrogen bonding between the OH group and the substituent. The fact that the observed effect occurs in the order 2-fluorophenol << 2-chlorophenol < 2-bromophenol argues for steric crowding.

The values for two acids lie well to the right of the channel; these are for di- and tri-chloroacetic acid. The value for chloroacetic acid lies just at the right margin of the channel. In these cases, the experimental A is greater than that calculated using Eq. 2.

Similar patterns are observed for the α values plotted in Fig. 1b. Thus, the alcohols lying to the left of the error channel include t-butanol and t-pentanol, for which steric hindrance of access to the OH hydrogen atom is expected. The other alcohols lying to the left contain aromatic rings: benzyl alcohol, 2-phenylethanol and 3-phenylpropan-1-ol. It is not clear whether these reflect steric effects or relate to the presence of the benzene ring, but it can be noted that phenol and 3-chlorophenol are among the alcohols lying to the right of the error channel.

The other alcohols lying to the right are the halogenated 2-chloroethanol, trifluoroethanol, hexafluoropropan-2-ol, and the triol, glycerol.

3.3 Abraham’s S Parameter, a Measure of Solute Polarity/Polarizability

Unusually, Abraham’s S parameter was not determined from solution phase experiments but from gas chromatographic experiments using non-polar columns.

Initial correlations of the S values of non-aromatic molecules indicated that two molecular properties correlated with S: the charge on the most negative atom and the molecular dipole. There was also a weak correlation with the energy of the acceptor orbital (taken as the LUMO) but this made only a marginal improvement to the agreement between the experimental and calculated values (see Table 2). Values for aromatic compounds calculated using the parameters recovered from these analyses lay to the right of the error channel; that is, the calculated values were significantly smaller than the experimental.

Table 2 Values of the coefficients of Eq. 2 and standard deviation between calculated and experimental Abraham S values

It was found that the inclusion of the polarizabilities of single ring aromatic compounds brought most of their calculated S values into line with those of non-aromatic compounds. The coefficients recovered from the analyses are reported in Table 2 and the calculated and experimental values are compared, for results based on Hartree–Fock calculations, in Fig. 2; the plot for the results based on the density functional calculations is similar.

Fig. 2 Plot of Kamlet–Taft α values, calculated using Eq. 2 and the coefficients listed in Table 1 for Hartree–Fock calculations. Symbols: orange circles, alkanes; light blue circles, aromatics; green circles, alcohols; red circles, ethers; brown circles, carboxylic acids; blue triangles, esters; blue squares, ketones; purple circles, amines; blue circles, cyano compounds; purple triangles, amides; orange triangles, sulfides, sulfoxides, phosphates, and nitro compounds. The dashed lines are ± σ (Color figure online)

The experimental S values for aromatic compounds with more than one ring are greater than those calculated using the coefficients in Table 2. This can be seen in Table 3, which lists the experimental S values for compounds with one, two, and three aromatic rings. Thus, the values for compounds with two aromatic rings are essentially double and those of anthracene and phenanthrene are about 2.4 times those of single ring compounds, although the molecular polarizabilities are similar, Fig. 3.

Table 3 Experimental S values for aromatic substances

Fig. 3 Plot of Abraham S values calculated using Eq. 2 and the coefficients listed in Table 2 for Hartree–Fock calculations. Symbols: orange circles, alkanes; orange squares, alkenes; orange triangles, alkynes; orange diamonds, halo alkanes; blue triangles, aldehydes; blue squares, ketones; blue circles, cyano compounds; blue diamonds, nitro compounds; brown circles, carboxylic acids; brown triangles, esters; green triangles, alcohols; green circles, aromatic alcohols; green diamonds, thiols; purple circles, primary amines; purple squares, secondary and tertiary amines; purple triangles, amides; red circles, ethers; light blue circles, single ring aromatics; light blue triangles, chloro benzenes; light blue diamonds, bromo benzenes; yellow squares, dialkyl sulfides. The dashed lines are ± σ (Color figure online)

The experimental S values of substituted phenols also show steric effects, with the S values of 2-substituted phenols being systematically lower than those for the 3- and 4-substituted phenols (see Table 4). Moreover, the S values of the 2-substituted phenols decrease in the order: 2-fluoro- > 2-chloro- ≥ 2-bromo- > 2-methoxy- > 2-nitrophenol.

Table 4 Experimental S values for substituted phenols

Thus, S values show strong dependences on the solute dipole moment, on the charge of the most negative atom in the molecule, almost invariably a heteroatom, and on the polarizability of aromatic compounds with single rings but not on the polarizability of aliphatic solutes. The regressions also recover an apparent contribution from the energy of the acceptor orbital, but this makes a negligible contribution to improving the standard deviation and is, tentatively, taken to be an artifact.

Ultimately, the value of any parameter reflects the experimental method used to determine it. The S values were determined through gas chromatography with non-polar columns and so reflect the probability of a gas phase molecule adsorbing on the non-polar surface.

The dependence of S on the molecular polarizability is interesting. In the case of aliphatic substances, the molecular polarizability increases with the length of the molecule’s alkyl chain. However, in the gas phase, where only intramolecular interactions are available, it is likely that longer alkyl chains will coil to maximize these interactions, rather like the coiling of polymer molecules in poor solvents.

In an elastic collision with the surface of the column, a molecule is within 5 Å of the surface for less than \(10^{-10}\) s, which is too short a time for uncoiling of the alkyl chain. Of course, aromatic molecules remain as flat rings in the gas phase and so, depending on the geometry of the collision with the surface, are able to adsorb without molecular rearrangement. The fact that the second aromatic rings of diphenylmethane, biphenyl and naphthalene essentially double the experimental S values is consistent with the rings acting independently, the second ring essentially doubling the probability that the molecule will adsorb. The S values of anthracene and phenanthrene show smaller increases, but again, the probability of adsorption increases with the third ring.
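A rough kinetic-theory estimate is consistent with the time bound quoted above. The sketch below assumes normal incidence, the mean thermal speed \(\sqrt{8RT/\pi M}\), a molar mass of 100 g/mol and a temperature of 400 K; these values and the function names are illustrative assumptions, not taken from the original work.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1

def mean_speed(molar_mass_kg, temperature_k):
    """Mean molecular speed from kinetic theory, sqrt(8RT / (pi M)), in m/s."""
    return math.sqrt(8 * R * temperature_k / (math.pi * molar_mass_kg))

def residence_time(distance_m, molar_mass_kg, temperature_k):
    """Time spent within distance_m of the surface: in and back out again."""
    return 2 * distance_m / mean_speed(molar_mass_kg, temperature_k)

# Assumed values: 5 angstrom approach distance, M = 0.100 kg/mol, T = 400 K.
t = residence_time(5e-10, 0.100, 400.0)
```

Under these assumptions the mean speed is roughly 290 m/s and the molecule spends a few picoseconds within 5 Å of the surface, comfortably below \(10^{-10}\) s. Oblique collisions lengthen the path, but not by the factor of about 30 needed to reach that bound for typical angles.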

In the cases of the basicity parameters, β, DN and B, and acidity parameters α and A, there is general agreement as to the molecular properties contributing to the parameters.

However, the coefficients recovered for the S values differ markedly from those recovered from similar analyses of Kamlet and Taft’s π* [16] and Reichardt and Dimroth’s normalized \(E_{{\text{T}}}^{{\text{N}}} \left( {30} \right)\) parameter [17], which are shown in Table 5.

Table 5 Values of the coefficients of Eq. 2 and standard deviation between calculated and experimental Kamlet, Abboud, and Taft π* and Dimroth and Reichardt \(E_{{\text{T}}}^{{\text{N}}} \left( {30} \right)\) values

The well-known dependence of \(E_{{\text{T}}}^{{\text{N}}} \left( {30} \right)\) on the charge on the most positive hydrogen atom of the solvent simply reflects hydrogen bonding to the pendant oxygen atom of the betaine dye [2, 17, 18]. The reason for the dependence of π* on the energy of the acceptor orbital is less clear.

The other coefficients show good qualitative agreement, both π* and \(E_{{\text{T}}}^{{\text{N}}} \left( {30} \right)\) having approximately equal positive contributions from the dipole moment and quadrupolar amplitude and a negative contribution coming from the molecular polarizability. The different signs of the coefficients simply indicate that the polarizability stabilizes the less charged form of the probe molecule while the permanent charges on the molecule, as measured by the dipole moment and quadrupolar amplitude, stabilize the more highly charged form of the probe molecule.

In contrast, S values show strong positive dependences on the solute dipole moment but not on the quadrupolar amplitude, on the polarizabilities of single ring aromatic compounds but not on those of aliphatic compounds, and on the charge of the most negative atom in the molecule.

While these differences may result from differences in the experimental procedures from which the parameters are derived, they could also reflect the fact that S is explicitly a solute parameter, reflecting the properties of the isolated solute molecule, while π* and \(E_{{\text{T}}}^{{\text{N}}} \left( {30} \right)\) are properties of the bulk solvent. Thus, for example, for solvents the dipole moment and quadrupolar amplitude are used simply as quantitative measures of the scale of the charge centers embedded in the bulk solvent, that is, the “intensity” of the embedded charges or the polarity of the solvent. In essence, the polarity represents a solvent’s ability to stabilize the solute charges through non-specific interactions. It may be that the combination of the molecular dipole moment and partial charge on the most negative atom effectively captures the scale of charges on the solute.

However, the absence of a consistent dependence on the molecular polarizability is difficult to rationalize in the context of a polarity/polarizability parameter.

4 Conclusions

The analysis of Abraham’s hydrogen bond donor parameter, A, provides a quite simple picture, where A, like Kamlet and Taft’s α, depends on the charge of the most positive hydrogen atom of the molecule modified by steric effects.

The analysis for the S parameter presents a more complex situation, with dependences on the molecular dipole moment, charge on the most negative atom and, for single ring aromatic molecules, on the molecular polarizability. These differ markedly from the dependences found for both Kamlet and Taft’s π* and Reichardt’s \(E_{{\text{T}}}^{{\text{N}}} \left( {30} \right)\).