Quantum mechanical polar surface area

A correlation has been established between the absorbed fraction of training-set molecules after oral administration in humans and the Quantum Mechanical Polar Surface Area (QMPSA). This correlation holds for the QMPSA calculated with structures where carboxyl groups are deprotonated. The correlation of the absorbed fraction and the QMPSA calculated on the neutral gas phase optimized structures is much less pronounced. This suggests that the absorption process is mainly determined by polar interactions of the drug molecules in water solution. Rules are given to derive the optimal polar/apolar ranges of the electrostatic potential.


Introduction
The polar surface area (PSA) has been used successfully to predict the absorption of drugs [1]. The polar surface area is defined as the combined surface area belonging to oxygen and nitrogen atoms and hydrogen atoms bound to these electronegative atoms. Methods to improve the correlation between polar surface area and absorption of drugs evolved in the years thereafter. The Dynamic Polar Surface Area (DPSA) is derived from Boltzmann-averaged ensembles of low energy molecular conformations [2]. The Topological Polar Surface Area (TPSA) is fragment-based methodology which derived standardized contributions to the molecular polar surface area from functional groups and atom types [3]. Various protocols have been reported to calculate the PSA on different surfaces (van der Waals [1], Connolly, or solvent accessible surface [4].
The term polar surface area suggests that the absorption is related to the physical interaction of surfaces through their electrostatic potential. In this work we present a study of the correlation of the quantum mechanical electrostatic potential and the absorption of drugs in humans. In the original work by Palm et al. [1] the surface was constructed by intersecting atomic spheres defined by van der Waals radii. In line with our quantum mechanical approach in this work the electrostatic potential will be calculated on a surface with constant electron density or isodensity surface.
The algorithm by Palm et al. was incorporated to our MOLDEN molecular modeling package [5] for comparison. Choices have to be made about which value of the electron density the isodensity surface is most suited to calculate the electrostatic potential.
Secondly, a range of the electrostatic potential has to be defined as polar and a complementary range as apolar. The sum of all parts of the isodensity surface with an electrostatic potential in the polar range is then defined as the polar surface area.
Below a concise summary on how these structures were calculated and used in this work. Low-energy conformations for each molecule in the training-set were obtained from Monte Carlo multiple minimum (MCMM) searching, using the OPLS-AA force field.
A validation set of compounds with absorbance data in humans, was obtained from J.Kelder et al. [11]. The validation set will be used to determine whether the optimized ranges of the electrostatic potential for polar and apolar surface are also valid for an independent set of molecules.
Single-point energy calculations were performed with the optimized geometries by the program Gamess-US [12] at the B3LYP/6-31G** level, to generate the wave function files required for the calculations of the electron density and electrostatic potential on a three dimensional grid or cube file with the Molden (version 4.7) program [5].
These cube files are subsequently used to map the electrostatic potential onto an isodensity surface with the Molden program. The isodensity surface is represented as a collection of triangles, calculated with the marching cube algorithm implemented in Molden. The electrostatic potential of the vertices of each triangle is interpolated from the potential of the eight grid points of the cube marching [13] over the three dimensional grid. The polar surface area is calculated as the sum of the triangular areas with the potential in the polar range. When not all vertices are in the polar range the triangles are subdivided into four smaller triangles. This process is repeated until all vertices are either in the polar range or all in the apolar range.
Sigmoidal fits between the QMPSA and the fraction absorbed in humans (FA) were performed using the four parameter Weibull equation: Where a, b, c and d are parameters to be fitted.

Isodensity surfaces
Varying the value of the electron density of the isodensity we established that a value of 0.0005 electrons/bohr 3 would give a surface most compatible with the van der waals surface used by Palm et al. [1]. Table 1 shows the total surface area for the training-set molecules with both methods. The root mean square deviation is 9 Å 2 which is around 3%. The electron density that best matches the van der Waals surface is relatively low. Quantum mechanical methods optimize the electron density with respect to the energy.
An artifact of these methods is that the contribution to the electron density by energy rich inner shell electrons/ orbitals is optimized at the expense of that of the outer shell and valence electrons/orbitals, when not using a complete basis-set [14]. In the same spirit it can be argued that the electron density at locations that contribute higher to the energy is optimized at the expense of quality of the electron density at locations that contribute less to the energy. In order to avoid/evaluate this complication the Quantum Mechanical Polar Surface Area (QMPSA) will also be evaluated on isodensity surfaces with a higher electron density.
Polar and apolar range of electrostatic potential Experimenting with upper and lower bound of the electrostatic potential defined as apolar, the following observations were made. Choosing the upper bound of the apolar electrostatic potential (ESP apolar,high ) too low results in hydrogens connected to phenyl rings to contribute to polar surface with positive electrostatic potential (see Fig. 1 and Fig. 2). Conversely, choosing the lower bound of the apolar potential (ESP apolar,low ) too high, results in the electron density above and under the phenyl rings to contribute to the polar surface area with negative electrostatic potential. Phenyl rings constitute a major part of the isodensity surface, and since these atoms do not contribute Correlation between QMPSA and the absorbed fraction of training-set molecules Table 2 shows the calculated QMPSA and the absorbed fraction (FA) of traning-set molecules [1], on the isodensity surfaces with density values 0.01 and 0.0005 e/Bohr 3 . Upper and lower bound of the polar electrostatic potential were chosen according to the rules set out in the previous section and were optimized to yield the highest correlation coefficient between the QMPSA and the absorbed fraction of the training-set molecules.
The correlation between the absorbed fraction of the training-set molecules and the QMPSA is relatively low compared to the reported sigmoidal correlation between topological polar surface area and the absorbed fraction. Correlation coefficients for a linear fit are 0.68 and 0.30 for density values 0.01 and 0.0005 e/bohr 3 respectively. Correlation coefficients for a sigmoidal fit are slightly better: 0.74 and 0.36 for density values 0.01 and 0.0005 e/bohr 3 respectively.
A graphical inspection of the QMPSA revealed the reason for the often relatively low values of the QMPSA with respect to the topological polar surface area. Figure 3 shows the electrostatic potential mapped onto the isodensity surface for the training-set molecule sulfasalazine. Figure 4 shows the structural formula of sulfasalazine. QMPSA surface areas are marked with the red and blue colors. Red and blue represent polar surface areas with respectively positive and negative electrostatic potentials. The hydrogen of hydroxyl group attached to the phenyl ring points towards the oxygen of the carboxyl group in the optimized sulfasalazine structure. The positive electrostatic potential exerted by the hydroxyl hydrogen cancels out the negative electrostatic potential exerted by the carboxyl oxygen. The resultant potential has a low absolute value and is therefore classified as an apolar potential.
In general optimized structures in the gas phase will tend to have their electronegative atoms (O, N) oriented towards electropositive counterparts (H), whereas molecules in a polar solvent such as water will tend to have both their electronegative and electropositive atoms accessible for interaction with the solvent.
The absorbed fraction pertains to the fraction of molecules in solution, absorbed into the apolar membranes of the gut. Gas phase optimized structures are therefore best suited to represent the absorbed state of the training-set molecules. The water solved state of the molecules can probably best be represented by taking into account the neutral species and the deprotonated species, with their electronegative atoms accessible to the solvent.

Influence of the protonation state of acids
The carboxyl group of the molecules in our training-set can lose their proton depending on the pH with respect to the acid's pK a . In our training-set three molecules contain a carboxyl group: ciprofloxacin, sulfasalazine and olsalazine. The latter two have pK a 's such that the they are dissociated at the pH of the gut (pH = 5.7-6.6 [15]). Ciprofloxacin however has a pK a that falls in the pH range of the gut (pK a = 6.09 [16]). We assume therefore that half of the ciprofloxacin molecules are dissociated and the other half are not. The QMPSA for these three molecules should (also) be calculated on the deprotonated species. The QMPSA of ions is dominated by the charge center and is therefore much larger than their neutral counterparts. For sulfasalazine for example the QMPSA for the anion is 198.7 Å 2 versus 56.0 Å 2 for the neutral species. In aqueous solution however, counter ions are always present at some distance.  Figure 5 shows the electrostatic potential mapped onto the isodensity surface for the deprotonated training-set molecule sulfasalazine. Table 3 shows the calculated QMPSA and the absorbed fraction (FA) of training-set molecules with carboxyl groups deprotonated, on the isodensity surfaces with density values 0.01 and 0.0005 e/bohr 3 . The correlation between the absorbed fraction of the training-set molecules and the QMPSA is significantly better compared to that of the neutral gas phase optimized structures.
Correlation coefficients for a linear fit are 0.87 and 0.84 for density values 0.01 and 0.0005 e/bohr 3 respectively. Correlation coefficients for a sigmoidal fit are also better:   Figure 6 and Fig. 7 show respectively the linear and sigmoidal correlation between the absorbed fraction of training-set molecules with carboxyl groups deprotonated, and the QMPSA calculated on the isodensity surface with density value 0.01 e/bohr 3 .
Correlation between QMPSA and the absorbed fraction of validation-set molecules Table 4 shows the calculated QMPSA and the absorbed fraction (FA) of validation-set molecules with deprotonated carboxyl groups, on the isodensity surfaces with density values 0.01 and 0.0005 e/bohr 3 . Correlation coefficients for a linear fit are 0.96 and 0.95 for density values 0.01 and 0.0005 e/bohr 3 respectively. Correlation coefficients for a sigmoidal fit are slightly better and worse: 0.98 and 0.87 for density values 0.01 and 0.0005 e/bohr 3 respectively. Figure 8 and Fig. 9 show respectively the linear and sigmoidal correlation between the absorbed fraction of combined validation-and training-set molecules and the QMPSA calculated on the isodensity surface with density value 0.01 e/bohr 3 , with correlation coefficients of 0.86 and 0.92 respectively. The combined set shows an equally good sigmoidal correlation compared to the trainingset alone (0.92 versus 0.92). Although the isodensity

Relation between gas phase and solvent optimised structures
To assess the influence of gas phase versus solvent optimized structures on the correlation between QMPSA and absorbed fraction in humans, the training set of molecules was optimized with an explicit water for each hydrogen bond donor using the Polarizable Continuum solvent Model (PCM) [17]. The 4-31G* basis set was used at the Hartree-Fock level of theory. After optimization the explicit waters are removed and the QMPSA is calculated at the B3LYP/6-31G** level of theory at the PCM optimized geometries. Table 5 shows the results. Comparing with the gas phase approach, the QMPSA changes are small (RMSD 1.73 Å 2 ). Correlation coefficients for a linear and sigmoidal fit are 0.86 and 0.91 respectively for isodensity values 0.01 e/bohr 3 (versus 0.87 and 0.92 respectively for gas phase approach).
The correlation between QMPSA and absorbed fraction in humans is expected to improve when using ensembles of low energy molecular conformations as in the Dynamic Polar Surface Area method [2], but was not further investigated. Figure 10 mannitol with six explicit waters optimized with the PCM solvent model at the Hartree-Fock level of theory with the 4-31G* basis set.
QMPSA basis set and level of theory dependency Table 6 shows that the QMPSA is in general weakly dependent on the used basis set. The root mean square deviation for nordiazepam, tranexamic and sulfasalazine is 0.14, 0.69 and 3.96 respectively (1, 2 and 5%) over the employed basis sets. The compounds were chosen to represent the apolar, medium polar to polar spectrum. The RMSD increases with the polarity of the compounds.
Going from the Hartree-Fock level of theory to B3LYP, the QMPSA decreases by 10%(see Table 5 and Table 6 basis set 6-31G**). This is not surprising since the Hartree-Fock method is known to overestimate the polarity [18].

Conclusions
A good correlation has been established between the absorbed fraction of training-set molecules after oral administration in humans and the Quantum Mechanical Polar Surface Area (QMPSA). This correlation holds for the QMPSA calculated with structures where the carboxyl groups are deprotonated. The correlation of the absorbed fraction and the QMPSA calculated on the gas phase optimized structures is much less pronounced. This suggests that the absorption process is mainly determined by polar interactions of the molecules in water solution.
The QMPSA is weakly dependent on the used basis set and drops 10% on going from Hartree-Fock to B3LYP.
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.