QTAIM based descriptors for the classification of acrylates

Acrylates are used in cosmetics, orthopedics, paints, coatings, adhesives, textiles, and biomedical applications such as contact lenses and bone cements. However, some acrylates are mutagenic. The aim of this article is to explain this mutagenicity in terms of the redistribution of atomic populations in the molecule, using two new descriptors based on atomic populations defined within the quantum theory of atoms in molecules. The descriptors describe the electron-withdrawing effect of a group of atoms in a molecule. They account for the substituents of the prop-2-enoates, the number of acrolein units and the electrophilicity. Cluster analysis using these descriptors allows acrylates to be classified in terms of the number of acrolein backbones and the type of substituent group. Five main groups can be distinguished: monoacrylates with monomethacrylates, diacrylates with dimethacrylates, triacrylates, trimethacrylate, and monoacrylates with electron-rich substituents. The substituents of mutagenic acrylates are electron withdrawing, which makes the β-carbon of the acrolein backbone more electrophilic and the molecule more reactive.


Introduction
Acrylates are the esters of acrylic acid. Their IUPAC name is prop-2-enoates, and they are used extensively in industry [1]. Despite this extensive use, few studies on the prediction of their properties have been reported. Yu et al. proposed an artificial neural network (ANN) model to predict the reactivity of 34 acrylate monomers in radical copolymerization. This model was built on quantum-mechanical descriptors such as Mulliken charges and the energies of the highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) [2]. Perez-Garrido et al. proposed a quantitative structure-activity relationship (QSAR) model to predict the mutagenicity of a set of more than 100 acrylates [3]. This model was built from more than 1000 descriptors encoded in the DRAGON software [4]. Tanis et al. focused on the chemical reactivity of N-(4-nitrophenyl) acrylamide (N4PA) with nucleic acids, which they studied with quantum-mechanical descriptors such as the electronic energy, the electronegativity, the chemical hardness, the chemical potential and the HOMO-LUMO energies [5]. Furuhama et al. predicted the ecotoxicity of acrylates with a QSAR model that used descriptors based on Gasteiger's partial equalization of orbital electronegativity (PEOE) [6]. To determine the toxicity of methacrylates, Ishihara et al. used quantum chemical descriptors such as HOMO, LUMO, electronegativity and the partition coefficient (logP) in a QSAR model. They concluded that their model may help to reveal the toxic potential of new medical and dental materials, thereby facilitating the synthesis of less toxic acrylates [7]. These authors also estimated the hemolytic activity of aliphatic and aromatic methacrylates using quantum chemical descriptors such as the electron affinity, the ionization potential, the HOMO and LUMO energies, and logP [8]. Liu et al. studied a set of polymethacrylates to reproduce physical properties such as the molar volume at room temperature, the refractive index and the glass transition temperature, using descriptors such as the side-chain length, the polarizability and the HOMO energy. Their model predicted the properties of acrylate polymers reasonably well [9]. Acrylates have also been classified using descriptors such as the number of acrylate groups, the calculated value of logP, and the number of molecular paths [10]. In that study, 143 acrylates were clustered into five groups differing in their logP values, their halogenation and their degree of branching.
No descriptor has been proposed so far to describe the redistribution of atomic populations in acrylates upon interchange of the ester substituents and its relation to their mutagenicity. Therefore, we propose two descriptors based on the quantum theory of atoms in molecules (QTAIM) [11][12][13][14]. The motivation for developing these descriptors was to find out how the substituent groups (the ester group and the substituent in α position) affect the atomic population of the acrolein backbone, and whether this makes it possible to distinguish between different types of acrylates. The proposed descriptors are based on (i) the proportion between the QTAIM atomic populations of the fragments and of the acrolein backbone and (ii) the concepts of electronegativity and electron affinity. QTAIM-based descriptors have been proposed previously; however, the focus of that study was on electron delocalization [15]. The present work focuses on the redistribution of atomic populations upon substituent interchange in acrylates and its relation to mutagenicity, understood as the capacity to induce mutations and cancer [16]. Mutagenicity can be related to the Michael reactivity [17].
This paper contains the following sections: (i) Atomic populations and descriptors, which presents the relation between the electron density redistribution upon the addition of one electron and the values of both descriptors; (ii) Effects of the substituent in X position; (iii) Effects of the substituent in Y position; and (iv) Hierarchical clustering based on the proposed descriptors.

The selection of the molecules
The U.S. Environmental Protection Agency and the International Agency for Research on Cancer have classified some acrylates as possible human carcinogens. For example, exposure of human body tissues such as skin, throat or eyes to some acrylates has been reported to induce serious health consequences such as reproductive toxicity, cancer, neurological damage, organ system toxicity and cellular damage [1]. The main aim of this study is to propose two descriptors that describe the electron distribution in acrylates and its relation to mutagenicity. Therefore, we selected 65 acrylates from various databases [18][19][20] and references [21][22][23][24][25][26][27][28], of which 8 are mutagenic [18,21,22]. The 65 compounds comprise 25 acrylates, 19 methacrylates, 8 diacrylates, 10 dimethacrylates, 2 triacrylates and 1 trimethacrylate. The reported mutagenicity (Table 1) was evaluated with the Ames test using the Salmonella typhimurium TA100 strain.

Structure of acrylates
Acrylates are α,β-unsaturated carbonyl compounds, and may have different substituent groups in positions X and Y as shown in Scheme 1.
The mesomeric effect, described in terms of electron withdrawal, depends on the power of an atom in a molecule to attract electrons, which Linus Pauling called electronegativity [32]. This concept can be extended to the power of groups of atoms, functional groups, or substituents to attract electrons from the rest of the molecule. Based on this concept, we propose the two QTAIM descriptors in this article.

Computational methods
A conformational study was performed for each acrylate to select the most stable conformer. This study was carried out with the Conformer-Rotamer Ensemble Sampling Tool (CREST) [33], using the GFN2-xTB method [34] and an energy threshold of 10 kcal/mol.
The most stable conformer was selected and reoptimized with the M06-2X density functional [35] and the 6-311G(d,p) valence triple-zeta basis set [36]. Atom-pairwise dispersion corrections with zero damping were taken into account [37]. All calculations were performed with the ORCA quantum chemistry program package, version 5.0 [38,39]. In the most stable conformer of every acrylate, the double bond is oriented synperiplanar to the carbonyl bond [40]. For each minimized geometry, the wave function was obtained and analyzed using QTAIM [11,12] with the AIMALL software [41].
This analysis yields the atomic basins and their corresponding properties, such as the atomic population N(Ω). Integration errors were estimated as the differences between the molecular properties and the values obtained by summing the atomic contributions N(Ω). Their absolute values were always smaller than 2.00 × 10⁻³ a.u.
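The integration-error check described above can be sketched in a few lines. This is a minimal illustration, not the authors' workflow: the function name `integration_error` and the atomic populations below are made up for a generic 10-electron molecule and are not taken from the paper.

```python
def integration_error(atomic_populations, total_electrons):
    """Absolute difference between the summed N(Omega) and the electron count."""
    return abs(sum(atomic_populations) - total_electrons)


# Hypothetical atomic populations for a 10-electron molecule (illustrative only).
populations = [9.1502, 0.4246, 0.4246]
threshold = 2.00e-3  # threshold quoted in the text, in a.u.

error = integration_error(populations, total_electrons=10)
print(f"integration error = {error:.1e} a.u., below threshold: {error < threshold}")
```

A molecule whose summed populations deviate from the electron count by more than the threshold would normally be re-integrated with tighter settings before its populations are used.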

QTAIM based descriptors
We propose an arithmetic proportion that quantifies the electron-withdrawing effect of a group of atoms in a molecule. QTAIM allows any molecular property P of a molecular system S to be partitioned into atomic contributions P(Ω), where Ω denotes an atom in the system (Eq. 1) [11][12][13][14].
$$P(S) = \sum_{\Omega \subset S} P(\Omega) \qquad (1)$$
One application of this partitioning is the molecular population N, which can be split into atomic contributions N(Ω). The electronegativity (EN) of the substituent groups in an acrylate (Eq. 2) can be related to the quotient of the atomic populations N(Ω) of the atoms of the substituent groups and those of the acrolein backbone A.
The electron affinity (EA) is the variation of the energy upon the addition of an electron to a neutral atom or molecule in the gas phase [42]. Within the QTAIM framework, the electron affinity can be decomposed into atomic contributions EA(Ω), and the variation of the atomic population N(Ω) upon addition of an electron can be written in terms of the EA. EN(Ω) and EA(Ω) can be merged into one equation, leading to the definition of the molecular descriptor D (Eq. 6). The outer sum in Eq. 6 runs over all acrolein backbones (R is their total number). For the i-th acrolein backbone, the numerator is the sum of the atomic population differences of those atoms that do not belong to this backbone; these include the atoms of all other acrolein backbones and of the substituent fragments (Fig. 1). The denominator is the sum of the atomic population differences of all atoms that belong to the i-th acrolein backbone.
In Eq. 7, we define the descriptor D ∩. Its numerator is the sum, over the atoms that do not belong to the acrolein backbone, of the atomic population differences between the acrylate anion and the neutral acrylate; its denominator is the sum of the same differences over the acrolein backbone atoms (Fig. 2). For a molecule with two acrolein backbones, and following the definition of D, this descriptor can be understood as the intersection of the first fragment with the second fragment (Fig. 1). All fragments that belong neither to the first fragment nor to the second fragment are the substituent groups in positions X and Y of each acrolein backbone. Therefore, the intersected fragments are the acrolein backbones.
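Read literally, the verbal definitions of the two descriptors correspond to the expressions below. This is a reconstruction from the prose only, not the typeset Eqs. 6 and 7; the symbols ΔN(Ω), A_i (the atoms of the i-th acrolein backbone) and A (the atoms of all acrolein backbones) are notation assumed here.

```latex
\Delta N(\Omega) = N_{\mathrm{anion}}(\Omega) - N_{\mathrm{neutral}}(\Omega)

D = \sum_{i=1}^{R}
    \frac{\sum_{\Omega \notin A_i} \Delta N(\Omega)}
         {\sum_{\Omega \in A_i} \Delta N(\Omega)}

D_{\cap} =
    \frac{\sum_{\Omega \notin A} \Delta N(\Omega)}
         {\sum_{\Omega \in A} \Delta N(\Omega)},
\qquad A = \bigcup_{i=1}^{R} A_i
```

Under this reading the two expressions coincide for R = 1, which agrees with the statement that D and D ∩ have the same values for mono(meth)acrylates.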
In the following, we discuss the differences between the two descriptors in more detail. The aim of defining two descriptors is to capture both how branched an acrylate is and how the atomic population changes with the substituents in X and Y positions (Scheme 1). The descriptor D indicates how branched the molecule is: larger values correspond to a branched molecule with several acrolein backbones, while smaller values indicate an unbranched molecule with only one acrolein backbone. Larger values of the D ∩ descriptor correspond to a larger atomic population on the substituents.
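The behaviour of the two descriptors can be illustrated with a short calculation. The sketch below is one reading of the verbal definitions (the ratio of population differences outside a backbone to those inside it, summed over backbones for D and taken once over all backbones for D ∩); it is not the authors' code. The function names and all numbers are hypothetical, and each dictionary key stands for a whole fragment rather than a single atom to keep the example short.

```python
def descriptor_d(delta_n, backbones):
    """Sum over backbones of (Delta N outside the backbone) / (Delta N inside it)."""
    total = sum(delta_n.values())
    result = 0.0
    for atoms in backbones:
        inside = sum(delta_n[a] for a in atoms)
        result += (total - inside) / inside
    return result


def descriptor_d_cap(delta_n, backbones):
    """(Delta N on atoms outside every backbone) / (Delta N on all backbone atoms)."""
    backbone_atoms = set().union(*backbones)
    inside = sum(delta_n[a] for a in backbone_atoms)
    return (sum(delta_n.values()) - inside) / inside


# Hypothetical shares of the added electron (anion minus neutral), per fragment.
mono = {"backbone": 0.75, "X": 0.05, "Y": 0.20}
mono_backbones = [{"backbone"}]

di = {"A1": 0.35, "A2": 0.35, "X1": 0.05, "X2": 0.05, "linker": 0.20}
di_backbones = [{"A1"}, {"A2"}]

print(descriptor_d(mono, mono_backbones), descriptor_d_cap(mono, mono_backbones))
print(descriptor_d(di, di_backbones), descriptor_d_cap(di, di_backbones))
```

With these made-up numbers the two descriptors coincide for the single-backbone case, while for two backbones D grows by several units and D ∩ stays below 1, which mirrors the qualitative behaviour described in the text.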

Clustering analysis
A correlation analysis revealed that D and D ∩ are not correlated (correlation coefficient of -0.038). To test the performance of the proposed descriptors, we applied them to a series of α-substituted methyl acrylates with the following substituent groups: -F, -Cl, -H, -CN, -CH3, -NO2. These acrylates were previously studied by Giraldo et al. [43]. For the cluster analysis, the values of the descriptors D and D ∩ obtained for the 65 acrylates were normalized using Eq. 8,
$$X' = \frac{X - \mathrm{Min}}{\mathrm{Max} - \mathrm{Min}} \qquad (8)$$
where X is the original value, Min and Max are the minimum and maximum values, and X′ is the normalized value. Using the normalized data, we carried out a cluster analysis with five distance definitions (Euclidean, Manhattan, Minkowski, Maximum and Canberra) and eight grouping methodologies (single, complete linkage, average, McQuitty, centroid, Ward.D, Ward.D2 and median) [44][45][46]. We obtained 40 dendrograms, eight for each distance definition, using the software R [47] and the dendextend package [48]. The best dendrogram, selected on the basis of the cophenetic correlation [49], was obtained with the Minkowski distance and the average grouping methodology.
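The normalization and the five distance definitions can be sketched in plain Python. This is an illustration only: the authors worked in R, the descriptor values below are hypothetical, the Minkowski exponent is not stated in the text (3 is an arbitrary choice here), and terms with a zero denominator in the Canberra distance are simply skipped.

```python
from itertools import product


def min_max(values):
    """Rescale values to [0, 1]: X' = (X - Min) / (Max - Min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def minkowski(p, q, power=3):
    # The exponent is an assumption; the text does not give it.
    return sum(abs(a - b) ** power for a, b in zip(p, q)) ** (1.0 / power)


def maximum(p, q):
    return max(abs(a - b) for a, b in zip(p, q))


def canberra(p, q):
    total = 0.0
    for a, b in zip(p, q):
        denom = abs(a) + abs(b)
        if denom > 0:  # skip 0/0 terms
            total += abs(a - b) / denom
    return total


DISTANCES = {"euclidean": euclidean, "manhattan": manhattan,
             "minkowski": minkowski, "maximum": maximum, "canberra": canberra}
LINKAGES = ["single", "complete", "average", "mcquitty",
            "centroid", "ward.D", "ward.D2", "median"]

# Five distances times eight grouping methods give the 40 dendrograms.
settings = list(product(DISTANCES, LINKAGES))

d_values = [0.25, 0.50, 1.25]  # hypothetical descriptor values
normalized = min_max(d_values)
print(normalized)      # [0.0, 0.25, 1.0]
print(len(settings))   # 40
```

Min-max scaling puts both descriptors on the same [0, 1] range, so that D, which spans several units, does not dominate the distance between two molecules.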

Atomic populations and descriptors
D and D ∩ are introduced to understand the variation of the atomic population N(Ω) upon addition of one electron, and how it is affected by the substituents in X and Y positions (Scheme 1). The structures of all molecules studied herein, together with their names, numbering and mutagenicity, are shown in Table 1.
Methyl acrylate (compound 1 in Table 1 and Table S1) and ethyl acrylate (compound 47 in Table 1 and Table S1), both with one acrolein backbone, are taken as examples. As can be seen in Table 1, 75.9% of the electron is located on the acrolein backbone for compound 1 and 75.2% for compound 47, while the rest is distributed over the substituents in X and Y positions (Scheme 1). Therefore, larger values of the D and D ∩ descriptors (Table 2) indicate an atomic population that is less localized on the acrolein backbone.
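As a rough consistency check, the two percentages can be converted into descriptor values. This assumes that the percentages refer to the share of the added electron, and that D for a single acrolein backbone is read as the ratio of the substituent share to the backbone share (an interpretation of the verbal definition, not the typeset equation):

```latex
D(\mathbf{1}) \approx \frac{1 - 0.759}{0.759} \approx 0.318,
\qquad
D(\mathbf{47}) \approx \frac{1 - 0.752}{0.752} \approx 0.330
```

These are back-of-the-envelope estimates, not values taken from Table 2. They illustrate why a smaller backbone share translates into a larger descriptor value.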
Both descriptors, D and D ∩, depend on the substituents in X and Y positions and on the number of acrolein backbones in the molecule. However, the value of D is much more strongly affected by the number of acrolein backbones. For example, the value of D for compound 6 (a diacrylate) is larger than for compound 3 (a monoacrylate) (Table 2). On the other hand, D ∩ for compound 6 indicates that the added electron is more localized on the substituents in X and Y than for compound 3 (Table 2). The magnitude of these values reflects that the added electron in compound 3 is located mainly on the acrolein backbone rather than on the substituents (Eqs. 6 and 7).

Effects of the substituent in X position
We now analyze in more detail the effect of the substituent in X position on the electronic structure. In Table 3, methyl acrylate is taken as an example, and the H at the α-position is replaced by the following substituents: F, Cl, CN, CH3 and NO2. These molecules were chosen because their Michael reactivity was previously analyzed by Giraldo et al. [43]. The atomic populations of the carbonyl group atoms increase upon addition of an electron to the neutral molecule. In the case of X = H, the carbonyl carbon shows a larger variation than the carbonyl oxygen (Table 3). Together, these variations correspond to 28.8% of the overall variation. The atomic populations of the hydrogens bonded to the β-carbon, the hydrogen bonded to the α-carbon, the β- and α-carbons, and the OCH3 group vary by 22.5%, 11.1%, 24.6% and 12.9%, respectively. These values show that the added electron is mostly located at the carbonyl bond.
Table 3 shows the changes of the atomic populations at Cβ and at the carbonyl carbon. The population differences at Cβ increase in the order X = NO2 < CH3 < CN < Cl < H < F. At the carbonyl carbon, the population differences increase in the order X = NO2 < CN < Cl < CH3 < H < F. Neither of these population differences correlates well with the activation energies for the attack of methanethiol at Cβ investigated by Giraldo et al., which increase in the order X = NO2 < CN < H < Cl < CH3 < F. The changes of the atomic populations of the substituent in X position increase in the order F < H < Cl < CH3 < CN < NO2, indicating an increasing electron-withdrawing effect of these substituents. The population differences between the carbonyl carbon and Cβ decrease accordingly. This trend shows that Cβ becomes more electrophilic. The values of D and D ∩ for these molecules increase in the order F < H < Cl < CH3 < CN < NO2, which correlates inversely and reasonably well with the calculated activation energies in ref. [43].
Figure 3 shows that the activation energies decrease approximately linearly as D increases (R² = 0.727). Only the relation with the D descriptor is shown because D and D ∩ have the same values for mono(meth)acrylates (Table 2). This suggests that the reactivities of these molecules are correlated with the electron-withdrawing effect of the substituent: the lower the activation energy, the larger the electron-withdrawing effect, as can be observed for the NO2 functional group. In consequence, Cβ becomes more electrophilic. In contrast, Giraldo et al. did not find any trend between the positive Fukui function of the electron density and the activation energy. They only found that NO2 is the most electron-withdrawing substituent, and that consequently Cβ has the largest electrophilic character with this substituent [43].

Effects of the substituent in Y position
The diversity of the substituents is broader in Y position than in X position (Scheme 1 and Table 1). We have analyzed acrylates, methacrylates, diacrylates and dimethacrylates with different n-alkyl substituents in Y position to evaluate the correlation with the descriptors proposed in this paper. Substituents with heteroatoms and unsaturated or branched substituents are not included here, because they do not exhibit clear trends.
Figure 4 shows that both descriptors differentiate between acrylates and methacrylates, and between diacrylates and dimethacrylates. D and D ∩ increase with the number of carbon atoms in Y position, although the quality of the linear fits varies (R² between 0.465 and 0.991, Fig. 4). Moreover, the values of D are larger in diacrylates and dimethacrylates than in their monosubstituted counterparts; this difference is about 3.4 units. In the case of the monosubstituted molecules, the values of the two descriptors are very similar (see Fig. 4, top). Only the relation with the D descriptor is shown because D and D ∩ have the same values for mono(meth)acrylates (Table 2). In the same vein, methacrylates have larger values than acrylates by about 0.1 units for both descriptors. These trends show that N(Ω) of the acrolein backbone decreases and is transferred to the n-alkyl group in Y position as the size of the substituent increases.
Fig. 4 Equations for the adjusted lines: (a) y = 0.008x + 0.412, R² = 0.731 (methacrylates) and y = 0.004x + 0.318, R² = 0.730 (acrylates); (b) y = 0.108x + 3.811, R² = 0.991 (dimethacrylates) and y = 0.128x + 3.180, R² = 0.465 (diacrylates); (c) y = 0.026x + 0.454, R² = 0.990 (dimethacrylates) and y = 0.022x + 0.305, R² = 0.792 (diacrylates). The number on each data point corresponds to the molecule number presented in Table 1
It is important to stress that the clear separation between different classes of acrylates, and the correlation with the number of carbon atoms in the substituent in Y position, is found only for a subset of the acrylates investigated in this study, namely those with aliphatic and unbranched substituents.
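The offsets quoted in this section can be checked against the fitted lines given in the Fig. 4 caption. The sketch below assumes that x is the number of carbon atoms in the Y substituent and that panel (a) shows D for the monosubstituted molecules and panel (b) shows D for the disubstituted ones; the caption does not state this explicitly, and the names `FITS` and `predict` are illustrative.

```python
# (slope, intercept) of the fitted lines quoted in the Fig. 4 caption.
FITS = {
    ("a", "methacrylates"): (0.008, 0.412),
    ("a", "acrylates"): (0.004, 0.318),
    ("b", "dimethacrylates"): (0.108, 3.811),
    ("b", "diacrylates"): (0.128, 3.180),
    ("c", "dimethacrylates"): (0.026, 0.454),
    ("c", "diacrylates"): (0.022, 0.305),
}


def predict(panel, family, n_carbons):
    """Evaluate one fitted line at a given number of carbon atoms."""
    slope, intercept = FITS[(panel, family)]
    return slope * n_carbons + intercept


# Differences between intercepts (n_carbons = 0), used only as a rough offset.
gap_di_vs_mono = predict("b", "dimethacrylates", 0) - predict("a", "methacrylates", 0)
gap_meth_vs_acr = predict("a", "methacrylates", 0) - predict("a", "acrylates", 0)

print(f"dimethacrylates vs methacrylates: {gap_di_vs_mono:.3f}")
print(f"methacrylates vs acrylates:       {gap_meth_vs_acr:.3f}")
```

Under these assumptions the intercepts differ by roughly 3.4 and 0.1 units, in line with the offsets stated in the text. The low R² of the diacrylate fit in panel (b) means that values read off that line should be treated with caution.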

Hierarchical clustering
As a further test for the descriptors D and D ∩ (Table 2), a hierarchical cluster analysis was carried out with the set of acrylates shown in Table 1. The best dendrogram was obtained using the Minkowski distance as the similarity function and the average grouping methodology.
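The selection criterion can be illustrated with a small, self-contained sketch of average-linkage clustering and the cophenetic correlation. It is a plain-Python stand-in for the R workflow, not the authors' code: the five points are hypothetical normalized (D, D ∩) pairs, the Minkowski exponent of 3 is an assumption, and all function names are illustrative.

```python
from itertools import combinations
from math import sqrt


def minkowski(p, q, power=3):
    # The exponent is an assumption; the text does not give it.
    return sum(abs(a - b) ** power for a, b in zip(p, q)) ** (1.0 / power)


def average_linkage_cophenetic(points, dist):
    """Agglomerate with average linkage.

    Returns the original pairwise distances and the cophenetic distances
    (the height at which each pair first falls into the same cluster).
    """
    n = len(points)
    d = {(i, j): dist(points[i], points[j]) for i, j in combinations(range(n), 2)}
    clusters = [[i] for i in range(n)]
    coph = {}
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            pairs = [(min(i, j), max(i, j)) for i in clusters[a] for j in clusters[b]]
            height = sum(d[p] for p in pairs) / len(pairs)
            if best is None or height < best[0]:
                best = (height, a, b, pairs)
        height, a, b, pairs = best
        for p in pairs:
            coph[p] = height
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return d, coph


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


# Hypothetical normalized (D, D_cap) pairs for five molecules.
points = [(0.05, 0.10), (0.08, 0.12), (0.55, 0.20), (0.60, 0.25), (1.00, 0.90)]
d, coph = average_linkage_cophenetic(points, minkowski)
keys = sorted(d)
c = pearson([d[k] for k in keys], [coph[k] for k in keys])
print(f"cophenetic correlation = {c:.3f}")
```

Repeating this calculation for every combination of distance and grouping method, and keeping the combination with the highest cophenetic correlation, reproduces the selection logic described in the text.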
The dendrogram was constructed using the D and D ∩ descriptors (Table 2). In the dendrogram, the main groups differ because of the D descriptor, whereas the order within each group is due to D ∩. In Fig. 5, five main clusters of acrylates can be distinguished. Clusters 1 and 2 contain the triacrylates and the trimethacrylate: compounds 13, 15 and 40 contain three acrolein backbones; 13 has a pentaerythritol substituent, and compounds 15 and 40 have a trimethylolpropane substituent (Tables 1 and S1). Cluster 3 contains acrylates with electron-rich substituents in position Y, such as Br or bromobenzyl. Cluster 4 includes only monoacrylates and monomethacrylates. Cluster 5 includes diacrylates and dimethacrylates. This shows that our descriptors can classify acrylates based on their chemical nature. Moreover, molecules grouped contiguously have substituent groups that withdraw almost the same atomic population from the acrolein backbone. In the same vein, molecules in cluster 3 have the substituents that most strongly withdraw atomic population from the acrolein backbone, as can be seen from their D ∩ values (Table 2).
The mutagenic compounds (17, 19, 22, 38, 44, 46, 63 and 64) and the compounds without reported mutagenicity (56, 57, 58, 59, 60, 61, 62 and 65) are in clusters 3 and 4 (Table S2). Cluster 3 contains the compounds with the largest D ∩ values (Table 2). This means that the atomic populations of molecules 56 and 57 are localized on the substituents rather than on the acrolein backbone. Moreover, the largest D ∩ value for a mutagenic acrylate corresponds to compound 38 (Tables 1 and 2), which suggests that compounds 56 and 57 may be mutagenic too; their mutagenic activity should be evaluated experimentally.

Conclusions
Two descriptors, D and D ∩, based on QTAIM atomic populations N(Ω), were proposed and applied to a set of 65 acrylates. They consider the electron-withdrawing effect of the acrolein moiety. The descriptor D is more sensitive than D ∩ to differences in the type of acrylate. Applied to a subset of compounds that includes only aliphatic substituents without heteroatoms, the descriptor D differentiates between mono-, di- and tri-acrolein backbones, and both descriptors differentiate between acrylates and methacrylates with one or two acrolein backbones.
A cluster analysis using both descriptors shows that the molecules studied herein can be grouped into mono-, di- and tri-acrolein backbones. Monoacrylates with electron-rich substituents (aromatic groups and halogens) form a separate group.
Molecules with similar D and D ∩ values are characterized by substituent groups that withdraw a similar amount of atomic electron population from the acrolein backbone.
The proposed descriptors describe the atomic population distribution of the mutagenic acrylates and this population is localized on the substituents rather than on the acrolein backbone.
Analyzing the effect of the substituent at the α-carbon (X position) of methyl acrylate (X: F, Cl, CN, CH3 and NO2) on the changes of the atomic populations, we found that these changes increase in the order F < H < Cl < CH3 < CN < NO2, indicating an increasing electron-withdrawing effect of these substituents on the acrolein backbone. In consequence, Cβ becomes more electrophilic. The lower the activation energy, the larger the electron-withdrawing effect of the substituents. Therefore, an electrophilic Cβ favors the Michael reaction [17,43]. Our results suggest that 2-(2,4,6-tribromophenoxy)ethyl acrylate and 2,4,6-tribromophenyl acrylate may be mutagenic.
The proposed descriptors are useful for understanding the distribution of atomic properties in a set of molecules that share a common backbone. They may also be used in unsupervised machine learning techniques, such as hierarchical clustering, to predict the mutagenicity of potential mutagens.

Fig. 1 Description of the D descriptor for ethylene glycol dimethacrylate. This molecule has two acrolein backbones, shown in black. Left: atoms in red belong to the substituents of the first acrolein backbone. Right: atoms in red belong to the substituents of the second acrolein backbone

Fig. 2 Description of the D ∩ descriptor. Atoms in black belong to the acrolein backbone and atoms in red belong to the substituent groups

Fig. 3 Relation between the activation energy and the descriptor D. The values of D and D ∩ are equal for mono(meth)acrylates. Activation energies were taken from ref. [43]. Equation for the adjusted line: y = −29.376x + 37.242, R² = 0.727

Fig. 5 Dendrogram for the set of molecules studied herein, classified by the proposed descriptors. Cluster numbering is indicated by the colored numbers at the top of the figure

Table 1 (
Scheme 1 General structure of acrylates.X and Y denote substituent groups of the acrolein backbone

Table 2
Values of the descriptors D and D ∩ for each acrylate identified by its Id

Table 3
Atomic populations N(Ω) for all atoms in the acrolein backbone, the methoxy group and the substituent groups in X position. The differences of N(Ω) between the anionic and the neutral molecule are shown in parentheses