Estimating antiwear properties of esters as potential lubricant-based oils using QSTR models with CoMFA and CoMSIA

Comparative molecular field analysis and comparative molecular similarity indices analysis were employed to analyze the antiwear properties of a series of 57 esters as potential lubricant-based oils. Predictive 3D-quantitative structure tribo-ability relationship models were established using the SYBYL multifit molecular alignment rule with a training set and a test set. The optimum models were all shown to be statistically significant with cross-validated coefficients q2 > 0.5 and conventional coefficients r2 > 0.9, indicating that the models are sufficiently reliable for activity prediction, and may be useful in the design of novel ester-based oils.


Introduction
As the main component in a lubricant formulation, the basic oil plays an important role in the friction process [1]. Thus far, various lubricant-based oils have been developed, such as mineral-based oils, biodegradable-ester-based oils, halogenated hydrocarbonbased oils, polyethylene (alkylene) glycol-based oil, polyether-based oil, and ionic liquids [2]. Among these, ester-based oils are widely used in environmentally friendly lubricants owing to their biodegradability.
In general, the choice of lubricant depends on conditions such as the operating temperature, rotating speed, load, and the environment. However, for a new specific operating system, a suitable lubricant may not be available, and it may be necessary to design a new one. In practice, the design process takes several design, synthesis, and testing iterations. Because of a lack of effective theoretical guidance, designing a new lubricant is difficult, and we usually need to conduct a large number of experiments.
Theoretically, however, all properties of a chemical compound are determined by its molecular structure. Clearly, these properties also include friction and antiwear properties. Therefore, in principle, it is feasible to predict the tribological properties of the compound from its structure. The quantitative structure activity relationship (QSAR) method provides a general framework to predict a specific property of a compound and in past decades, this method has achieved great success in the field of drug design [3−6]. 3D-QSAR methods take advantage of the 3D structures of molecules, and can often achieve a better performance [7−10]. Evidently, the QSAR method can also be applied to the study of a tribology, which we termed the quantitative structure tribo-ability relationship (QSTR) method.
In our previous works, we demonstrated the feasibility of this idea, and showed that the tribological properties can be simply predicted from 2D or 3D structural descriptors of the compound using a regression method or a neural network [11−13]. In our previous research, for poly alpha olefins (PAO) series lubricants, two 3D-QSTR methods, i.e., the comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) models, showed statistically better predictability than a 2D-QSTR method, i.e., the back propagation neural network (BPNN) model. In this work, a series of ester lubricant-based oils were studied, and the relationship between the tribological parameters and molecular conformation was investigated using CoMFA and CoMSIA.
The basic idea of CoMFA is that the difference in the particular properties of a series of molecules is derived from the difference in their shapes of the non-covalent fields, which can be measured using a probe atom. To make the shapes of the non-covalent fields comparable, the first and most important step of the CoMFA analysis is to align the 3D structures of these molecules properly to ensure that they have the same orientation. Starting from a series of superimposed 3D structures, a probe atom is used to measure the molecular interaction field of each structure, which can be represented as the interactions between the molecule and probe atom. The measurements are conducted at the lattice points of a regular Cartesian 3D grid, and the type of interactions can vary [14]. With CoMFA, only van der Waals and electrostatic interactions are considered [15,16]. With CoMSIA, more physicochemical interactions of the molecule are taken into account, such as the hydrophobicity, hydrogen bond donor, and hydrogen bond acceptor [17].
Once all molecular fields are calculated, a linear correlation between the properties and the fields are expected, and can be obtained through a partial least squares (PLS) analysis. The quality of the QSTR model can be evaluated using the cross-validated coefficient q 2 and conventional coefficient r 2 . Moreover, the relationships can be graphed as contour maps, which provide an intuitive insight regarding the properties and molecular fields. Accordingly, the 3D-QSTR method can not only predict the tribological properties accurately, but can also provide us clear guidance regarding a further molecular design.
In the remainder of this paper, we describe how 3D-QSTR models were constructed using CoMFA and CoMSIA methods to predict the antiwear performance of a series of ester based oils. Furthermore, guided by the 3D-QSTR models, we designed and synthesized two new ester compounds. Experiment results showed that the new compounds achieve a better antiwear performance, and indicate that 3D-QSTR methods have significant potential application in a lubricant design.

Dataset
Wear scar diameter data were obtained through a series of ball-disk contact friction tests using a microtribometer (UMT-3, CETR). To determine the antiwear capability of each molecule, the wear scar diameters were related through simple mathematical manipulations. 10

MW
In Eq. (1), WD represents the wear scar diameter scale, D is the measured size of the wear scar diameter, and MW is the molecular weight of the compound. For a compound, a higher WD value infers a smaller wear scar diameter (D), which means that the compound achieves a better antiwear performance.
A dataset containing 57 esters was used to construct the 3D-QSTR models. The dataset was randomly divided into two subsets: a training set containing 45 compounds (80% of all compounds), and a test set containing 12 compounds (20% of all compounds). The training set was used to develop the CoMFA and COMSIA models, and the test set was used to validate their accuracy. The compounds with the highest and lowest wear scar diameters were included in the training set.

Molecular modeling and alignment
The 3D chemical structures of all esters were built using ChemBioDraw Ultra 11.0. Subsequently, all work was conducted using the software TRIPOS SYBYL-X. The partial atomic charges were calculated using the Gasteiger Hückel method, and the geometry optimization was conducted using a MAXIMIN energy minimizer with a Tripos force field. The structures of the compounds are listed in Table S1 in the Electronic Supplementary Material (ESM).
To construct the 3D-QSAR models, all 3D structural structures first need to be properly aligned. This is also the most crucial step in the model development.
In the field of drug design, the compound with the highest biologically activity is generally chosen as the template molecule, and the native conformations of all other molecules are aligned into the native conformation of the template molecule, which is also the global minimum-energy conformation of the molecule. Correspondingly, compound A21 (see Table S1 in the ESM) was herein selected as the template molecule, which has the minimum wear scar diameter (D) for all 57 esters.
The native conformation of the template molecule was obtained through a combination of energy minimization and a molecular dynamics (MD) simulation. The MD simulation was carried out under a fixed temperature and pressure in a Tripos force field after energy minimization. The temperature was set to 300 K, the time step was set to 1.0 fs, and the conformations were saved every 5 fs. The pressure was determined using Eq. (2). 2 5 π 1.01 10 2 where P is the pressure on the molecule; F is the load under the experimental friction conditions, the value of which is 98 N; and D is the measured wear scar diameter. The simulation was run for 10,000 fs. The energy-time diagram of the template molecule could then be obtained, and the minimum energy conformation was saved as a template. All other geometry-optimized conformations of the molecules were aligned to the template using the ALIGN DATABASE command in SYBYL.

CoMFA and CoMSIA studies
After the molecular structures were properly aligned, we could conduct CoMFA and COMSIA analyses. In the CoMFA analysis, molecules were placed in a 3D lattice. The van der Waals and electrostatic interactions between the grid points and the probe atom were then calculated to construct steric and electrostatic fields. In our study, the grid spacing of the 3D lattice was set to 2 Å, and the probe atom was an sp3-hybridized carbon atom with a +1 charge. In the CoMSIA analysis, using the same lattice box and the probe atom, five physicochemical descriptors, namely, steric (S), electrostatic (E), hydrophobic (H), hydrogen-bond donor (D), and hydrogen-bond acceptor (A), were calculated.
3D-QSTR models could then be derived using a PLS analysis, which is an extension of multiple linear regressions. Concretely, a leave-one-out (LOO) crossvalidated method was used to evaluate the predictive capability and robustness of the model [10], and the cross-validation correlation coefficient (q 2 ) and optimum number of components (N) were reported. Once the optimal number of components was determined, the final model was generated using a non-cross-validated PLS analysis, and the squared correlation coefficient (r 2 ) values, standard error of the estimate (SEE), and the F-values were calculated. In the PLS analysis, CoMFA and CoMSIA descriptors were used as independent variables, and WD values were used as dependent variables. Additionally, we tried different combinations of the five CoMSIA descriptors. A model with higher q 2 and r 2 values was clearly better. We set q 2 > 0.5, r 2 > 0.6 as the criteria [18−21], and only the qualified models are summarized in Table 1. CoMFA and CoMSIA contour maps were defined using the StDev × Coeff mapping option contoured based on the contribution, which was set to 80% for the favored region and 20% for the disfavored region during the analysis. The criterion used for the decision was based on an analysis of the relationships between the tribology data and the CoMFA and CoMSIA descriptors.

CoMFA analysis
The results are summarized in Tables 1 and 2. The optimal number of components N was determined using PLS with a cross-validation. The high q 2 (0.726), r 2 (0.969), and F-values (308.664), along with a low SEE value (0.038), show the good statistical correlation and reasonable predictability of the CoMFA model. The steric and electrostatic field contributions of this model were 0.504 and 0.496, respectively, indicating that the importance of the steric field and that of the electrostatic field were almost the same. The actual and predicted values of WD and D for all compounds are listed in Table 2  The structural characteristics of the compounds and the contribution of the molecular interaction fields can be explained through the contour maps of the CoMFA models.
The steric contour map (Fig. 2) shows three green contours (80% contribution), which indicates that within these regions, bulky groups are favored in molecules with smaller D values. The regions encircling the green regions are the yellow regions (20% contribution), which indicate that the bulky groups near the end of the carbon chain are disfavored.
As shown in Fig. 2, compound A21 has three groups near the green contours, whereas compound A14 has no groups near them. These observations are in accordance with the experimental results: A21 is the compound with the lowest D value, and A14 is the compound with the highest D value. For compounds A15 and E06, compound A15 has two groups near the green contours, and compound E06 has three groups near the green contours and one group inserting the yellow contours, coinciding with their middle WD value.
CoMFA electrostatic contour maps are shown in Fig. 3. In the blue regions (80% contribution), the increased positive charges are in favor of the D values, and in the red regions (20% contribution) the increased negative charges are in favor of them. Two large blue regions are located in the middle of the contour map, which indicates that any positive charge or electron deficient substitute will enhance the D values at these positions. The two small red regions correspond to the positions where the electronegative groups enhance the D values. For the ester-based oils, most of the compounds have relatively low electronegative or electropositive fields, but the electrostatic field has a definite influence on them. For the two compounds,

CoMSIA analysis
As shown in Table 1, although the CoMFA and CoMSIA SE models are both built based on their steric and electrostatic properties, the calculations of their van der Waals and electrostatic interactions differ. Therefore, q 2 , r 2 , and the contributions of the steric and electrostatic fields are also different. The CoMSIA SE model has smaller q 2 and r 2 values, and the electrostatic field has a smaller contribution to the model. Similarly, the cross-validated q 2 of the SEHA and SHA models is 0.730 and 0.659, respectively, which indicates that the electrostatic field in CoMSIA has a significant influence on the establishment of the model. In general, however, the CoMSIA model can describe the structure-activity relationship more accurately and predictively, which is due to the fact that when CoMSIA descriptors are calculated, some of the inherent deficiencies arising from the functional form of the Lennard-Jones and Coulomb potentials can be avoided using a Gaussian function for the distance dependence between the probe atom and molecule [22]. For the SHDA model, although the cross-validated q 2 is slightly bigger than that of the SHA model, r 2 and the F-value are smaller, and the contribution of the hydrogen-bond donor in this model is only 0.055. Therefore, the hydrogen-bond donor can be considered an irrelevant factor in the establishment of the CoMSIA model.
After comparing the statistical indicators of the predicted results with different models, four descriptors, i.e., steric, electrostatic, hydrophobic, and hydrogenbond acceptor (SEHA) descriptors, were selected for building the CoMSIA model. The PLS analysis revealed a cross-validated q 2 of 0.730 with an optimum number of components of six, and a conventional r 2 of 0.979 with a standard error of 0.031. The corresponding field contributions of the steric, electrostatic, hydrophobic, and hydrogen-bond acceptor descriptors were 0.262, 0.229, 0.332, and 0.178, respectively. We also used the  results indicate that the proposed CoMSIA model developed using the steric, electrostatic, hydrophobic, and hydrogen-bond acceptor fields can predict the antiwear performance of the ester compounds very well. The results of the SEHA model are slightly better than those of the CoMFA model.
For the steric and electrostatic fields, the CoMFA and CoMSIA models have similar contour diagrams, which further illustrate that the 3D-QSTR models can be constructed using both CoMFA and CoMSIA methods.
In the CoMSIA model, the hydrophobicity and hydrogen-bond acceptor are two other important descriptors. As shown in Table 1, the hydrophobic field makes a considerable contribution to the CoMSIA QSTR model, which means that the hydrophobicity is an important descriptor of the antiwear properties. As shown in Fig. 5, compared with compound B13, compound B12 with a hydrophobic long carbon chain was superimposed on the CoMSIA hydrophobic contour map (white, favored; yellow, disfavored), which clearly shows that a longer carbon chain is disfavored with the D values.
As shown in Fig. 6, compounds A18 and E02 were superimposed on the CoMSIA hydrogen-bond acceptor  contour map (magenta, favored; red, disfavored). Clearly, the nitrogen atom of A18 is near the two magenta areas, and can accept a proton as a hydrogenbond acceptor. Therefore, A18 has a better antiwear performance benefitting from the contribution of the hydrogen-bond acceptor field in comparison with E02.

Design of new potential ester-based oils
According to the detailed contour analyses of both the CoMFA and CoMSIA models, different modifications can be attempted on the potential molecule to enhance the antiwear activity.
For example, two new compounds (2-ethylhexyl methacrylate and bis(methylglycol) phthalate) were introduced to estimate the prediction capability of the models. The structure and D values of the new compounds as compared with their respective reference compounds are presented in Table 2. As the table indicates, the models exhibit a good prediction performance because the relative errors of the actual and predicted D values are less than 5% for all the introduced compounds. 2-ethylhexyl methacrylate was generated by introducing a steric methyl group into compound A14 (2-ethylhexyl acrylate), and therefore has a better antiwear performance with a smaller wear scar diameter of 0.93 mm. Relative to dipentyl phthalate (compound E02), bis(methylglycol) phthalate was generated by replacing two carbon atoms of the carbon chains with oxygen atoms, which exhibited a better antiwear performance owing to the capability of accepting protons derived from the lone pair of electrons of the oxygen atom.
As future work, more compounds will be designed based on suggestions from the CoMSIA and CoMFA models, and a potential ester-based oil with excellent antiwear performance will be screened and synthesized.

Conclusions
The present study dealt with the 3D-QSTR of a series of ester compounds as lubricant-based oils. Robust CoMFA and CoMSIA models with a high predictive performance based on a PLS analysis were constructed. As compared to CoMFA, CoMSIA can provide a better statistical model with more precise contour maps, and possesses more prediction capabilities. Through an analysis of the model parameters and contour maps, the structure-activity relationship was presented in detail, and some useful information regarding a structural modification was provided. These models can also be used to make primary predictions for the antiwear performance of untested compounds.