1 Introduction

Cationic surfactants have attracted the attention of chemists for a long time due to their wide application in various fields of industry and medicine. Their utility results from, among other things, their antimicrobial activities [1,2,3] and anticorrosion properties [4, 5]. The medical applications of cationic surfactants include their use as antimicrobial agents, as drug delivery agents and also as gene delivery agents. Studies of the complexing and transfection of DNA have been presented in recent years [6,7,8,9,10]. The use of cationic surfactants continues to increase and new surfactants are constantly synthesized. Recently, cationic surfactants and their critical micelle concentrations were investigated by Quantitative Structure–Property Relationship (QSPR) studies [11,12,13,14,15,16,17].

Many properties of the surfactants that make them so useful are due to their unique structure. The common feature of all the surfactants is their amphiphilic nature. They contain both hydrophobic groups called tails and hydrophilic groups called heads. The tail in many cationic surfactants is a linear alkyl chain and the head group is a quaternary ammonium group.

The architecture of the hydrophilic head group is a factor that plays an important role in regulating the physical/chemical properties [18,19,20] and biological activities [21,22,23] of many cationic surfactants. It is well known that the electrostatic repulsion between the ionic head groups of the surfactant counteracts micelle formation, but some modifications of the head group structure can reduce this repulsion and facilitate aggregation. Experimental studies [19, 20] show that variations in the head group structure influence the value of the critical micelle concentration (cmc), and that the changes in the cmc caused by structure modification can exceed one order of magnitude.

Recently, the effect of the spacer group of cationic gemini surfactants on the critical micelle concentration using the molecular connectivity indices has been reported [24]. This effect was studied using gemini surfactants with fixed alkyl chains and head groups but with various spacer groups. Continuing these studies, the author has decided to examine the effect of the head groups on the critical micelle concentration of cationic (monomeric) surfactants using, as in the previous paper [24], only the molecular connectivity indices and also a combination of the molecular connectivity indices with atomic charges.

The main aim of the present work is to find simple formulae that make it possible to study, in particular, the effect of head group structure modifications on the value of the critical micelle concentration. The formulae have been derived for compounds with the same alkyl chains and the same counterion, but with various head groups. To the best of the author’s knowledge, no paper has yet presented theoretical studies of the effect of the head group structure on the critical micelle concentration using different descriptors, in particular the topological indices.

2 Data and Methods

To determine the influence of the surfactant head group structure on the value of the critical micelle concentration, surfactants with various head group sizes and varied head group hydrophilicity were taken into account. The data set contains 15 compounds in the training set (Fig. 1) and 4 compounds in the test set (Fig. 2). The compounds in the training set have a hexadecyl hydrocarbon tail and bromide as the counterion.

Fig. 1 Structures of compounds from the training set

Fig. 2 Structures of compounds from the test set

The first series of the training set (compounds 1–4) consists of compounds with quaternary ammonium head groups in which three linear alkyl chains containing one to four carbon atoms are attached to the nitrogen atom [20]. In the second series (compounds 5–7) the –CH3 groups of compound 1 are successively replaced by –CH2CH2OH groups [8]. The compounds of the third series (compounds 8–10) have cyclic head groups in which the quaternary ammonium group is incorporated into a ring of five to seven atoms [20]. The head group of compound 11 is a pyridinium ring [25]. Replacing the –OH groups of the compounds from the second series by –OCH3 groups [19] gives the last series of the training set (compounds 13–15).

The test set contains surfactants with primary, secondary, tertiary and quaternary ammonium head groups. To compare the effect of variation in the head group structure on the values of the critical micelle concentration, the experimental cmc values available in the literature for compounds with a dodecyl hydrocarbon tail (R = C12H25) and bromide as counterion have also been taken (except for compound t4).

The chemical structures of the surfactants and the experimental values of cmc were taken from the literature [8, 19, 20, 25,26,27]. The experimental values of cmc of compounds from the training set are given in Table 1. The cmc values were measured in pure water [8, 19, 25] or in aqueous buffer of pH = 7.0 [20]. All cmc values were measured at 25.00 °C.

Table 1 Values of cmc of training set compounds

Some cmc values used in the calculations, which were measured in aqueous buffer solution (0.03 mmol·L−1 Tris buffer) of pH = 7.0 [20], were compared with values obtained in pure water [28] and found to be in good agreement. The values taken from Ref. [28] are 1.01 mmol·L−1, 0.799 mmol·L−1 and 0.585 mmol·L−1 for compounds 1, 2 and 3, respectively.

The structures of additional surfactants, proposed by the author, which will be needed in the discussion, are presented in Fig. 3.

Fig. 3 Structures of additional compounds

The Kier and Hall [29] molecular connectivity indices were calculated based on the graphic structural formulae of the molecules using the expression shown in Appendix A. The atomic partial charges were calculated using the semi-empirical molecular orbital package MOPAC 7 included in the VEGA program [30], employing the semi-empirical Hamiltonian PM3. The statistical calculations were performed using the program STATISTICA 12 [31].
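As an illustration of how connectivity indices are obtained from a hydrogen-suppressed graph, the following minimal Python sketch computes the simple zeroth-order index (the sum of δ^(−1/2) over atoms) and the simple third-order cluster index (the sum of (δiδjδkδl)^(−1/2) over every atom with three neighbours taken together), where δ is the number of non-hydrogen neighbours. These are the standard Kier and Hall definitions of the simple indices; the function names and the example graphs are hypothetical, and the valence variants used in this work (which replace δ by the valence delta, see Appendix A) are not reproduced here.

```python
from itertools import combinations
from math import sqrt


def simple_deltas(adjacency):
    """Simple delta of each atom: its number of non-hydrogen neighbours."""
    return {atom: len(neighbours) for atom, neighbours in adjacency.items()}


def chi0(adjacency):
    """Simple zeroth-order connectivity index: sum of delta**-0.5 over atoms."""
    return sum(1.0 / sqrt(d) for d in simple_deltas(adjacency).values())


def chi3_cluster(adjacency):
    """Simple third-order cluster index: one term per central atom and
    per choice of three of its neighbours."""
    deltas = simple_deltas(adjacency)
    total = 0.0
    for centre, neighbours in adjacency.items():
        for trio in combinations(sorted(neighbours), 3):
            product = deltas[centre]
            for atom in trio:
                product *= deltas[atom]
            total += 1.0 / sqrt(product)
    return total


# Hypothetical example graphs (hydrogen-suppressed, atom labels are arbitrary).
# The simple indices ignore atom types, so N is treated like any other vertex.
tetramethylammonium = {
    "N": ["C1", "C2", "C3", "C4"],
    "C1": ["N"], "C2": ["N"], "C3": ["N"], "C4": ["N"],
}
isobutane = {
    "C1": ["C2"], "C2": ["C1", "C3", "C4"], "C3": ["C2"], "C4": ["C2"],
}

for name, graph in [("tetramethylammonium", tetramethylammonium),
                    ("isobutane", isobutane)]:
    print(name, round(chi0(graph), 3), round(chi3_cluster(graph), 3))
```

In this sketch a vertex with four neighbours contributes four cluster terms and a vertex with three neighbours contributes one, which is why the cluster index responds to branching rather than to chain length.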

3 Results

The values of all descriptors for compounds from the training set are contained in Table 2.

Table 2 Values of the descriptors

As follows from Table 2, ten topological descriptors were used in the search for the best model: five molecular connectivity indices from zeroth to fourth order and five valence molecular connectivity indices, also from zeroth to fourth order. Two other descriptors were also taken into account in this study, namely the charge localized on the central nitrogen atom (\( Q_{{{\text{N}}^{ + } }} \)) and the total head group charge (\( Q_{\text{tot}} \)). The total head group means the head group including the α-methylene group, i.e. the first –CH2– group of the alkyl tail attached to the nitrogen atom.

Each formula expressing the relationship between \( \log_{10} cmc \) and the descriptors was generated using the least-squares method. The final equation was obtained using the stepwise method. The quality of the derived formulae was tested using three statistical parameters: the correlation coefficient (\( r \)), the Fisher ratio (\( F \)) and the standard deviation (\( s \)). The best model was selected according to the following statistical principles: the highest values of \( r \) and \( F \), the lowest value of \( s \) and the smallest possible number of significant descriptors in the model. Thus, the search for the best equation was stopped at three descriptors.
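The three statistical parameters can be illustrated with a short Python sketch based on NumPy. It uses the common textbook definitions for a multiple linear regression with n compounds and k descriptors; the text does not state which conventions STATISTICA applies, so the values produced this way need not coincide with those reported here. The function name and the demonstration data are hypothetical (synthetic random numbers, not the descriptors of Table 2), and the stepwise selection itself is not shown.

```python
import numpy as np


def fit_statistics(X, y):
    """Least-squares fit of y on the columns of X plus an intercept.

    Returns the coefficients (intercept first), the multiple correlation
    coefficient r, the Fisher ratio F and the standard deviation s.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])        # add intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    sse = float(residuals @ residuals)               # residual sum of squares
    sst = float(((y - y.mean()) ** 2).sum())         # total sum of squares
    ssr = sst - sse                                  # explained sum of squares
    dof = n - k - 1                                  # residual degrees of freedom
    r = float(np.sqrt(ssr / sst))
    F = (ssr / k) / (sse / dof)
    s = float(np.sqrt(sse / dof))
    return coef, r, F, s


# Synthetic demonstration data: 15 "compounds", 3 "descriptors" (assumed values).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(15, 3))
y_demo = (1.0 - 0.8 * X_demo[:, 0] + 0.7 * X_demo[:, 1] - 0.4 * X_demo[:, 2]
          + rng.normal(scale=0.05, size=15))

coef, r, F, s = fit_statistics(X_demo, y_demo)
print("coefficients:", np.round(coef, 3))
print("r =", round(r, 3), " F =", round(F, 1), " s =", round(s, 3))
```

A stepwise search would call such a routine for each candidate descriptor subset and keep the subset with the most favourable combination of r, F and s.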

3.1 Model 1, Using Molecular Connectivity Indices

The correlation formulae which contain only the molecular connectivity indices were derived based on the data contained in Tables 1 and 2. In the case of Model 1 the search for the best equation consisted of three steps. The results of all correlations for Model 1 are presented in Table 3.

Table 3 Values of correlation coefficients in each step of Model 1

As follows from Table 3, the highest values of the correlation coefficients in the third step are obtained for the relationships containing the \( {}^{0}\chi \), \( {}^{0}\chi^{\nu } \), and \( {}^{3}\chi_{c} \) indices and for those containing the \( {}^{0}\chi \), \( {}^{0}\chi^{\nu } \), and \( {}^{3}\chi_{c}^{\nu } \) indices. The \( {}^{3}\chi_{c} \) and \( {}^{3}\chi_{c}^{\nu } \) indices are highly correlated (\( r \) = 0.997). Moreover, the \( {}^{3}\chi_{c} \) index encodes only information about the number of branches and their environment, whereas the \( {}^{3}\chi_{c}^{\nu } \) index additionally encodes information about heteroatoms, so the relationship containing the \( {}^{3}\chi_{c}^{\nu } \) index is richer in structural information [29]. The relationship containing the \( {}^{3}\chi_{c} \) index is therefore ignored in further considerations.

Thus, Model 1 is defined by the following molecular connectivity indices: the zeroth order molecular connectivity index (\( {}^{0}\chi \)), the zeroth order valence molecular connectivity index (\( {}^{0}\chi^{\nu } \)) and the cluster valence molecular connectivity index of third order (\( {}^{3}\chi_{c}^{\nu } \)). The final formula for Model 1 is the following:

$$ \log_{10} cmc = -(1.002 \pm 0.325) - (0.777 \pm 0.060) \cdot {}^{0}\chi + (0.674 \pm 0.062) \cdot {}^{0}\chi^{\nu } - (0.436 \pm 0.107) \cdot {}^{3}\chi_{c}^{\nu } $$
(1)
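As a minimal sketch of how Eq. 1 is applied, the following Python function evaluates the central values of the fitted coefficients (the quoted uncertainties are ignored) and converts the result back to a concentration. The function and argument names are hypothetical, the descriptor values in the example are placeholders rather than entries of Table 2, and the unit mmol·L−1 is assumed from the cmc values quoted elsewhere in the text.

```python
def log10_cmc_model1(chi0, chi0_v, chi3c_v):
    """Eq. 1 with the central coefficient values.

    chi0    : zeroth-order molecular connectivity index
    chi0_v  : zeroth-order valence molecular connectivity index
    chi3c_v : third-order cluster valence molecular connectivity index
    """
    return -1.002 - 0.777 * chi0 + 0.674 * chi0_v - 0.436 * chi3c_v


def cmc_from_log10(log10_cmc):
    """Convert log10(cmc) back to cmc (assumed to be in mmol/L)."""
    return 10.0 ** log10_cmc


# Placeholder descriptor values, chosen only to show the call pattern.
example = log10_cmc_model1(chi0=15.0, chi0_v=16.5, chi3c_v=1.0)
print("log10 cmc =", round(example, 3))
print("cmc =", round(cmc_from_log10(example), 3))
```

Because the coefficients of the two zeroth-order indices have opposite signs and similar magnitudes, the predicted value depends strongly on the difference between the simple and valence indices, that is, on the heteroatom content of the head group.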

The plot of the experimental values of \( \log_{10} cmc \) versus the values of \( \log_{10} cmc \) calculated using Eq. 1, along with the plot of residuals versus the experimental values of \( \log_{10} cmc \), are presented in Fig. 4.

Fig. 4 Plot of the experimental \( \log_{10} cmc \) values versus the values calculated using Eq. 1 for the training set (\( r \) = 0.974, \( F \) = 242.104, \( s \) = 0.102), and plot of residuals versus the experimental \( \log_{10} cmc \) values

3.2 Model 2, Using Molecular Connectivity Indices and Atomic Partial Charges

The correlation formulae which contain the molecular connectivity indices and the charges were derived based on data contained in Tables 1 and 2. The search for the best equation of Model 2 consists of three steps. The results of all correlations for Model 2 are presented in Table 4.

Table 4 Values of correlation coefficients in each step of Model 2

As shown in Table 4, two molecular connectivity indices define Model 2: the zeroth order molecular connectivity index (\( {}^{0}\chi \)) and the zeroth order valence molecular connectivity index (\( {}^{0}\chi^{\nu } \)). The third descriptor which defines Model 2 is the charge (\( Q_{{{\text{N}}^{ + } }} \)) localized on the central nitrogen atom. The final formula for Model 2 is the following:

$$ \log_{10} cmc = (0.684 \pm 0.584) - (0.675 \pm 0.053) \cdot {}^{0}\chi + (0.600 \pm 0.056) \cdot {}^{0}\chi^{\nu } - (4.451 \pm 0.989) \cdot Q_{{\text{N}}^{+}} $$
(2)

The plot of the experimental values of \( \log_{10} cmc \) versus the values of \( \log_{10} cmc \) calculated using Eq. 2, along with the plot of residuals versus the experimental values of \( \log_{10} cmc \), are presented in Fig. 5.

Fig. 5 Plot of the experimental \( \log_{10} cmc \) values versus the values calculated using Eq. 2 for the training set (\( r \) = 0.978, \( F \) = 284.876, \( s \) = 0.09), and plot of residuals versus the experimental \( \log_{10} cmc \) values

The values of \( \log_{10} cmc \) calculated using Models 1 and 2 and the experimental ones are given in Table 5 for the compounds of the training set and in Table 6 for the test set compounds.

Table 5 Calculated and literature values of \( \log_{10} cmc \) of compounds from the training set
Table 6 Calculated and literature values of \( \log_{10} cmc \) of compounds t1–t4 from the test set and compound 1 from the training set

Models 1 and 2 were derived from the training set compounds, so all calculated values of \( \log_{10} cmc \) are for compounds with the hexadecyl hydrocarbon tail and bromide as counterion. To compare the effect of variation in the head group structure on the cmc value, the experimental \( \log_{10} cmc \) values in Table 6 (those available in the literature) have been taken, except for compound t4, for compounds with a dodecyl hydrocarbon tail (R = C12H25), whereas the values calculated using Models 1 and 2 are for compounds with the hexadecyl hydrocarbon tail (R = C16H33). Table 6 also contains the \( \log_{10} cmc \) value of compound 1 from the training set, which will be helpful in the discussion.

The values of calculated \( \log_{10} cmc \) for compounds of the additional set are given in Table 7.

Table 7 Calculated values of \( \log_{10} cmc \) of compounds from the additional set

4 Discussion

As follows from Table 5, the values of \( \log_{10} cmc \) calculated using Models 1 and 2 are very close to the experimental ones. Table 6 shows that Model 1 predicts the \( \log_{10} cmc \) value of test compound t4 better than Model 2, which suggests that it may have better predictive ability. This is perhaps because the error of the additive parameter in Model 2 (Eq. 2) is only slightly lower than the value of the additive parameter itself, which makes that equation weaker in terms of estimation capability. However, the main aim of this work is to find a formula that allows the effect of head group structure modifications on the cmc value to be studied, i.e. a formula that shows how the cmc values of cationic (monomeric) surfactants change with changes in the head group structure. The additive parameter is therefore not so significant in these investigations, and Model 2 will also be taken into account.
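The reason why the additive parameter matters little for this purpose can be sketched as follows. For two head groups a and b attached to the same tail and counterion, subtracting Eq. 2 for b from Eq. 2 for a cancels the additive parameter, so that only the descriptor differences remain. The symbol Δ, denoting the difference (a minus b) of a quantity, is introduced here for illustration only, and the coefficient uncertainties are omitted:

$$ \Delta \log_{10} cmc = -0.675 \cdot \Delta {}^{0}\chi + 0.600 \cdot \Delta {}^{0}\chi^{\nu } - 4.451 \cdot \Delta Q_{{\text{N}}^{+}} $$

The same subtraction applied to Eq. 1 gives

$$ \Delta \log_{10} cmc = -0.777 \cdot \Delta {}^{0}\chi + 0.674 \cdot \Delta {}^{0}\chi^{\nu } - 0.436 \cdot \Delta {}^{3}\chi_{c}^{\nu } $$

In both cases the predicted change in \( \log_{10} cmc \) between two head groups is independent of the additive parameter and of its error, although the absolute values are not.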

An inspection of the data in Table 5 reveals that, in agreement with experiment, the critical micelle concentration decreases as the head group size increases. The decrease in the cmc value is slight when the hydrophobicity of the head group chains increases together with the head group size (compounds 2–4 and 8–10). The exceptions are compounds 1 and 2, for which the calculated cmc values increase a little when the –CH3 groups are replaced by –CH2CH3 groups. A larger decrease in the cmc occurs when the size of the head group increases through replacement of all of the –CH3 groups by –CH2CH2OH groups. A comparison of the head groups that contain hydrocarbon chains with those that contain heteroatoms shows that replacement of the –CH2– group by –O– in the head group (compounds 4 and 15) causes a decrease in both the experimental cmc values and those calculated using Models 1 and 2. Likewise, the experimental cmc values and those calculated using Models 1 and 2 are lower when the –CH3 group is replaced by an –OH group (compounds 3 and 7). As follows from Tables 5 and 7, the replacement of the –CH2– group by –NH–, or of the –CH3 group by –NH2 (compounds 3 and A1, 4 and A2), also causes a decrease in the theoretical cmc values. From the comparison of compound 7 with A1 and of compound 15 with A2, it follows that the replacement of the –OH group by –NH2, or of –O– by –NH–, causes an increase in the cmc values calculated using Models 1 and 2.

A comparison of compounds with straight and branched hydrocarbon chains and the same number of atoms in the head group (compounds 4 and A3) shows that branching causes a decrease in the calculated cmc values (Tables 5 and 7). For the cmc values calculated using Model 1 this decrease is significant: 0.293 mmol·L−1 and 0.078 mmol·L−1 for compounds 4 and A3, respectively. For the cmc values calculated using Model 2 the decrease is slight: 0.299 mmol·L−1 and 0.285 mmol·L−1 for compounds 4 and A3, respectively. Tables 5 and 7 also show, from the comparison of compounds with straight and branched chains containing heteroatoms and the same number of atoms in the head group (compounds 15 and A4), that heteroatoms located on branches lower the calculated cmc values relative to heteroatoms in straight chains. The comparison of compounds A3–A5 (Table 7) shows that, as for compounds with straight chains in the head group, the replacement of a hydrophobic branch by a hydrophilic one, namely of the –CH3 group by the –OH group, results in lower values of the critical micelle concentration calculated using Models 1 and 2. In the case of Model 2 the changes in the cmc value are at least one order of magnitude.
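The size of the branching effect predicted by the two models can be made explicit with a short calculation on the four cmc values quoted above for compounds 4 and A3. The Python sketch below, in which the dictionary layout and function name are illustrative choices, expresses each pair as a ratio and as a difference in \( \log_{10} cmc \); it gives a factor of about 3.8 (roughly 0.57 log units) for Model 1 and about 1.05 (roughly 0.02 log units) for Model 2.

```python
from math import log10

# Calculated cmc values in mmol/L for compounds 4 (straight chains)
# and A3 (branched chains), as quoted in the text.
calculated_cmc = {
    "Model 1": {"4": 0.293, "A3": 0.078},
    "Model 2": {"4": 0.299, "A3": 0.285},
}


def branching_effect(cmc_linear, cmc_branched):
    """Return (ratio linear/branched, change in log10 cmc on branching)."""
    ratio = cmc_linear / cmc_branched
    delta_log = log10(cmc_branched) - log10(cmc_linear)
    return ratio, delta_log


for model, values in calculated_cmc.items():
    ratio, delta_log = branching_effect(values["4"], values["A3"])
    print(f"{model}: ratio = {ratio:.2f}, delta log10 cmc = {delta_log:.3f}")
```

Expressed in log units, the two models differ by more than a factor of twenty in the predicted branching effect, which reflects the presence of the cluster index in Model 1 and its absence from Model 2.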

An inspection of the data in Table 6 also shows that the experimental values of the critical micelle concentration of dodecyl chain bromide surfactants with primary, secondary, tertiary and quaternary ammonium head groups (compounds t1–t3, 1) increase overall with increasing head group order. For these dodecyl chain surfactants the experimental cmc values are 11.2 mmol·L−1, 11 mmol·L−1, 13.4 mmol·L−1 and 14.6 mmol·L−1, respectively [27]. As follows from Table 6, the cmc values of the hexadecyl chain surfactants (compounds t1–t3, 1) calculated using Model 1 at first increase a little and then decrease with further increase of the head group order, whereas the changes in the cmc values calculated for these compounds using Model 2 are in good agreement with the experimental ones.

Model 1 (Eq. 1) contains the zeroth order molecular connectivity index (\( {}^{0}\chi \)), the zeroth order valence molecular connectivity index (\( {}^{0}\chi^{\nu } \)) and the cluster valence molecular connectivity index of the third order (\( {}^{3}\chi_{c}^{\nu } \)). Model 2 (Eq. 2) also contains the zeroth order molecular connectivity index (\( {}^{0}\chi \)) and the zeroth order valence molecular connectivity index (\( {}^{0}\chi^{\nu } \)), but together with the charge (\( Q_{{{\text{N}}^{ + } }} \)) localized on the central nitrogen atom. All the molecular connectivity indices were calculated based on the hydrogen-suppressed graphic structural formula of the molecule. The values of the \( {}^{0}\chi \) and \( {}^{0}\chi^{\nu } \) indices increase when the number of atoms in the head group increases, whereas the values of the \( {}^{3}\chi_{c}^{\nu } \) index increase with an increasing number of branches in the head group. The analysis of the variation of the nitrogen charges (\( Q_{{{\text{N}}^{ + } }} \)) in Table 2 shows that the charge localized on the quaternary nitrogen atom increases with the increase in size caused by elongation of the hydrocarbon chains (compounds 2–4) and decreases with an increasing number of –CH2CH2OCH3 groups in the head group (compounds 13–15). For the compound with a pyridinium ring (compound 11) the value of the nitrogen charge (\( Q_{{{\text{N}}^{ + } }} \)) differs significantly from the others. In the remaining cases (compounds 5–7, 8–10) the increase in size of the head group does not cause any significant changes in the value of \( Q_{{{\text{N}}^{ + } }} \).

The cluster valence molecular connectivity index of the third order (\( {}^{3}\chi_{c}^{\nu } \)) appearing in Model 1 represents three-bond cluster terms within the molecule and differentiates heteroatoms [29]. Thus the \( {}^{3}\chi_{c}^{\nu } \) index is calculated from that part of the training set molecules which includes the quaternary nitrogen atom, and for most of the investigated compounds only changes in the nearest environment of that atom influence its value. The charge on the central nitrogen atom, in contrast, depends on all atoms in the head group. This may be why the inclusion of the nitrogen atom charge (\( Q_{{{\text{N}}^{ + } }} \)) in the correlation analysis gives a better descriptive capability. In general, Model 1 is very good at describing mainly compounds with quaternary ammonium head groups, namely those compounds which were used in the correlation analysis, and including the test set compounds in the correlation analysis would probably give the model a better descriptive capability.

The results obtained for cationic monomeric surfactants, both the experimental ones and the theoretical ones presented here, are consistent with those obtained for gemini surfactants. For example, when the head group size of gemini surfactants increases through successive replacement of the methyl groups by hydroxyethyl groups, the cmc values decrease [32]. Also, when the hydrocarbon parts of the head group of gemini surfactants increase from methyl to butyl, the cmc values decrease [33].

It is worth noting that for the training set compounds, i.e. the compounds with a quaternary ammonium head group, the charges calculated for the total head group are approximately equal to 0.9 (Table 2). As described in the “Results” section, the total head group means the head group including the α-methylene group. Recent calculations of the charge on the α-methylene group for different ionic surfactants show that the charge on that group is significant [34, 35] and that this group is often hydrated [36]; this suggests that the α-methylene group should be considered part of the head group [34]. Thus, assuming that the α-methylene group is included in the head group, the charges of the head groups are about 0.9 regardless of whether that group is more or less hydrophilic, smaller or larger. Similar values were obtained, also using PM3 calculations, by Huibers [34] for trimethylammonium and pyridinium surfactants (0.885 and 0.891) and by Zhao et al. [35] for trimethylammonium and tripropylammonium surfactants (0.893 and 0.899). This means that changes in the head group structure do not significantly affect the value of the quaternary ammonium head group charge, while the critical micelle concentration in some cases changes significantly. The values of the total head group charges (\( Q_{\text{tot}} \)) included in Table 2 show that only for compounds 2–4 is there a slight decrease of the head group charge with increasing length of the hydrophobic head chains.

Paper [35] reported a study of ammonium salts with methyl to butyl head chains (\( {\text{C}}_{12}{\text{NMe}}_{3}^{+} \), \( {\text{C}}_{12}{\text{NEt}}_{3}^{+} \), \( {\text{C}}_{12}{\text{NPr}}_{3}^{+} \), \( {\text{C}}_{12}{\text{NBu}}_{3}^{+} \)) and with different alkyl tail lengths (\( {\text{C}}_{10}{\text{NMe}}_{3}^{+} \), \( {\text{C}}_{12}{\text{NMe}}_{3}^{+} \)). The authors of that paper noted that the cmc values change although these cationic surfactants have nearly the same charge on the alkyl tail. Although the authors of paper [35] do not write directly about the head groups, this observation is consistent with the results of the present work, obtained for a larger number of compounds and for compounds with more diverse head groups. Paper [35] also suggested that, together with the α-methylene group, the second to fifth –CH2– groups of the alkyl tail may be considered part of the head group. In this case the values of the head group charges are of course larger, but they are still approximately the same. Thus, in this case too, changes in the head group structure do not significantly affect the charge of the quaternary ammonium head group.

5 Conclusions

The present theoretical QSPR study confirms the experimental results that the structure and nature of the head group influence the cmc value.

The presence of heteroatoms in the head group of cationic surfactants causes a decrease of the critical micelle concentration. An increase in the number of branches in the head group also causes a decrease of the cmc values. The –OH and –O– groups reduce the cmc value more than the –NH2 and –NH– groups, and the –OH group more than the –O– group. Considering the effect of the head group on the value of the critical micelle concentration, the results obtained suggest that the most active surfactants would be those that contain many –OH groups in the head group, for example compound A5 proposed in Sect. 2.

The present investigations of the atomic partial charges of the head groups show that, in general, the head group charges of different quaternary ammonium salts are approximately the same. The structures of the surfactants are different and the cmc values also differ, but the total head group charges (\( Q_{\text{tot}} \)) are approximately equal to 0.9. This means that changes in the head group structure do not significantly change the charge of the quaternary ammonium head group, while the critical micelle concentration changes, in some cases considerably.

The models obtained will be helpful in designing novel cationic surfactants that are more active in micelle formation.