In a recent paper published in this journal, Yu et al. [1] reported the solubilities of hexaquocobalt(II) bis(p-toluenesulfonate) [Co(OTs)2·6H2O] in binary aqueous ethanol solvent mixtures. Solubilities were determined at ethanol mole fractions of 0–0.342 from 288.15 to 333.15 K using a synthetic method that involved adding incremental amounts of solid [Co(OTs)2·6H2O] until no more solid dissolved. The authors correlated their experimental data with the modified Apelblat equation [2], the Combined Nearly Ideal Binary Solvent Redlich–Kister (CNIBS/R–K) model [3] and a hybrid model that combined the Jouyban–Acree [4] and modified Apelblat models. The purposes of the present commentary are to point out several problems in the authors’ mathematical correlations that significantly limit the correlations’ predictive ability, and to recommend ways of avoiding such problems in future work.

The authors mathematically described the variation in the mole fraction solubility of [Co(OTs)2·6H2O] with binary solvent composition in terms of the following polynomial equation:

$$ \ln x_{1} = A_{1} + A_{2} x_{2}^{\text{B}} + A_{3} (x_{2}^{\text{B}} )^{2} + A_{4} (x_{2}^{\text{B}} )^{3} + A_{5} (x_{2}^{\text{B}} )^{4} $$
(1)

which was obtained from the CNIBS/R–K equation [3]:

$$ \ln \,x_{1} = x_{2}^{\text{B}} \ln \,(x_{1} )_{B,T} + x_{2}^{\text{C}} \ln \,(x_{1} )_{C,T} + x_{2}^{\text{B}} x_{2}^{\text{C}} \sum\limits_{i = 0}^{N} {S{}_{i}(x_{2}^{\text{B}} - x_{2}^{\text{C}} )^{i} } $$
(2)

by replacing the initial mole fraction composition of water in the binary solvent mixture, $x_{2}^{\text{C}}$, with $1 - x_{2}^{\text{B}}$ and then expanding the summation term for N = 2. The mole fraction solubility in the binary mixture is denoted as $x_{1}$, and the mole fraction solubilities in the neat mono-solvents (ethanol, B, and water, C) at temperature T are denoted as $(x_{1})_{B,T}$ and $(x_{1})_{C,T}$, respectively. Numerical values of the $A_{i}$ coefficients are obtained by regressing the experimental mole fraction solubility data in accordance with Eq. 1.
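
For readers who wish to check the step from Eq. 2 to Eq. 1, the expansion for N = 2 can be written out as follows. This is our own algebra rather than a result quoted from Yu et al. [1], and the shorthand $x \equiv x_{2}^{\text{B}}$ is introduced only for this note:

$$ \begin{aligned} \ln x_{1} = & \ln (x_{1})_{C,T} + \left[ \ln (x_{1})_{B,T} - \ln (x_{1})_{C,T} + S_{0} - S_{1} + S_{2} \right] x + \left( - S_{0} + 3S_{1} - 5S_{2} \right) x^{2} \\ & + \left( - 2S_{1} + 8S_{2} \right) x^{3} - 4S_{2} x^{4} \\ \end{aligned} $$

Comparing term by term with Eq. 1 gives $A_{1} = \ln (x_{1})_{C,T}$ and $A_{5} = - 4S_{2}$, and the five coefficients sum to $A_{1} + A_{2} + A_{3} + A_{4} + A_{5} = \ln (x_{1})_{B,T}$. In this constrained form, Eq. 1 evaluated at $x_{2}^{\text{B}} = 1$ returns the solubility in neat ethanol. When the $A_{i}$ are instead regressed freely on data that cover only part of the composition range, nothing enforces this condition.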

The calculated curve-fit equation coefficients that the authors tabulated in Table 4 of their manuscript afford very little, if any, predictive ability outside of the range of solvent compositions studied. One of the goals in developing mathematical representations is to be able to make predictions at other solvent compositions and at other temperatures. To illustrate how limited the correlations reported by Yu et al. [1] are in terms of predictive ability, we have substituted the numerical values of the equation coefficients given in Table 4 for T/K = 288.15 into Eq. 1 above to obtain the following mathematical representation:

$$ \ln \,x_{1} = - 6.03505 - 2.17377x_{2}^{\text{B}} + 57.3962\,(x_{2}^{\text{B}} )^{2} - 204.227\,(x_{2}^{\text{B}} )^{3} + 219.075\,(x_{2}^{\text{B}} )^{4} $$
(3)

In the ethanol mono-solvent, where $x_{2}^{\text{B}} = 1$, Eq. 3 predicts a value of $\ln x_{1} = 64.03538$, which corresponds to a mole fraction solubility of $x_{1} = 6.46 \times 10^{27}$. Mole fraction solubilities cannot exceed unity. There are similar problems with the calculated equation coefficients for the remaining 13 temperatures. All sets of equation coefficients give a calculated mole fraction solubility that exceeds unity in the ethanol mono-solvent.
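
The arithmetic behind this extrapolation is easy to reproduce. The short Python sketch below evaluates Eq. 3 at the ethanol mono-solvent using the coefficients quoted above. The function and variable names are ours, chosen for illustration, and do not come from Yu et al. [1].

```python
import math

# Coefficients A1..A5 of Eq. 3 (T = 288.15 K), as quoted in this commentary.
A_288 = [-6.03505, -2.17377, 57.3962, -204.227, 219.075]


def ln_x1_polynomial(x_b, coeffs):
    """Evaluate Eq. 1: ln x1 = A1 + A2*x + A3*x**2 + A4*x**3 + A5*x**4."""
    return sum(a * x_b**power for power, a in enumerate(coeffs))


# Extrapolate to neat ethanol (ethanol mole fraction = 1), far outside
# the measured range of 0-0.342.
ln_x1_ethanol = ln_x1_polynomial(1.0, A_288)
x1_ethanol = math.exp(ln_x1_ethanol)

print(f"ln x1 = {ln_x1_ethanol:.5f}")
print(f"x1    = {x1_ethanol:.2e}")
print("physically possible (x1 <= 1)?", x1_ethanol <= 1.0)
```

At an ethanol mole fraction of 1 the polynomial reduces to the plain sum of the five coefficients, which is why a single large, unconstrained coefficient is enough to push the prediction far above unity.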

Another mathematical representation that the authors used was the hybrid model, which was a combination of the Jouyban–Acree and Apelblat models. The hybrid model allows one to calculate the solubility as a function of both temperature and solvent composition:

$$ \ln x_{1} = B_{1} + \frac{{B_{2} }}{T} + B_{3} \ln T + B_{4} x_{2}^{\text{B}} + \frac{{B_{5} }}{T}x_{2}^{\text{B}} + \frac{{B_{6} }}{T}(x_{2}^{\text{B}} )^{2} + \frac{{B_{7} }}{T}(x_{2}^{\text{B}} )^{3} + \frac{{B_{8} }}{T}(x_{2}^{\text{B}} )^{4} + B_{9} x{}_{2}^{\text{B}} \ln T $$
(4)

where the calculated model coefficients are denoted as $B_{1}$ through $B_{9}$. To assess the predictive ability of Eq. 4, we have substituted into Eq. 4 the numerical values of the calculated equation coefficients:

$$ \begin{aligned} \ln x_{1} = & - 33.0191 + \frac{ - 982.164}{T} + 5.34387\ln T + 314.092x_{2}^{\text{B}} + \frac{ - 14226.9}{T}x_{2}^{\text{B}} + \frac{8392.44}{T}(x_{2}^{\text{B}} )^{2} \\ & + \frac{ - 40639.8}{T}(x_{2}^{\text{B}} )^{3} \, + \frac{48634.5}{T}(x_{2}^{\text{B}} )^{4} - 46.3451x{}_{2}^{\text{B}} \ln T \\ \end{aligned} $$
(5)

that the authors give in Table 5 of their manuscript. At T/K = 288.15 Eq. 5 becomes:

$$ \begin{aligned} \ln x_{1} = & - 33.0191 - 3.4085 + 30.2649 + 314.092\,x_{2}^{\text{B}} - 49.37324x_{2}^{\text{B}} + 29.1252\,(x_{2}^{\text{B}} )^{2} \\ & - 141.0369\,(x_{2}^{\text{B}} )^{3} \, + 168.7819\,\,(x_{2}^{\text{B}} )^{4} - 262.4746\,\,x{}_{2}^{\text{B}} \, \\ \end{aligned} $$
(6)

and predicts a value of $x_{1} = 9.92 \times 10^{22}$ ($\ln x_{1} = 52.9517$) for the mole fraction solubility of [Co(OTs)2·6H2O] in the ethanol mono-solvent ($x_{2}^{\text{B}} = 1$), which differs significantly from the calculated value of $x_{1} = 6.46 \times 10^{27}$ based on the polynomial version of the CNIBS/R–K equation.
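
The same check can be made for the hybrid model. The sketch below evaluates Eq. 4 with the coefficients written out in Eq. 5, at T = 288.15 K and an ethanol mole fraction of 1. As before, the function and variable names are illustrative assumptions of ours and are not taken from Yu et al. [1].

```python
import math

# Coefficients B1..B9 of Eq. 5, as quoted in this commentary from Table 5
# of Yu et al. [1].
B = [-33.0191, -982.164, 5.34387, 314.092, -14226.9,
     8392.44, -40639.8, 48634.5, -46.3451]


def ln_x1_hybrid(x_b, temp_k, b):
    """Evaluate Eq. 4 (hybrid Jouyban-Acree / modified Apelblat form)."""
    b1, b2, b3, b4, b5, b6, b7, b8, b9 = b
    ln_t = math.log(temp_k)
    return (
        b1 + b2 / temp_k + b3 * ln_t                                  # Apelblat-type terms
        + b4 * x_b                                                    # linear composition term
        + (b5 * x_b + b6 * x_b**2 + b7 * x_b**3 + b8 * x_b**4) / temp_k
        + b9 * x_b * ln_t                                             # composition-temperature cross term
    )


# Extrapolate to neat ethanol at the lowest temperature studied.
ln_x1_hybrid_ethanol = ln_x1_hybrid(1.0, 288.15, B)
x1_hybrid_ethanol = math.exp(ln_x1_hybrid_ethanol)

print(f"ln x1 = {ln_x1_hybrid_ethanol:.4f}")
print(f"x1    = {x1_hybrid_ethanol:.2e}")
```

Because the function takes the temperature as an argument, the same extrapolation can be repeated at any other temperature in the 288.15 to 333.15 K range studied.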

The reason the derived mathematical representations fail to provide realistic predictions for the solubility of [Co(OTs)2·6H2O] in ethanol is simple. The authors’ experimental measurements were limited to a narrow range of solvent composition (ethanol mole fractions of 0–0.342), and no measurements were performed in the ethanol-rich region. Derived mathematical representations cannot be expected to provide meaningful predictions at experimental conditions where no measurements were made. We suggest that authors reporting experimental data include mathematical correlations only in those instances where the calculated equation coefficients allow meaningful predictions over a wide range of solvent compositions. Publication of correlations that have a very limited range of application can lead to badly erroneous predictions, as illustrated in this commentary. A further point to consider is the statistical significance of the model constants, i.e. the equation coefficients, which can be tested using a t test. Constants with a probability of more than 0.05 (or 0.10) should be excluded from the calculations, because retaining them does not improve the correlation or prediction capability of the model.

As an informational note, when the CNIBS/R–K equation was first proposed in the early 1990s [3, 5,6,7], it was expected that researchers would perform solubility measurements in both mono-solvents to “anchor” the predictions in both solvents. The equation was subsequently used in polynomial form for mathematical convenience or when it was not experimentally feasible to measure the solubility in one of the co-solvents, perhaps because of an extremely low solute solubility. That is not the case for the paper by Yu et al. [1], where the solute solubility increases upon addition of ethanol. If resources are limited, or a researcher wishes to conduct only a limited number of experiments, a well-structured experimental design will be more helpful. We have tested a minimum set of solvent compositions for the experimental determination of solubility data (for example, co-solvent fractions of 0.00, 0.30, 0.50, 0.70 and 1.00) and obtained mathematical expressions that give reasonably accurate predictions [8,9,10]. Researchers in the field could do the same in future work. We are not suggesting that experimental measurements be performed at only the “bare minimum” number of solvent compositions. In our cited studies we performed measurements at several other binary solvent compositions, which were then used to assess the predictive ability of the model trained on only a few experimental measurements.

We note that the CNIBS/R–K equation is now being used in its polynomial form to describe solubility in systems where measurements have been performed over a very limited binary concentration region. This type of application was never intended, because the derived equation coefficients can lead to erroneous predictions and, furthermore, do not allow one to predict solubility in ternary and higher-order solvent systems. There are numerous applications in the published literature where the calculated CNIBS/R–K equation coefficients for solute solubility in all contributing sub-binary solvent systems have been used to predict solubility in ternary [11,12,13,14] and quaternary [14] solvent systems. If researchers wish to report “predictive” mathematical correlations, then the experimental measurements must be designed with this objective in mind. Measurements need to be performed over a broad range of solvent composition and, if possible, should include both neat co-solvents.