Correlating the Density and Refractive Index of Ternary Liquid Mixtures

Some physical properties of ideal solutions, e.g. the molar volume and the molar refraction, vary linearly with composition. Others can be expressed either as ratios or as products of two properties that vary with composition in this way. It is postulated that the non-ideal behaviour of real solutions can be adequately modelled by replacing these linear functions with higher order Scheffé polynomials. A suite of such models is presented for which the parameters are fully determined by knowledge of pure component properties and binary behaviour. Their ability to represent binary data, and their capacity to predict ternary properties, were tested using density and refractive index data for the acetic acid-ethanol-water ternary system as well as fourteen additional ternary data sets. Model performance was ranked on the basis of the Akaike information criterion. With respect to predicting ternary density and refractive index behaviour from knowledge of binary data, it was found that lower order models outperformed higher order models.


Introduction
The measurement of liquid composition is important in fields such as quantitative analysis, chemical processing, organic synthesis, medical diagnostics, and semiconductor manufacturing [1]. Increasingly, chemical process control relies on in-line, real-time liquid chemical composition and temperature monitoring and adjustment to ensure optimal product yield. The composition of a liquid mixture is encoded in its physical properties, e.g. the density and the refractive index [2]. This makes it possible, in principle, to use measurements of these properties to monitor the composition of a liquid solution. For example, the metrology of process fluids, via the measurement of the index of refraction, is a viable and superior means of real-time, in situ quality control in pharmaceutical manufacturing [3]. For some applications, combining refractive index with a density measurement creates a simple yet even more powerful quality control technique. The reason is that such determinations are easy to implement and they can be fully automated [1-4].
Therefore, accurate mixture models are needed that describe the composition dependence of liquid density and refractive index. A recent review [5] considered predictive models linking the refractive index to the composition of binary liquid mixtures which included an alkane as one of the components. In this communication, an attempt is made to extend this analysis to a ternary mixture comprising associating compounds. The key findings of the previous study [5] underpin the analysis presented in this communication.

Theory
The molar volume is defined as the ratio of the molar mass to the density of a component:

$$V_i = \frac{M_i}{\rho_i} \tag{1}$$

where $V_i$, $M_i$ and $\rho_i$ are the molar volume, the molar mass and the density of component $i$ at the temperature of interest. The ideal solution is a thermodynamic concept. It assumes that the interactions between like and unlike molecules are the same and that there is no change in volume of the system on mixing. The latter condition implies that the molar volume of an ideal solution ($V$) is a linear combination of the molar volumes ($V_i$) of the constituents. For a ternary mixture, the relationship is

$$V = x_1 V_1 + x_2 V_2 + x_3 V_3 \tag{2}$$

with the mole fractions ($x_i$) of the individual components in the mixture serving as the weighting factors. Equation 2 is a very good approximation for real liquid mixtures made up of non-associating components. The average molar mass of a mixture is also given by such a linear combination. Therefore, if Eq. 2 is valid, the density of a ternary mixture is given by:

$$\rho = \frac{x_1 M_1 + x_2 M_2 + x_3 M_3}{x_1 V_1 + x_2 V_2 + x_3 V_3} \tag{3}$$

Note that this mixture model, for the density of an ideal solution, is defined by a Padé relationship. In this case it is the ratio of two first order (i.e. linear) polynomials. Furthermore, Eq. 3 implies that the mixture density is a weighted harmonic mean over the pure component densities with the mass fractions ($w_i$) serving as the weight factors:

$$\frac{1}{\rho} = \frac{w_1}{\rho_1} + \frac{w_2}{\rho_2} + \frac{w_3}{\rho_3} \tag{4}$$

The advantage of Eq. 4 is that it can be used to estimate the density of a blend made up by combining different mixtures of unknown compositions [6].
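The equivalence of the mole fraction form (Eq. 3) and the mass fraction form (Eq. 4) of the ideal density model can be checked numerically. The following is a minimal sketch only; the function names are hypothetical and the pure component values are rounded, illustrative numbers rather than the data measured in this study.

```python
def molar_volumes(M, rho):
    """Eq. 1: V_i = M_i / rho_i for each pure component."""
    return [Mi / ri for Mi, ri in zip(M, rho)]

def ideal_density_from_mole_fractions(x, M, rho):
    """Eq. 3: ratio of two linear polynomials in the mole fractions."""
    V = molar_volumes(M, rho)
    mean_molar_mass = sum(xi * Mi for xi, Mi in zip(x, M))
    mean_molar_volume = sum(xi * Vi for xi, Vi in zip(x, V))
    return mean_molar_mass / mean_molar_volume

def mass_fractions(x, M):
    """Convert mole fractions to mass fractions."""
    total = sum(xi * Mi for xi, Mi in zip(x, M))
    return [xi * Mi / total for xi, Mi in zip(x, M)]

def ideal_density_from_mass_fractions(w, rho):
    """Eq. 4: mass-fraction-weighted harmonic mean of the pure densities."""
    return 1.0 / sum(wi / ri for wi, ri in zip(w, rho))

# Assumed, rounded values near 25 °C (acetic acid, ethanol, water).
M = [60.05, 46.07, 18.02]     # g/mol
rho = [1.044, 0.785, 0.997]   # g/cm^3
x = [0.2, 0.3, 0.5]           # mole fractions, sum to one

rho_eq3 = ideal_density_from_mole_fractions(x, M, rho)
rho_eq4 = ideal_density_from_mass_fractions(mass_fractions(x, M), rho)
```

Both routes return the same number to within rounding error, which is why Eq. 4 can be applied to blends for which only mass fractions are known.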
The Lorentz-Lorenz parameter ($N_i$) is defined as follows [7]:

$$N_i = \frac{n_{Di}^2 - 1}{n_{Di}^2 + 2} \tag{5}$$

where $n_{Di}$ is the refractive index of component $i$. The molar refraction ($R_i$) is the product of the molar volume ($V_i$) and the Lorentz-Lorenz parameter ($N_i$):

$$R_i = V_i N_i \tag{6}$$

The molar refraction of a pure compound is virtually independent of the temperature [5,8]. This is consistent with the so-called "one third rule", applicable to hydrocarbons, which states that $R_i \approx M_i/3$ [8]. In theory, the molar refraction of an ideal solution is also additive over the constituent components with weighting on a mole fraction basis. This was confirmed experimentally for many real liquid mixtures [5,9]. For a ternary mixture which behaves like an ideal solution, it therefore holds that:

$$R = x_1 R_1 + x_2 R_2 + x_3 R_3 \tag{7}$$

Combining Eqs. 2, 6 and 7 yields the following ideal mixture model for the Lorentz-Lorenz parameter:

$$N = \frac{R}{V} = \frac{x_1 R_1 + x_2 R_2 + x_3 R_3}{x_1 V_1 + x_2 V_2 + x_3 V_3} \tag{8}$$

Equation 8 is also a Padé-type expression. It is equivalent to the Lorentz-Lorenz R-mixing rule [5]:

$$N = \phi_1 N_1 + \phi_2 N_2 + \phi_3 N_3 \tag{9}$$

where $\phi_i = x_i V_i / (x_1 V_1 + x_2 V_2 + x_3 V_3)$ is the volume fraction of component $i$ in the mixture.
The refractive index of the mixture can be calculated from the Lorentz-Lorenz parameter by re-arranging Eq. 5:

$$n_D = \sqrt{\frac{1 + 2N}{1 - N}} \tag{10}$$

It is possible to factor Eq. 7 into two components represented by Eq. 9 and Eq. 2. This yields an alternative mixture model for the molar refraction of an "ideal solution":

$$R = (\phi_1 N_1 + \phi_2 N_2 + \phi_3 N_3)(x_1 V_1 + x_2 V_2 + x_3 V_3) \tag{11}$$

Equation 11 expresses the molar refraction as a product of two linear polynomials. The first is a volume fraction-weighted arithmetic mean over the pure component $N_i$ values and the second is a mole fraction-weighted mean over the pure component molar volumes ($V_i$).
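The chain from refractive index to Lorentz-Lorenz parameter and back, and the equivalence of the mole fraction form (Eq. 8) and the volume fraction form (Eq. 9), can be illustrated with a short sketch. The function names are hypothetical and the molar volumes and refractive indices are rounded, assumed values, not the measurements reported here.

```python
import math

def ll_parameter(n):
    """Eq. 5: N = (n^2 - 1) / (n^2 + 2)."""
    return (n * n - 1.0) / (n * n + 2.0)

def refractive_index_from_ll(N):
    """Eq. 10: inverse of Eq. 5, n = sqrt((1 + 2N) / (1 - N))."""
    return math.sqrt((1.0 + 2.0 * N) / (1.0 - N))

def ideal_ll_mole_basis(x, V, n):
    """Eq. 8: N = sum(x_i R_i) / sum(x_i V_i) with R_i = V_i N_i (Eq. 6)."""
    R = [Vi * ll_parameter(ni) for Vi, ni in zip(V, n)]
    return (sum(xi * Ri for xi, Ri in zip(x, R))
            / sum(xi * Vi for xi, Vi in zip(x, V)))

def ideal_ll_volume_basis(x, V, n):
    """Eq. 9: volume-fraction-weighted mean of the pure N_i values."""
    v_mix = sum(xi * Vi for xi, Vi in zip(x, V))
    phi = [xi * Vi / v_mix for xi, Vi in zip(x, V)]
    return sum(p * ll_parameter(ni) for p, ni in zip(phi, n))

# Assumed, rounded values near 25 °C (acetic acid, ethanol, water).
V = [57.5, 58.7, 18.1]       # cm^3/mol
n = [1.370, 1.359, 1.333]    # refractive index
x = [0.2, 0.3, 0.5]

N_eq8 = ideal_ll_mole_basis(x, V, n)
N_eq9 = ideal_ll_volume_basis(x, V, n)
n_mix = refractive_index_from_ll(N_eq8)
```

The ideal mixture refractive index obtained this way necessarily lies between the lowest and highest pure component values, so it cannot reproduce the maxima at intermediate compositions that real binaries may show.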
Excess functions are commonly used to model deviations from linear composition dependences. This approach is best illustrated using the molar volume of a binary system as the example. First note that one variable suffices to describe the composition of a binary mixture. Redlich and Kister [10] proposed the use of $z_{12} = x_1 - x_2$. This definition maps the composition range into the finite interval $z_{12} \in [-1, 1]$. Next, an "excess function" is defined by subtracting the ideal behaviour from the actual function:

$$V^E = V - (x_1 V_1 + x_2 V_2) \tag{12}$$

The excess function is then fitted to experimental data for binary mixtures using a truncated Maclaurin series expansion in the variable $z_{12}$ according to the following polynomial expansion [10]:

$$V^E = x_1 x_2 \sum_{k=0}^{m} A_k z_{12}^{k} \tag{13}$$

where the $A_k$ are adjustable coefficients. This approach apparently works very well when attention is limited to the correlation of binary mixture data. It is not straightforward to extend this idea to multicomponent mixtures. Many proposals have been made and they were comprehensively reviewed [11,12]. Unfortunately, they are mostly limited to extensions that are valid for ternary mixtures at best.
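For comparison with the models proposed later, a Redlich-Kister excess function for a binary can be evaluated as sketched below. This is only an illustration of the conventional approach: the function name is hypothetical and the coefficients are made-up numbers, not fitted values.

```python
def redlich_kister_excess(x1, coeffs):
    """Excess property of a binary: x1 * x2 * sum_k A_k * (x1 - x2)**k."""
    x2 = 1.0 - x1
    z12 = x1 - x2                      # lies in [-1, 1]
    series = sum(a * z12 ** k for k, a in enumerate(coeffs))
    return x1 * x2 * series

# Hypothetical coefficients A_0, A_1, A_2 (e.g. in cm^3/mol).
coeffs = [-4.0, 1.0, -0.5]
excess_at_midpoint = redlich_kister_excess(0.5, coeffs)   # equals A_0 / 4
```

The prefactor `x1 * x2` forces the excess to vanish at both pure component limits, and at the equimolar point only the leading coefficient contributes.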
This communication outlines a different approach for dealing with deviations from an expected "ideal solution" behaviour. The proposal considers the idea of replacing the linear polynomials used in the mixture models by higher order Scheffé K-polynomials. Compared to the conventional Scheffé S-polynomials [13], they offer attractive symmetry, compact notation and, additionally, they are homogeneous in the composition variables [14]. Furthermore, the maximum eigenvalue for the information matrix for quadratic K-models is always smaller than that for the S-polynomials, implying that the former are less prone to ill-conditioning [15]. The Scheffé K-polynomials are, in essence, mathematically consistent mixture models based on ordinary Taylor polynomials that take the simplex constraint into account. For a ternary mixture this amounts to: $x_1 + x_2 + x_3 = 1$. The present proposal is illustrated using ternary mixtures as examples. However, the concept is applicable to mixtures containing any number of components. The aim was to develop mixture models that are fully capable of correlating highly non-linear composition dependence with a minimum number of parameters which, additionally, are fully determinable from binary data alone.
Towards these goals, it is constructive to first consider a mixture model for an "ideal solution" that takes the form of a linear blending rule. A good example is the molar refraction as given by Eq. 7. To account for deviations from this linear "ideal solution" model, higher order Scheffé K-polynomials could be considered as they should be able to provide better representation of actual experimental data. Lower order K-polynomials may suffice, for example, the second order (or quadratic) ($K_2$) Scheffé K-polynomial. For a ternary mixture it takes the form [14,16]:

$$R = \sum_{i=1}^{3}\sum_{j=1}^{3} R_{ij}\,x_i x_j = \mathbf{x}^{T}[R_{ij}]\,\mathbf{x} \tag{14}$$

where $[R_{ij}]$ is a matrix of binary coefficients and $\mathbf{x} = (x_1, x_2, x_3)$ is the vector defining the composition of the ternary mixture. Note that, mathematically, the second order Scheffé polynomial is equivalent to a quadratic form in the simplex domain. Equation 14 introduces adjustable interaction constants in order to improve the correspondence between actual experimental data and model predictions.
The third order (or cubic) ($K_3$) Scheffé K-polynomial is similarly defined [14,16]:

$$\begin{aligned} R &= \sum_{i=1}^{3}\sum_{j=1}^{3}\sum_{k=1}^{3} R_{ijk}\,x_i x_j x_k \\ &= R_1 x_1^3 + R_2 x_2^3 + R_3 x_3^3 + 3R_{112}x_1^2 x_2 + 3R_{122}x_1 x_2^2 + 3R_{113}x_1^2 x_3 \\ &\quad + 3R_{133}x_1 x_3^2 + 3R_{223}x_2^2 x_3 + 3R_{233}x_2 x_3^2 + 6R_{123}x_1 x_2 x_3 \end{aligned} \tag{15}$$

In the cubic expression of Eq. 15, and in the nth-order polynomials, the adjustable interaction parameters $R_{ijk\ldots}$ are distinguished by the indicated set of ordered n indices of the form ijk…, with i ≤ j ≤ k ≤ … indicating that that particular parameter is multiplied by the mole fractions corresponding to the component types i, j, k, … For example, $3R_{112}$ is the coefficient for the term $x_1^2 x_2$ in a third order polynomial. The numerical value in front of the adjustable parameter is a multinomial coefficient which arises from the number of different permutations possible of the indices. For example, there are three different possibilities 112, 121 and 211 for $R_{112}$ and therefore there is a "3" in front of this coefficient. In this formalism, the number of indices corresponds to the order of the polynomial in question and all of the $R_j = R_{jj} = R_{jjj} = \ldots$ represent the physical property value of pure component j.
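The role of the permutation multipliers can be made concrete with a small evaluator for Scheffé K-polynomials of any order. This is a sketch under stated assumptions: the names `scheffe_k` and `linear_blend_coefficients` are hypothetical, coefficients are stored under sorted zero-based index tuples, and the pure component values are made up.

```python
from itertools import combinations_with_replacement, product

def scheffe_k(x, coeff, order):
    """Evaluate a homogeneous Scheffé K-polynomial of the given order.

    coeff maps a sorted index tuple, e.g. (0, 0, 1) for R_112, to its value.
    Summing over every ordered index tuple reproduces the multipliers
    (2, 3, 6, ...) that appear in the expanded form.
    """
    total = 0.0
    for idx in product(range(len(x)), repeat=order):
        term = coeff[tuple(sorted(idx))]
        for i in idx:
            term *= x[i]
        total += term
    return total

def linear_blend_coefficients(pure, order):
    """Coefficients for which the polynomial collapses to sum(x_i * pure_i)."""
    return {idx: sum(pure[i] for i in idx) / order
            for idx in combinations_with_replacement(range(len(pure)), order)}

pure = [13.0, 12.9, 3.7]    # hypothetical pure component property values
x = [0.2, 0.3, 0.5]         # mole fractions, sum to one
linear = sum(xi * pi for xi, pi in zip(x, pure))
quadratic = linear_blend_coefficients(pure, 2)
cubic = linear_blend_coefficients(pure, 3)

# Perturbing the ternary coefficient by one shifts the prediction by
# 6 * x1 * x2 * x3, the multiplier of R_123 in the cubic polynomial.
perturbed = dict(cubic)
perturbed[(0, 1, 2)] += 1.0
shift = scheffe_k(x, perturbed, 3) - scheffe_k(x, cubic, 3)
```

When every interaction coefficient equals the arithmetic mean of the pure component values it connects, both the quadratic and the cubic forms reduce to the linear blending rule, which is a convenient sanity check for any implementation.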
An issue associated with the cubic and higher Scheffé polynomials is the appearance of ternary and higher order interaction coefficients. In theory, this implies the need to have access to ternary or multicomponent data in order to assign values to these parameters. From a thermodynamic perspective, the behaviour of multicomponent mixtures is naturally affected by the interactions of unlike molecules. A common conjecture, made in chemical engineering, is that only binary interactions between the molecular species need to be considered in mixtures [17,18]. This means that it should in principle be possible to assign a value to the ternary constant on the basis of the parameters defining the behaviour of the constituent binary mixtures [11]. The double weighted power mean mixture model (DWPM3) makes this possible [19]. It includes, as a special case, a binary-predictive cubic expression which can be cast in the following form:

$$R = \sum_{i=1}^{3} x_i \Big(\sum_{j=1}^{3} r_{ij}\,x_j\Big)^{2} \tag{16}$$

This expression is equivalent to Eq. 15 with the link between the coefficients given by the general expression [19]:

$$3R_{ijk} = r_{ij}r_{ik} + r_{ji}r_{jk} + r_{ki}r_{kj} \tag{17}$$

In the model defined by Eqs. 16 and 17, the physical properties are encoded entirely in terms of the binary parameters $r_{ij}$ with $R_i = r_{ii}^2$. Note that in that context, the ternary constant is fully expressed in terms of combinations of the binary coefficients, i.e. $3R_{123} = r_{12}r_{13} + r_{21}r_{23} + r_{31}r_{32}$. Two polynomials of the same order can only be equal to each other if all of the corresponding coefficients are identical. This means that, for each separate ternary subsystem present in a mixture, the corresponding ternary coefficient can be estimated from the corresponding binary information by performing data reduction using the model defined by Eq. 16.
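The claim that the ternary constant follows from binary parameters can be verified numerically. The sketch below assumes the binary-predictive cubic has the form R = Σ_i x_i (Σ_j r_ij x_j)², which is the form consistent with R_i = r_ii² and with the stated expression for 3R_123; the function names and the r matrix are hypothetical.

```python
from itertools import product

def dwpm3(x, r):
    """Binary-predictive cubic: sum_i x_i * (sum_j r[i][j] * x_j)**2."""
    return sum(xi * sum(rij * xj for rij, xj in zip(row, x)) ** 2
               for xi, row in zip(x, r))

def cubic_coefficient(r, i, j, k):
    """Link to the cubic K-polynomial: 3 R_ijk = r_ij r_ik + r_ji r_jk + r_ki r_kj."""
    return (r[i][j] * r[i][k] + r[j][i] * r[j][k] + r[k][i] * r[k][j]) / 3.0

def cubic_scheffe(x, r):
    """Cubic K-polynomial evaluated as a sum over all ordered index triples."""
    q = len(x)
    return sum(cubic_coefficient(r, i, j, k) * x[i] * x[j] * x[k]
               for i, j, k in product(range(q), repeat=3))

# Hypothetical binary parameters; r[i][j] need not equal r[j][i].
r = [[3.60, 3.40, 2.90],
     [3.50, 3.59, 2.80],
     [2.70, 2.60, 1.92]]
x = [0.2, 0.3, 0.5]

value_dwpm3 = dwpm3(x, r)
value_cubic = cubic_scheffe(x, r)
ternary_constant = cubic_coefficient(r, 0, 1, 2)   # R_123 from binary data only
```

The two evaluations agree, and `ternary_constant` is obtained without any ternary measurement, which is the property that makes the cubic model binary-predictive.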
The Scheffé polynomials have a very useful property of forming a nested set of equations [16]. The overall ternary model is determined by the behaviour of the constituent binaries as demanded by Chou [11,12]. Consider a ternary mixture in which the 1-2 binary follows the linear blending rule while the data trends for the 1-3 binary are adequately described by a quadratic Scheffé polynomial. If a cubic polynomial is necessary for the 2-3 binary, then the overall ternary behaviour is such that a cubic Scheffé K-polynomial must be implemented. The lower order polynomials, which provided adequate correlation performance for the other binaries, are readily converted into cubic forms without introducing additional parameters:

$$(R_1 x_1^2 + 2R_{13}x_1 x_3 + R_3 x_3^2)(x_1 + x_3) = R_1 x_1^3 + (R_1 + 2R_{13})x_1^2 x_3 + (2R_{13} + R_3)x_1 x_3^2 + R_3 x_3^3$$

$$(R_1 x_1 + R_2 x_2)(x_1 + x_2)^2 = R_1 x_1^3 + (2R_1 + R_2)x_1^2 x_2 + (R_1 + 2R_2)x_1 x_2^2 + R_2 x_2^3$$

The second order polynomial is promoted by multiplying it with unity, i.e. with $x_1 + x_3 = 1$, and the linear model by multiplying it with the square of the corresponding sum, i.e. with $(x_1 + x_2)^2 = 1$. The advantage of the nesting property is that the number of adjustable coefficients in the overall model can be kept to a minimum. In this way overfitting can be prevented.
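The promotion of lower order binary models to cubic form can be sketched as follows. The helper names are hypothetical and the numerical values are invented for illustration; the cubic coefficients are obtained by multiplying the lower order polynomial with the appropriate power of the sum of the two mole fractions and collecting terms.

```python
def quadratic_binary(xa, Ra, Rb, Rab):
    """Quadratic K-polynomial along the a-b binary (xa + xb = 1)."""
    xb = 1.0 - xa
    return Ra * xa ** 2 + 2.0 * Rab * xa * xb + Rb * xb ** 2

def cubic_binary(xa, Ra, Rb, Raab, Rabb):
    """Cubic K-polynomial along the a-b binary (xa + xb = 1)."""
    xb = 1.0 - xa
    return (Ra * xa ** 3 + 3.0 * Raab * xa ** 2 * xb
            + 3.0 * Rabb * xa * xb ** 2 + Rb * xb ** 3)

def promote_quadratic(Ra, Rb, Rab):
    """Multiply the quadratic by (xa + xb) = 1 and collect cubic coefficients."""
    return (Ra + 2.0 * Rab) / 3.0, (2.0 * Rab + Rb) / 3.0

def promote_linear(Ra, Rb):
    """Multiply the linear rule by (xa + xb)**2 = 1 and collect cubic coefficients."""
    return (2.0 * Ra + Rb) / 3.0, (Ra + 2.0 * Rb) / 3.0

# Hypothetical values for a 1-3 binary (quadratic) and a 1-2 binary (linear).
R1, R2, R3, R13 = 13.0, 12.9, 3.7, 8.9
R113, R133 = promote_quadratic(R1, R3, R13)
R112, R122 = promote_linear(R1, R2)
grid = [0.0, 0.1, 0.35, 0.5, 0.8, 1.0]
```

No new adjustable parameter appears in either promotion: the cubic coefficients are fixed functions of the parameters already present in the lower order model.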
Focke and Du Plessis [16] proposed Padé-style rational extensions of Scheffé K-polynomials. They defined Padé approximant-based mixture models in terms of the ratio of two homogeneous Scheffé K-polynomials. This idea is conveniently illustrated by considering higher order Padé approximants for the Lorentz-Lorenz parameter N. The expression for an "ideal" solution is given by Eq. 8. In order to account for nonlinear deviations from this model, the linear expressions in the numerator and denominator of Eq. 8 can be substituted with higher order Scheffé polynomials. For example, let N be defined by the ratio of a cubic Scheffé polynomial to a quadratic one:

$$N = P^{3}_{2}([r_{ij}], [V_{ij}], \mathbf{x}) = \frac{\sum_{i} x_i \big(\sum_{j} r_{ij}\,x_j\big)^{2}}{\sum_{i}\sum_{j} V_{ij}\,x_i x_j}$$

In this equation, $P^{3}_{2}$ indicates a Padé approximation corresponding to a cubic K-polynomial in the numerator (as defined by Eq. 16) and a quadratic one in the denominator; $[r_{ij}]$ represents a set of binary interaction coefficients for the molar refraction; $[V_{ij}]$ represents a matrix of binary coefficients for the molar volume, and $\mathbf{x}$ is the vector of mole fractions. The equation is a binary-predictive expression since the numerator polynomial is defined by Eq. 16, for which the ternary constant is defined by the binary interaction coefficients.
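A Padé-type model of this kind, with a binary-predictive cubic numerator and a quadratic denominator, might be evaluated as in the sketch below. The numerator form Σ_i x_i (Σ_j r_ij x_j)² is assumed, and all names and numbers are hypothetical. The example also shows one choice of parameters for which the model collapses to the ideal expression of Eq. 8.

```python
import math

def dwpm3(x, r):
    """Cubic numerator: sum_i x_i * (sum_j r[i][j] * x_j)**2."""
    return sum(xi * sum(rij * xj for rij, xj in zip(row, x)) ** 2
               for xi, row in zip(x, r))

def quadratic_k(x, Vm):
    """Quadratic denominator: sum_i sum_j Vm[i][j] * x_i * x_j."""
    q = len(x)
    return sum(Vm[i][j] * x[i] * x[j] for i in range(q) for j in range(q))

def pade_cubic_over_quadratic(x, r, Vm):
    """Lorentz-Lorenz parameter as the ratio of the two polynomials."""
    return dwpm3(x, r) / quadratic_k(x, Vm)

def ideal_parameters(R, V):
    """Parameter values that reduce the Padé model to Eq. 8."""
    r = [[math.sqrt(Ri)] * len(R) for Ri in R]          # r_ij = sqrt(R_i)
    Vm = [[0.5 * (Vi + Vj) for Vj in V] for Vi in V]    # V_ij = (V_i + V_j) / 2
    return r, Vm

R = [13.0, 12.9, 3.7]     # hypothetical pure molar refractions
V = [57.5, 58.7, 18.1]    # hypothetical pure molar volumes
x = [0.2, 0.3, 0.5]

r_ideal, V_ideal = ideal_parameters(R, V)
N_pade = pade_cubic_over_quadratic(x, r_ideal, V_ideal)
N_ideal = (sum(xi * Ri for xi, Ri in zip(x, R))
           / sum(xi * Vi for xi, Vi in zip(x, V)))
```

Starting a regression from these "ideal" parameter values is one possible design choice, since the fit then begins at the reference model and departs from it only as far as the binary data require.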
A similar stratagem can be used to increase the correlating power for a model that is expressed in terms of the product of two polynomials, e.g. the molar refraction R as defined by Eq. 11. The highest order considered presently for this property is given by Eq. 18.

In summary, the present proposal for dealing with deviations from the composition behaviour of the physical properties expected for "ideal solution" mixture models is to replace first order polynomials with second or third order Scheffé K-polynomials. This introduces adjustable parameters into the models that can be tuned to improve agreement with actual data trends. On the other hand, there are also special parameter values that allow higher order forms to be reduced to lower order forms [16].

Materials and Methods
All the chemicals were obtained from ACE chemicals, South Africa. According to the supplier, the purity of the acetic acid was 99.7% and that of the anhydrous ethanol was 99.9%. Deionized water was purchased from Merck. All chemicals were used as received without further purification.
The refractive index values of the pure components were measured at 25 °C using a Mettler Toledo R4 instrument. The calibration of the instrument was checked using double distilled and deionized water. The instrument precision was ± 0.0001 and repeatability was ± 0.0002 refractive index units. Sample amounts were weighed out into glass bottles, sealed and allowed to stand for at least one day before characterization. Pipettes were used to place sample amounts into the instrument cell.
Density measurements were performed at 25 °C on an Anton Paar DSA 5000 M density and sound velocity meter. The procedure was as follows: The unit was set to the measurement temperature. It was then flushed by injecting approximately 5-10 mL of cleaning acetone into the feed entry port using a gas-tight syringe. The built-in pump was switched on and the syringe with acetone was removed from the feed entry port. The tubing leading from the pump was then connected to the feed entry port. This flushed out the liquid in the densitometer using compressed air. The pump was left to run for approximately one minute before being switched off and the tubing removed. This process was repeated three times in order to clean and remove any impurities in the unit. The U-tube was then checked visually to ensure no liquid bubbles remained in the device. Approximately 5-10 mL of the sample being measured was injected into the feed entry port via a syringe and thereafter flushed out with the pump. This was repeated. Approximately 5-10 mL of the sample being measured was then injected into the feed entry port via a gas-tight syringe. The system was allowed to reach thermal equilibrium. When the measurement was complete, the relevant readings were recorded. Each injection was measured at least three times to ensure that the data were repeatable. The cleaning procedure with acetone was then repeated.
The capacity of the various mixture models to represent experimental information was tested using the experimental density and refractive index data, obtained at 25 °C, for the highly non-ideal ternary system acetic acid-ethanol-water. This system was chosen because all three components are able to form association complexes through hydrogen bonding interactions with themselves, as well as with each other. Parameter values were determined using least-squares regression. The Akaike information criterion (AIC) [20] was used to rank the relative performance of the models. The governing equation is

$$\mathrm{AIC} = n \ln\!\left(\frac{SSE}{n}\right) + 2\,df \tag{19}$$

where n is the number of data points, SSE is the sum of the squared errors, and df is the degrees of freedom, i.e. the number of parameters that are fitted. To compare models, it is only the difference between the AIC values that matters. On taking differences, the units cancel out and the result is unitless:

$$\Delta\mathrm{AIC} = \mathrm{AIC}_{\mathrm{ref}} - \mathrm{AIC}_{\mathrm{model}} = n \ln\!\left(\frac{SSE_{\mathrm{ref}}}{SSE_{\mathrm{model}}}\right) - 2\,(df_{\mathrm{model}} - df_{\mathrm{ref}}) \tag{20}$$

In each case the "ideal solution" expressions, which did not feature any adjustable parameters, served as the reference models, so that a larger ΔAIC indicates a better model. Equation 20 makes intuitive sense as it balances the change in goodness-of-fit, as assessed by the sum-of-squares, with the change in the degrees of freedom (due to differences in the number of parameters in the models being compared).
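The ranking calculation can be sketched in a few lines. This assumes the common least-squares form of the criterion, AIC = n ln(SSE/n) + 2 df, and a sign convention in which the parameter-free "ideal solution" model is the reference and larger ΔAIC is better; the function names and the numbers are hypothetical.

```python
import math

def aic(n, sse, df):
    """Least-squares form of the Akaike information criterion (assumed)."""
    return n * math.log(sse / n) + 2 * df

def delta_aic(n, sse_ref, df_ref, sse_model, df_model):
    """Reference minus model, so that a larger value means a better model."""
    return aic(n, sse_ref, df_ref) - aic(n, sse_model, df_model)

# Made-up numbers: 60 data points, reference model without fitted parameters.
n_points = 60
improved = delta_aic(n_points, 2.4e-3, 0, 1.2e-5, 3)    # better fit, 3 parameters
no_gain = delta_aic(n_points, 2.4e-3, 0, 2.4e-3, 3)     # same fit, 3 parameters
```

The second case shows the penalty at work: parameters that do not reduce the sum of squared errors lower ΔAIC by twice their number.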
Table 1 lists fourteen additional ternary systems for which density and refractive index values at 25 °C were published. These data sets were also subjected to analysis in order to verify the utility of the proposed mixture models. Table 2 lists the models considered for each of the three physical properties investigated, i.e. the density (ρ), the molar refraction (R), and the Lorentz-Lorenz parameter (N). First, the full data set for each system was regressed to test how well the models perform globally in terms of the Akaike information criterion. Next, only the binary data were used to fix the model parameters using least-squares regression. The power to represent the binary data, and additionally to accurately predict the ternary results, was then quantified in terms of the ΔAIC measure. For each model, the latter was calculated using the SSE values corresponding to the predictions for both the binary and the ternary data.

Results and Discussion
The utility of the proposed mixture models was first tested using the density and refractive index data for the acetic acid-ethanol-water ternary system obtained at a temperature of 25 °C. Table 3 lists the presently measured density and refractive index data of the pure components, the binary and the ternary mixtures. Table 4 provides additional refractive index data. The actual data regression analysis was performed on an augmented data set which included published results from the sources listed in Table 1. Figure 1 shows plots of the full binary data sets considered for data regression for mixture density and refractive index.
Figure 1a shows that, in the acetic acid-water binary, the density reaches a maximum value of ca. 1.065 g·cm⁻³ at an equimolar composition. In the acetic acid-ethanol binary system, the density varies almost linearly with mole fraction. In contrast, the density of ethanol-water mixtures shows negative deviations from a mole fraction-based linear blending rule. Figure 1b shows that, for all three binaries, the refractive index reaches maximum values at intermediate compositions. Figure 2 shows plots of the experimental molar volumes and molar refractions. According to Eq. 2 and Eq. 7, the expectation is that these properties should vary linearly with mole fraction. A strong linear correlation is indeed evident. This indicates that, to a first approximation, the values for binary mixtures can be estimated from pure component properties. Taken as a whole over all the data values considered, the maximum absolute deviations from the linear relationships for the molar volume and the molar refraction were just 3.6% and 3.2% respectively. The average absolute deviation was 1.8% for both properties. Figure 2b indicates that the acetic acid-ethanol binary does conform to the mass fraction-weighted harmonic mean indicated by Eq. 4. However, the other two binaries show significant deviations from this "ideal solution" dependence. Figure 2d reveals that the Lorentz-Lorenz relationship of Eq. 9 does not hold at all for any of the three binaries. These pronounced deviations from expected "ideal solution" data trends necessitated the exploration of more complex mixture models.
Table 5 reports the results obtained for the different mixture models with respect to the ΔAIC values, the maximum absolute deviations (MAD) and the average absolute deviations (AAD) between experimental and predicted values. The mixture density (ρ) and the Lorentz-Lorenz parameter (N) were correlated using the proposed Padé extensions. The molar refraction was correlated using Scheffé polynomials and variations of Eq. 18. Table 5 also compares the results for data correlation using the full data set with those obtained when only the binary data were used. However, in both cases, the reported MAD and AAD values reflect the results for the full data set. The best-fit parameter values obtained on fitting the full data are presented in Table 6.
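The two deviation measures can be computed as sketched below. The sketch assumes that MAD and AAD are relative deviations expressed as percentages of the experimental values, which matches the percentage figures quoted in the text but is not stated explicitly; the function name and the data are hypothetical.

```python
def deviation_stats(experimental, predicted):
    """Return (MAD, AAD) as percentages of the experimental values (assumed)."""
    relative = [abs(p - e) / abs(e) * 100.0
                for e, p in zip(experimental, predicted)]
    return max(relative), sum(relative) / len(relative)

# Made-up experimental and predicted densities.
experimental = [1.000, 2.000, 4.000]
predicted = [1.010, 1.980, 4.000]
mad, aad = deviation_stats(experimental, predicted)
```

MAD is controlled by the single worst point, whereas AAD reflects the typical error, so a model with a good AAD can still have a poor MAD if a small region of composition space is badly predicted.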
For the mixture density of the acetic acid-ethanol-water ternary, the highest ΔAIC values were obtained for the […] model. This was the case irrespective of whether the full data set or only the binary information was used to fix the model parameters. On applying this model to correlate the binary data alone, it was determined that MAD = 0.500% and AAD = 0.057%. Increasing the order of the denominator polynomial increased the number of adjustable parameters by three, at the cost of a lower ΔAIC and an increase in both MAD and AAD. This deterioration was due to a significantly poorer prediction of the ternary density values. In contrast, for the Lorentz-Lorenz parameter, the […] model with twelve parameters proved best irrespective of whether the full data set or just the binary information was used to fix the adjustable model parameters. However, for the molar refraction, best performance was also achieved with a lower order model. […] As before, the reason for this was that the latter two models performed worse at predicting ternary data from binary information. The implication is that it is possible for lower order Padé-based and polynomial product-based models to be better equipped at predicting ternary performance on the basis of binary data than higher order models. Figure 3 shows predicted mixture properties plotted against experimental results. In each case, the predictions of the model with the highest ΔAIC are shown. The parameter values for the best-fit models are presented in Table 7.
The proposed model approaches were further tested using the additional ternary systems listed in Table 1. Detailed results on model performance are presented in the Supplementary Material. The magnitude of the Akaike information criterion depends on the number of data points. Unfortunately, these differed for the various ternary data sets. This complicates global comparisons when attempting to determine which model performs best overall for each of the three property values considered. In an attempt to remedy this, it was decided to use scaled Akaike information criteria instead of just considering the actual values obtained. For each ternary data set, the ΔAIC obtained for each model was scaled using the highest ΔAIC value found. The primary objective was to explore the viability of predicting ternary behaviour from knowledge of binary data. Therefore the ΔAIC/ΔAIC_max values were calculated using the results obtained for the case where the model parameters were fixed on the basis of binary information. The results are presented in Table 7 in the form of ΔAIC/ΔAIC_max values averaged per model over all fifteen data sets. Interestingly, in the Supplementary Information, evidence is presented that models with more parameters in some cases perform worse at predicting ternary property values than the "ideal solution" relationships. It is clear from Table 7 that, at least for the data sets explored presently, there exist mixture models that consistently outperform all others when the objective is to predict ternary property values using only the binary data to fix model parameters. The most effective mixture models, in that context, were found to be:

Density: the ratio of a quadratic Scheffé K-polynomial in molar mass to a linear polynomial in molar volume.
Molar refraction: the special cubic polynomial in which the ternary constant is expressed in terms of binary parameters.
Lorentz-Lorenz parameter: the ratio of two quadratic Scheffé K-polynomials, with molar refraction in the numerator and molar volume in the denominator.

In Table 7, the values for the best performing model for each property are highlighted in bold. The values for each ternary system were expressed as fractions of the largest ΔAIC value and averaged over the fifteen different data sets. Only the binary data were used to fix the model parameters.
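The scaling and averaging used for the global comparison can be sketched as follows. The function name, the model labels and all numbers are hypothetical, and the sketch assumes that the best ΔAIC in every data set is positive.

```python
def scaled_mean_delta_aic(delta_aic_by_system):
    """Average ΔAIC/ΔAIC_max per model over all data sets.

    delta_aic_by_system maps a data set label to {model label: ΔAIC}.
    """
    ratios = {}
    for scores in delta_aic_by_system.values():
        best = max(scores.values())        # ΔAIC_max for this data set
        for model, value in scores.items():
            ratios.setdefault(model, []).append(value / best)
    return {model: sum(vals) / len(vals) for model, vals in ratios.items()}

# Made-up ΔAIC values for two data sets and two models.
example = {
    "system A": {"model 1": 100.0, "model 2": 50.0},
    "system B": {"model 1": 20.0, "model 2": 80.0},
}
averages = scaled_mean_delta_aic(example)
ranking = sorted(averages, key=averages.get, reverse=True)
```

Dividing by the best value in each data set removes the dependence on the number of data points, so a model that is consistently close to the best scores near one regardless of how large the individual data sets are.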

Conclusions
Published density and refractive index data, obtained at 25 °C, for the ternary system acetic acid-ethanol-water were augmented with additional measurements. This data set was used to test a suite of novel mixture models designed to predict multicomponent behaviour from knowledge of pure component properties and binary information. The proposed models were constructed using quadratic and/or cubic Scheffé K-polynomials as such, as products or as ratios (i.e. as Padé approximants). Based on the Akaike information criterion, the maximum absolute deviation and the mean absolute deviation of predictions from actual values, the following models were found to best represent the experimental data for a set of fifteen ternary data sets:

Density: Ratio of a quadratic Scheffé K-polynomial in molar mass to a linear polynomial in molar volume.
Lorentz-Lorenz parameter (N), as a transformation of the refractive index: Ratio of two quadratic Scheffé K-polynomials, with the polynomial in molar refraction as numerator and the polynomial in molar volume as denominator.
Molar refraction: A cubic polynomial in molar volume. In this special cubic polynomial, the ternary constant is expressed in terms of binary parameters, as in Eq. 16.

The proposed models represent an alternative to the conventional excess property approach of correlating experimental mixture property data.

Fig. 1 Fig. 2
Fig. 1 The variation of (a) the density, and (b) the refractive index with composition for the binary mixtures as measured at 25 °C. Experimental values are plotted as triangles, circles, and lozenges. Open symbols represent literature data while filled symbols represent present measurements.

Fig. 3 Table 7
Fig. 3 Plots of predicted vs. experimental values for (a) density, (b) molar refraction, and (c) the refractive index (via the Lorentz-Lorenz parameter N). In each case, the model parameters were determined using the binary data alone. The open symbols represent published data while the filled symbols represent present measurements.

Table 1
List of data sources for density and refractive index for pure components, binary and ternary mixtures. All the data were obtained at a measurement temperature of 298.15 K.

Table 2
Mixture model expressions for density, molar refraction and the Lorentz-Lorenz parameter

Table 3
Density and refractive index data, obtained at 25 °C, for the ternary system acetic acid (1)-ethanol (2)-water (3)

Akaike information criteria instead of just considering the actual values obtained. For each ternary data set, the ΔAIC obtained for each model was scaled using the highest ΔAIC value found. The primary objective was to explore the viability of predicting ternary behaviour from knowledge of binary data. Therefore the ΔAIC/ΔAIC_max