Introduction

Solid solution models are an essential component in modelling geological, metallurgical and other chemical processes. The motivation behind this paper is to review some mathematical background defining the structure and properties of commonly used solution models, and to introduce tools that aid in their construction and manipulation. We concentrate on solution models which have a constant number of sites per formula unit, and where interactions are dependent only on the total proportions of species occupying each site. This type of solution was pioneered by Bragg and Williams in the 1930s as a way to model long-range ordering (Bragg and Williams 1934, 1935; Williams 1935). Such models do not consider local interactions, such as bonding between pairs or clusters of species (Bethe 1935; Inden et al. 2001; Kikuchi and Masuda-Jindo 2002).

Many readers will be familiar with the key petrological concepts described in this paper. After all, the importance and consequences of solution chemistry and order-disorder reactions on mineral properties were already understood in the 1970s (Grover and Orville 1969; Thompson and James 1969; Wood and Nicholls 1978). However, mathematical descriptions of the models underlying the petrological concepts are usually explained briefly, or for only specific cases. In this paper, we provide a more complete description. We particularly emphasise the difference between the site-species occupancy space and compositional space of solid solutions, a distinction which is fundamental to modelling order-disorder processes in natural systems.

General descriptions of substitutional solid solutions

Site-species and solution constraints

Substitutional solid solutions can be written in the form:

$$\begin{aligned}{}[\mathrm{A, B}]^\mathrm{Y}_\mathrm{y}[\mathrm{B, C, D}]^\mathrm{Z}_\mathrm{z}\ldots \end{aligned}$$
(1)

where the square brackets denote distinct sites in the structure, and the comma-separated lists (A, B...) are the species that can occupy each site. These species can be elemental ions (e.g. Mg\(^{2+}\)), multielement species (e.g. OH\(^{-}\)) or vacancies (v). In the example above, species A and B can occupy Site Y, which occurs y times per formula unit (i.e. the site has a multiplicity of y per formula unit), and species B, C and D can occupy Site Z, which occurs z times per formula unit. We will refer to the species which can occupy a given site as site-species. There are five site-species in Formula 1: A\(^\text {Y}\), B\(^\text {Y}\), B\(^\text {Z}\), C\(^\text {Z}\) and D\(^\text {Z}\) (\(n_\text {site-species} = 5\)).

We denote the site-species occupancies of any instance of a solid solution by the vector \(\varvec{x}\), where each element \(x_i\) of the vector corresponds to the fractional occupancy of a species on a particular site. A simple one-site model of pyrope-almandine-grossular garnet [Mg,Fe,Ca]\(^\text {A}_{3}\) Al\(_{2}\) Si\(_{3}\)O\(_{12}\) is then uniquely described by a vector of length 3: [\(x_\text {MgA}\), \(x_\text {FeA}\), \(x_\text {CaA}\)]. The site-species occupancies of substitutional solid solutions must satisfy three types of constraints:

  1. All of the site-species occupancies must be equal to or greater than zero, i.e.

    $$\begin{aligned} x_i \ge 0 \text { for all } i \end{aligned}$$
    (2)

  2. The total fractional occupancy on each site must equal one, i.e.

    $$\begin{aligned} \sum _{i \in S} x_i = 1 \text { for all sites } S \end{aligned}$$
    (3)

    All solutions have \(n_\text {sites}\) of these constraints.

  3. The composition must be charge balanced, i.e.

    $$\begin{aligned} \sum _i m_i c_i x_i = c_\text {total} \end{aligned}$$
    (4)

    where \(m_i\) and \(c_i\) are the site multiplicity and ionic charge of site-species i, and \(c_\text {total}\) is the total charge over all sites required to create a neutral solution. This constraint will already be satisfied by the site constraints if each site only contains species with a common charge.
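The three constraint types can be checked mechanically for any candidate occupancy vector. The following is a minimal sketch, not the implementation accompanying the paper: the function name `is_valid_occupancy` and its argument layout are hypothetical, and the example assumes the one-site pyrope-majorite garnet Mg\(_{3}\)[Mg,Al,Si]\(_{2}\)Si\(_{3}\)O\(_{12}\), whose Y site must carry a total charge of +6.

```python
from fractions import Fraction as F

def is_valid_occupancy(x, sites, multiplicities, charges, c_total):
    """Check Eqs. 2-4 for a site-species occupancy vector x.

    sites[i] is the index of the site hosting site-species i,
    multiplicities[s] is the multiplicity of site s, and
    charges[i] is the ionic charge of site-species i.
    """
    # Eq. 2: every occupancy is non-negative
    if any(xi < 0 for xi in x):
        return False
    # Eq. 3: occupancies on each site sum to one
    for s in set(sites):
        if sum(xi for xi, si in zip(x, sites) if si == s) != 1:
            return False
    # Eq. 4: multiplicity-weighted charges sum to the required total
    total = sum(multiplicities[si] * ci * xi
                for xi, si, ci in zip(x, sites, charges))
    return total == c_total

# Site-species order [Mg, Al, Si], all on the Y site (multiplicity 2).
# Mg3 Si3 O12 carries 6 + 12 - 24 = -6, so the Y site must supply +6.
sites = [0, 0, 0]
multiplicities = [2]
charges = [2, 3, 4]
c_total = 6

pyrope = [F(0), F(1), F(0)]
majorite = [F(1, 2), F(0), F(1, 2)]
all_mg = [F(1), F(0), F(0)]  # satisfies Eqs. 2 and 3 but not Eq. 4

valid = [is_valid_occupancy(x, sites, multiplicities, charges, c_total)
         for x in (pyrope, majorite, all_mg)]
```

Exact fractions are used so that the equality tests in Eqs. 3 and 4 are not affected by floating-point rounding.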

A geometric visualization of solid solutions

The solid solution constraints given by Eqs. 2–4 are geometrically equivalent to a class of convex polytopes (the n-dimensional equivalent of a polyhedron). Every vertex of the polytope corresponds to an endmember of the solution with a different distribution of site-species.

The geometry of the polytope corresponding to a particular solution depends only on the number of sites, the number of site-species on each site, and the charge balance constraint.

Single-site solutions

Single-site solutions can be represented as simplexes, which are the n-dimensional equivalent of a triangle. A solution with two site-species can be represented by a line (a 1-simplex), a solution with three site-species by a triangle (a 2-simplex), and a solution with four site-species by a tetrahedron (a 3-simplex). Examples are shown in Fig. 1a, b.

Fig. 1

Polytopes corresponding to some 1-site solution models. a A 2-simplex, such as [Mg,Fe,Ca]\(_{3}\) Al\(_{2}\) Si\(_{3}\)O\(_{12}\) garnet. b A 3-simplex, such as [Mg,Fe,Ca,Mn]\(_{3}\) Al\(_{2}\) Si\(_{3}\)O\(_{12}\) garnet. c A subset of a 2-simplex, such as a one-site pyrope-majorite solution Mg\(_{3}\)[Mg,Al,Si]\(_{2}\) Si\(_{3}\)O\(_{12}\), where a charge-balance constraint restricts the valid site-occupancy space to a line. Grey lines mark the original 2-simplex without charge-balance constraints

If the species on the site have different charges, the charge balance constraint is geometrically equivalent to cutting through the simplex with a hyperplane. For example, the charge balance constraint in one-site pyrope-majorite (Mg\(_{3}\)[Mg,Al,Si]\(_{2}\) Si\(_{3}\)O\(_{12}\)) results in only two valid endmembers: pyrope (Mg\(_{3}\)[Al]\(_{2}\) Si\(_{3}\)O\(_{12}\)) and disordered majorite (Mg\(_{3}\)[Mg\(_{0.5}\)Si\(_{0.5}\)]\(_{2}\) Si\(_{3}\)O\(_{12}\)), because the formulae Mg\(_{3}\)[Mg]\(_{2}\) Si\(_{3}\)O\(_{12}^{2-}\) and Mg\(_{3}\)[Si]\(_{2}\) Si\(_{3}\)O\(_{12}^{2+}\) are not neutral species. Graphically, this constraint corresponds to a line bisecting the [Mg]–[Al]–[Si] triangle (Fig. 1c).

Multisite solutions

Multisite solutions are geometrically equivalent to the Cartesian product of the individual site-simplexes. This mathematical jargon encapsulates the idea that there is an endmember vertex for every possible combination of site-species. Endmember vertices are connected by an edge if they differ by only one site-species. In Fig. 2a, b, polytopes are drawn for multisite solutions [A,B,C][D,E] and [A,B][C,D][E,F]. These might represent, for example, [Ca,Fe,Mg][Fe,Mg]Si\(_{2}\)O\(_{6}\) clinopyroxene (Grover and Orville 1969) and [Cu,Ag]\(_{10}\)[Fe,Zn]\(_{2}\)[Sb,As]\(_{4}\)S\(_{13}\) fahlore (Sack 2017) respectively.

Fig. 2

Polytopes corresponding to some 2-site (a, c) and 3-site (b, d) solution models. a The Cartesian product of a 2-simplex and 1-simplex (a triangular prism), such as [Ca,Fe,Mg][Fe,Mg]Si\(_{2}\)O\(_{6}\)-clinopyroxene. b The Cartesian product of three 1-simplexes, e.g. [Cu,Ag]\(_{10}\)[Fe,Zn]\(_{2}\)[Sb,As]\(_{4}\)S\(_{13}\) fahlore (Sack 2017). c A subset of the polytope in (a), which corresponds to solutions including [Fe,Mg,Al][Al,Si]O\(_{3}\)-bridgmanite. Grey lines mark the original polytope without charge-balance constraints. d A subset of the polytope in (b), which corresponds to solutions including [Mg,Si][Mg,Si][Mg,Si]O\(_{9}\)

As with single site solutions, if charge balance is not automatically satisfied by the site constraints, the additional constraint is equivalent to cutting the polytope with a hyperplane. For example, the triangle outlined by black lines in Fig. 2c could represent a two-site bridgmanite with the general formula

$$\begin{aligned}{}[\mathrm{Fe}^{2+},\mathrm{Mg}^{2+},\mathrm{Al}^{3+}][\mathrm{Al}^{3+},\mathrm{Si}^{4+}]\mathrm{O}_{3}. \end{aligned}$$

Each vertex of this triangle corresponds to an ordered endmember in the solid solution: {[Mg][Si], [Fe][Si], [Al][Al]} (see Footnote 1). To give a more extreme, if somewhat contrived, example, the hexagon outlined by black lines in Fig. 2d could correspond to a 3-site MgO-SiO\(_{2}\) oxide with the general formula

$$\begin{aligned}{}[\mathrm{Mg}^{2+},\mathrm{Si}^{4+}][\mathrm{Mg}^{2+},\mathrm{Si}^{4+}][\mathrm{Mg}^{2+},\mathrm{Si}^{4+}]\mathrm{O}_{9}. \end{aligned}$$

In this example, none of the original vertices satisfy the charge-balance constraints. The six endmembers are all partially disordered, having site-species occupancies of the form [Mg\(_{0.5}\)Si\(_{0.5}\)][Mg][Si]O\(_{9}\). This solution is fictional; we present it here to show that site-occupancy spaces can take on a wide variety of shapes.

Some solution model formalisms such as the compound energy formalism (Hillert 2001) express the excess energy of a solid solution in terms involving all of its endmembers, whether or not they are neutrally charged. Others express the excess energy in terms of the energies of an independent set of endmembers (Helffrich and Wood 1989; Holland and Powell 2003), where an “independent” set comprises the minimum number of endmembers that can span the entire site-species occupancy space of the system. For example, for the halide solution [Na, K][Cl, F], any three endmembers would constitute an independent set; if [Na][F], [K][Cl] and [Na][Cl] were chosen as the independent set, the distribution of species in the endmember [K][F] could be obtained by the linear sum [Na][F] + [K][Cl] − [Na][Cl], and therefore any excess energy defined in terms of site occupancies can be written in terms of three endmembers. We shall return to the relationship between independent and dependent endmembers, and to different formalisms for the excess energies of mixing in the following sections.
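The dependence of [K][F] on the other three halide endmembers can be verified by adding site-occupancy vectors. This is a small illustrative sketch; the ordering of site-species ([Na, K] on the first site, [Cl, F] on the second) and the dictionary layout are assumptions made here for the example.

```python
# Site-species order: [Na, K] on the first site, [Cl, F] on the second.
endmembers = {
    "NaF":  [1, 0, 0, 1],
    "KCl":  [0, 1, 1, 0],
    "NaCl": [1, 0, 1, 0],
    "KF":   [0, 1, 0, 1],
}

# Proportions of the independent endmembers in the sum [Na][F] + [K][Cl] - [Na][Cl]
proportions = {"NaF": 1, "KCl": 1, "NaCl": -1}

# Add the occupancy vectors, weighted by the proportions
combined = [
    sum(p * endmembers[name][i] for name, p in proportions.items())
    for i in range(4)
]
```

The vector `combined` reproduces the site occupancies of [K][F], even though one of the proportions is negative.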

An algebraic description of solid solutions

The graphical representations of the previous section are a useful introduction to the site-species occupancy spaces of solution models, but it is also necessary to have a mathematical description that can be used in computations. In the following sections we shall outline just such a description, using vectors and matrices which are summarised in Table 1.

Table 1 Notation

First, we formalise the constraints in Eqs. 2–4 using set notation. The set \(\mathcal {X}\) of valid site-species occupancies \(\varvec{x}\) for any solution can be described using an \(n_\text {constraints} \times n_\text {site-species}\) matrix \(\varvec{P}\) and a vector \(\varvec{b}\) of length \(n_\text {constraints}\):

$$\begin{aligned} \mathcal {X} = \{\varvec{x}; \varvec{P}\varvec{x} = \varvec{b};\,\, x_i \ge 0 \} \end{aligned}$$
(5)

Each row of the matrix \(\varvec{P}\) and element of vector \(\varvec{b}\) corresponds to an independent equality constraint on the site populations.

We now present two examples of \(\varvec{P}\) and \(\varvec{b}\). First, consider the classic two site clinopyroxene in the CFMS system, given by the formula (Grover and Orville 1969):

$$\begin{aligned}{}[\mathrm{Ca,Fe,Mg}]^\mathrm{A}_1[\mathrm{Fe,Mg}]^\mathrm{B}_1\mathrm{Si}_{2}\mathrm{O}_{6}. \end{aligned}$$

Each of the two sites must be fully occupied, and the solution must be charge-balanced. These three equality constraints are tabulated in Table 2. Inspection of this table reveals that the charge balance constraint is a linear combination of the site constraints (2\(\cdot\)[row1] + 2\(\cdot\)[row2] = [row3]) and is therefore not independent of them.

Table 2 Equality constraints for a two-site CFMS clinopyroxene, tabulated as a matrix \(\varvec{P}\) and vector \(\varvec{b}\) (Eq. 5)

Our second example is a one-site pyrope-majorite garnet:

$$\begin{aligned} \mathrm{Mg}_{3} [\mathrm{Mg,Al,Si}]^\mathrm{Y}_2 \mathrm{Si}_{3}\mathrm{O}_{12} \end{aligned}$$

The equality constraints for this solution are given in Table 3. Unlike the clinopyroxene example, the charge balance constraint is not a linear sum of the site constraints, and so it represents an independent constraint on the system.

Table 3 Equality constraints for a one-site pyrope-majorite garnet (of multiplicity 2), tabulated as a matrix \(\varvec{P}\) and vector \(\varvec{b}\) (Eq. 5)
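The two examples can be assembled programmatically from a description of the sites. The sketch below is illustrative only: the helper `equality_constraints` is hypothetical, and the row order (site constraints first, charge balance last) and the unscaled charge-balance rows are assumptions that may differ from the layout of Tables 2 and 3.

```python
def equality_constraints(sites, c_total):
    """Build P and b (Eq. 5) from a list of sites.

    Each site is (multiplicity, {species_name: charge}).
    Rows: one full-occupancy constraint per site, then charge balance.
    """
    n = sum(len(species) for _, species in sites)
    P, b, charge_row = [], [], []
    start = 0
    for multiplicity, species in sites:
        # Eq. 3: occupancies on this site sum to one
        row = [0] * n
        for k in range(start, start + len(species)):
            row[k] = 1
        P.append(row)
        b.append(1)
        # Eq. 4 coefficients m_i * c_i for the species on this site
        charge_row += [multiplicity * c for c in species.values()]
        start += len(species)
    P.append(charge_row)
    b.append(c_total)
    return P, b

# Two-site CFMS clinopyroxene [Ca,Fe,Mg][Fe,Mg]Si2O6: the sites must carry +4
cpx_P, cpx_b = equality_constraints(
    [(1, {"Ca": 2, "Fe": 2, "Mg": 2}), (1, {"Fe": 2, "Mg": 2})], 4)

# One-site pyrope-majorite Mg3[Mg,Al,Si]2Si3O12: the Y site must carry +6
gt_P, gt_b = equality_constraints([(2, {"Mg": 2, "Al": 3, "Si": 4})], 6)

# In the clinopyroxene, charge balance equals 2*[row 1] + 2*[row 2]
combo = [2 * r1 + 2 * r2 for r1, r2 in zip(cpx_P[0], cpx_P[1])]
```

For the clinopyroxene, `combo` equals the charge-balance row, so that row adds no new information. For the garnet, the charge-balance row [4, 6, 8] is not a multiple of the single site row [1, 1, 1].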

Once the matrix \(\varvec{P}\) and vector \(\varvec{b}\) for a solid solution have been specified, standard algorithms can be used to determine the characteristics of the solution polytope which we represented graphically in “A geometric visualization of solid solutions”. Specifically, we can:

  • Determine the number of independent endmembers.

  • List the site-occupancies of all the endmembers.

  • Determine an independent set of endmembers.

  • Create a set of equality constraints (\(\varvec{P}\) and \(\varvec{b}\)) from an independent set of endmembers.

Determining the number of independent endmembers of a solution

Firstly, the number of independent endmembers can be obtained from the number of rows and columns of \(\varvec{P}\). For a linear system with a certain number of linearly-independent constraints (\(n_\text {constraints}\)) and unknowns (\(n_\text {unknowns}\)), there are \(n_\text {unknowns}-n_\text {constraints}\) degrees of freedom:

$$\begin{aligned} n_\text {dof} = n_\text {unknowns} - n_\text {constraints}. \end{aligned}$$
(6)

In this specific case, \(n_\text {unknowns}\) is the number of site-species, and \(n_\text {constraints}\) is the number of sites, plus one if the charge-balance constraint is linearly independent from the site constraints. Each degree of freedom can be considered to correspond to the proportions of an independent endmember. The total proportion of independent endmembers must sum to one, so there is one fewer degree of freedom than the number of independent endmembers. We can then write the following expression by substitution into Eq. 6:

$$\begin{aligned} n_\text {ind-mbrs} - 1 = n_\text {site-species} - (n_\text {sites} + c), \end{aligned}$$
(7)

where \(c=1\) if the charge-balance constraint is linearly independent from the site constraints, and \(c=0\) otherwise. Therefore, the CFMS clinopyroxene example (Table 2) has four independent endmembers (\(n_\text {site-species}=5\), \(n_\text {sites}=2\), \(c=0\)), and the one-site pyrope-majorite has two independent endmembers (\(n_\text {site-species}=3\), \(n_\text {sites}=1\), \(c=1\)).
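Because \(n_\text {sites} + c\) is the number of linearly independent rows of \(\varvec{P}\), Eq. 7 can be evaluated from the rank of \(\varvec{P}\) without deciding by inspection whether the charge-balance row is redundant. The following is a sketch under that reading; the functions `rank` and `n_independent_endmembers` are hypothetical helpers, and the matrices are the assumed forms of \(\varvec{P}\) for the two examples.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix by Gaussian elimination in exact arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0  # number of pivot rows found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate this column from all rows below the pivot row
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * p for a, p in zip(rows[i], rows[r])]
        r += 1
    return r

def n_independent_endmembers(P):
    # Eq. 7 rearranged: n_ind-mbrs = n_site-species - rank(P) + 1
    return len(P[0]) - rank(P) + 1

cpx_P = [[1, 1, 1, 0, 0], [0, 0, 0, 1, 1], [2, 2, 2, 2, 2]]  # CFMS clinopyroxene
gt_P = [[1, 1, 1], [4, 6, 8]]                                # pyrope-majorite

n_cpx = n_independent_endmembers(cpx_P)
n_gt = n_independent_endmembers(gt_P)
```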

Listing endmember site-occupancies

The site-occupancies of all the endmembers of a solution which does not have an independent charge-balance constraint (\(c=0\)) can be tabulated by iterating through all the possible combinations of site-species. The resultant list is known mathematically as a Cartesian product. The total number of endmembers in such a solution is given by:

$$\begin{aligned} n_\text {mbrs} = \prod _{S=1}^{n_\text {sites}} n_{\text {species,}S} \end{aligned}$$
(8)

where \(n_{\text {species,}S}\) is the number of species on Site S. For example, [Ca, Fe, Mg][Fe, Mg]Si\(_{2}\)O\(_{6}\) clinopyroxene (Table 2) has \(3 \cdot 2 = 6\) endmembers, which can be tabulated in an endmember site-occupancy matrix \(\varvec{E}\) (Table 4).

Table 4 The endmember site-occupancy matrix \(\varvec{E}\) for a two-site CFMS clinopyroxene
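For solutions with \(c=0\), the enumeration described above maps directly onto `itertools.product` from the Python standard library. The sketch below is illustrative; the function `ordered_endmembers` is hypothetical, and the row order it produces need not match the order used in Table 4.

```python
from itertools import product

def ordered_endmembers(sites):
    """All combinations of one species per site (valid when c = 0)."""
    # Column offset of the first site-species on each site
    offsets, n = [], 0
    for species in sites:
        offsets.append(n)
        n += len(species)
    names, E = [], []
    for combo in product(*[range(len(species)) for species in sites]):
        row = [0] * n
        for offset, k in zip(offsets, combo):
            row[offset + k] = 1  # the chosen species fully occupies its site
        E.append(row)
        names.append("".join("[%s]" % sites[s][k] for s, k in enumerate(combo)))
    return names, E

# Columns: [Ca_A, Fe_A, Mg_A, Fe_B, Mg_B]
names, E = ordered_endmembers([["Ca", "Fe", "Mg"], ["Fe", "Mg"]])
```

The number of rows in `E` is the product in Eq. 8, here \(3 \cdot 2 = 6\).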

In solutions with charge-balance constraints, listing all of the valid endmembers is complicated by the fact that the charge-balance constraint can create disordered endmembers. For example, the one site pyrope-majorite (Table 3) has two valid endmembers, one of which is a disordered majorite, with both Mg and Si residing on the same site (Table 5).

Table 5 The endmember site-occupancy matrix \(\varvec{E}\) for a one-site pyrope-majorite garnet

In mathematical parlance, finding all the possible endmembers of a solution (determining \(\varvec{E}\)) is known as vertex enumeration, and involves not only counting the vertices, but also determining their positions in space (Matheiss and Rubin 1980; Avis and Fukuda 1992; Lasserre 2004). In the Python programs accompanying this paper, we generate matrices \(\varvec{E}\) from \(\varvec{P}\) and \(\varvec{b}\) (e.g. Tables 2, 3) using the package pycddlib (Fukuda and Prodon 1995). This package implements the double description method of Motzkin et al. (1953). This method takes as input a set of inequality constraints (equalities are represented as two inequality constraints), and uses these to compute all of the vertices of the polytope which is bounded by those inequalities. Mathematically-inclined readers are referred to the original paper for more information.

The solution polytope implementation in burnman (Appendix A) includes a function which generates a polytope from charge balance equalities. This function can be called in a single line, and generates an object with attributes such as the matrix \(\varvec{E}\).
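For small solutions, the result of vertex enumeration can be reproduced without any external package. The sketch below is not the double description method used by pycddlib, nor the burnman function. It is a brute-force alternative, assumed here for illustration, which relies on the fact that the vertices of \(\{\varvec{x}; \varvec{P}\varvec{x} = \varvec{b}; x_i \ge 0\}\) are the feasible points whose non-zero entries correspond to linearly independent columns of \(\varvec{P}\). The function names are hypothetical, and the cost grows combinatorially with the number of site-species.

```python
from fractions import Fraction
from itertools import combinations

def solve_unique(A, b):
    """Return the unique solution of A y = b, or None if there is not exactly one."""
    rows = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    n = len(A[0])
    r = 0
    for col in range(n):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            return None  # dependent columns: no unique solution
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [v / rows[r][col] for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * p for a, p in zip(rows[i], rows[r])]
        r += 1
    if any(row[-1] != 0 for row in rows[r:]):
        return None  # inconsistent equations
    return [rows[i][-1] for i in range(n)]

def vertices(P, b):
    """Vertices of {x : P x = b, x >= 0}, found by trying every column subset."""
    n = len(P[0])
    found = set()
    for size in range(1, len(P) + 1):
        for cols in combinations(range(n), size):
            sub = [[row[c] for c in cols] for row in P]
            y = solve_unique(sub, b)
            if y is not None and all(v >= 0 for v in y):
                x = [Fraction(0)] * n
                for c, v in zip(cols, y):
                    x[c] = v
                found.add(tuple(x))  # the set removes duplicate vertices
    return sorted(found)

# One-site pyrope-majorite, columns [Mg, Al, Si] (assumed form of P and b)
garnet_vertices = vertices([[1, 1, 1], [4, 6, 8]], [1, 6])

# Two-site CFMS clinopyroxene, columns [Ca_A, Fe_A, Mg_A, Fe_B, Mg_B]
cpx_vertices = vertices(
    [[1, 1, 1, 0, 0], [0, 0, 0, 1, 1], [2, 2, 2, 2, 2]], [1, 1, 4])
```

For the garnet this returns pyrope and the disordered majorite endmember with Mg and Si each half-occupying the site; for the clinopyroxene it returns the six ordered endmembers.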

Finding an independent basis set of endmembers

Once all the endmembers of a solution have been obtained, a set of independent endmembers and their site-occupancies \(\varvec{E}^{\text {ind}}\) can be computed in several ways. One way is to compute the row echelon form of \(\varvec{E}\). Row reduction (also known as Gaussian elimination) reduces a matrix to upper triangular (echelon) form through a combination of row swapping, scalar multiplication of rows, and addition/subtraction of two rows. It is a standard technique to solve systems of linear equations. The endmembers corresponding to the non-zero rows of the echelon form of the matrix constitute an independent set. An implementation of this technique is included in the Python programs provided with this paper (Appendix A). The independent endmember set determined in this way depends on the initial ordering of the rows in \(\varvec{E}\) and the exact algorithm chosen for the row reduction (see Supplementary Information).
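One simple variant of this idea scans the rows of \(\varvec{E}\) in order and keeps each row that cannot be written as a combination of the rows already kept. The sketch below is an illustration of that variant and is not the implementation in Appendix A; the function `independent_rows` and the row ordering of the clinopyroxene matrix are assumptions.

```python
from fractions import Fraction

def independent_rows(E):
    """Indices of a linearly independent subset of the rows of E.

    Rows are scanned in order, so the result depends on the row ordering.
    """
    basis, chosen = [], []  # basis holds (pivot column, reduced row) pairs
    for index, row in enumerate(E):
        v = [Fraction(x) for x in row]
        # Remove the components of v along the rows kept so far
        for pivot_col, reduced in basis:
            if v[pivot_col] != 0:
                factor = v[pivot_col]
                v = [a - factor * c for a, c in zip(v, reduced)]
        lead = next((j for j, a in enumerate(v) if a != 0), None)
        if lead is not None:  # a non-zero remainder means the row is independent
            v = [a / v[lead] for a in v]
            basis.append((lead, v))
            chosen.append(index)
    return chosen

# Columns [Ca_A, Fe_A, Mg_A, Fe_B, Mg_B]
E = [
    [1, 0, 0, 0, 1],  # di   [Ca][Mg]
    [1, 0, 0, 1, 0],  # hed  [Ca][Fe]
    [0, 0, 1, 0, 1],  # cen  [Mg][Mg]
    [0, 1, 0, 1, 0],  # cfs  [Fe][Fe]
    [0, 1, 0, 0, 1],  # ordered [Fe][Mg]
    [0, 0, 1, 1, 0],  # ordered [Mg][Fe]
]

forward = independent_rows(E)         # keeps di, hed, cen, cfs
backward = independent_rows(E[::-1])  # keeps a different, equally valid set
```

Both orderings yield four independent endmembers, but reversing the rows selects the two ordered endmembers together with cfs and hed, which illustrates the dependence on row ordering noted above.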

The relationship between independent endmember proportions \(\varvec{p}^{\text {ind}}\) and site-species occupancies \(\varvec{x}\) is given by the equation:

$$\begin{aligned} E^{\text {ind}}_{li}p^{\text {ind}}_l =x_i , \text { where } x_i \ge 0 \end{aligned}$$
(9)

In multisite solutions, valid site-species distributions can be obtained even if some elements of \(\varvec{p}^{\text {ind}}\) are negative.
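Eq. 9 is a matrix-vector product, and a short example shows how a negative proportion can still give valid occupancies. In this sketch the independent set (di, hed, cen, cfs) and the column order are assumptions chosen for illustration, and `site_occupancies` is a hypothetical helper.

```python
# Columns [Ca_A, Fe_A, Mg_A, Fe_B, Mg_B]
E_ind = [
    [1, 0, 0, 0, 1],  # di
    [1, 0, 0, 1, 0],  # hed
    [0, 0, 1, 0, 1],  # cen
    [0, 1, 0, 1, 0],  # cfs
]

def site_occupancies(p_ind, E_ind):
    """Eq. 9: x_i = sum over l of E_ind[l][i] * p_ind[l]."""
    return [sum(p * row[i] for p, row in zip(p_ind, E_ind))
            for i in range(len(E_ind[0]))]

# One mole of di, minus one mole of hed, plus one mole of cfs
p_ind = [1, -1, 0, 1]
x = site_occupancies(p_ind, E_ind)
```

The result is the ordered endmember with Fe on Site A and Mg on Site B: every occupancy is non-negative and the proportions sum to one, although the proportion of hedenbergite is negative.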

Creating a set of equality constraints from an independent set of endmembers

Creators of thermodynamic models often start from a preferred basis set of independent endmembers for certain solid solutions. If one wishes to list all of the endmembers that can be described using this independent basis, they must first convert the basis into a set of equality and inequality constraints as given in Eq. 5 and then run those constraints through a vertex-enumeration routine. To see how to do this, let us consider a two-site bridgmanite solution, defined by the independent endmembers given in Table 6.

Table 6 The endmember site-occupancy matrix \(\varvec{E}\) for a two-site FMAS bridgmanite

We are looking for a set of independent equality constraints which are satisfied by all possible instances of this solution. A partial set is provided by the right nullspace (also known as the kernel) of \(\varvec{E}^{\text {ind}}\). The right nullspace \(\mathcal {N}(\varvec{E}^{\text {ind}})\) corresponds to the set of changes in site-species occupancies which cannot be achieved by changing the amounts of independent endmembers; i.e. the site-species occupancies of the solution must satisfy the expression

$$\begin{aligned} \mathcal {N}(\varvec{E}^{\text {ind}}) = \{\varvec{x};\,\, \varvec{E}^{\text {ind}}\varvec{x} = \varvec{0}\} \end{aligned}$$
(10)

For the bridgmanite solution given in Table 6, two potential basis vectors for the right nullspace are [\(0, 0, -1, 1, 0\)] and [\(-1, -1, 0, 0, 1\)]. These two vectors define compositions such that there is an equal amount of Al on each site and that the total Si on the B site is equal to the sum of Mg and Fe on the A site. In practice, the construction of the nullspace can be achieved by row reduction of \(\varvec{E}^{\text {ind}}\) followed by back-substitution (see Supplementary Information for a worked example).
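The row reduction and back-substitution can be sketched in a few lines of exact arithmetic. This is an illustrative implementation, not the one in the Supplementary Information; the function `nullspace` is hypothetical, and the column order [Mg\(^\text {A}\), Fe\(^\text {A}\), Al\(^\text {A}\), Al\(^\text {B}\), Si\(^\text {B}\)] is an assumption inferred from the basis vectors quoted above.

```python
from fractions import Fraction

def nullspace(A):
    """Basis for {x : A x = 0}, from the reduced row echelon form of A."""
    rows = [[Fraction(v) for v in row] for row in A]
    n = len(rows[0])
    pivots = []
    r = 0
    for col in range(n):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot here: this column is a free variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [v / rows[r][col] for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * p for a, p in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        # Back-substitution: each pivot variable balances the free variable
        for row_index, pivot_col in enumerate(pivots):
            v[pivot_col] = -rows[row_index][free]
        basis.append(v)
    return basis

# Independent endmembers [Mg][Si], [Fe][Si], [Al][Al]
E_ind = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
]
basis = nullspace(E_ind)
```

With this column order the function returns the two basis vectors quoted in the text. Any other basis for the same nullspace would define the same constraints.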

An additional equality constraint fixes the total number of moles of the solution to be equal to one:

$$\begin{aligned} \sum x_i = n_\text {sites} \end{aligned}$$
(11)

As we did in “Site-species and solution constraints”, we can now tabulate these equality constraints. For the bridgmanite example, this construction leads to matrix \(\varvec{P}\) and \(\varvec{b}\) as given in Table 7.

Table 7 A matrix \(\varvec{P}\) and vector \(\varvec{b}\) for a two-site FMAS bridgmanite, as constructed from a set of independent endmembers

We now check that the equality constraints represented by this table are equivalent to those produced by the procedure in “Site-species and solution constraints”. The endmembers in Table 6 suggest a general formula [Mg,Fe,Al][Al,Si]O\(_{3}\), which generates Table 8. The rows of this table are just linear combinations of the rows of Table 7 (e.g. [row 1] in Table 7 is equal to ([row 3] − 4 [row 2] − 2 [row 1]) from Table 8), and therefore they represent the same equality constraints.

Table 8 Matrix \(\varvec{P}\) and vector \(\varvec{b}\) for a two-site FMAS bridgmanite
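The equivalence of the two sets of constraints can be checked numerically. In the sketch below, the nullspace rows, the row of ones from Eq. 11, the column order [Mg\(^\text {A}\), Fe\(^\text {A}\), Al\(^\text {A}\), Al\(^\text {B}\), Si\(^\text {B}\)] and the formula-derived rows are all assumed forms written out for illustration; the sign and order of rows in Tables 7 and 8 may differ.

```python
# Independent endmembers [Mg][Si], [Fe][Si], [Al][Al]
E_ind = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
]
nullspace_basis = [[0, 0, -1, 1, 0], [-1, -1, 0, 0, 1]]
n_sites = 2

# Nullspace rows have b = 0; Eq. 11 adds a row of ones with b = n_sites
P = nullspace_basis + [[1, 1, 1, 1, 1]]
b = [0, 0, n_sites]

def satisfies(P, b, x):
    """True if P x = b for the occupancy vector x."""
    return all(sum(p * xi for p, xi in zip(row, x)) == bi
               for row, bi in zip(P, b))

# Every independent endmember must satisfy the constructed constraints
checks = [satisfies(P, b, x) for x in E_ind]

# Constraints written directly from the formula [Mg,Fe,Al][Al,Si]O3:
# two site rows, then charge balance with a total charge of +6
P_formula = [[1, 1, 1, 0, 0], [0, 0, 0, 1, 1], [2, 2, 3, 3, 4]]
b_formula = [1, 1, 6]

# [row 3] - 4 [row 2] - 2 [row 1] of the formula-derived constraints
r1, r2, r3 = P_formula
combo = [c - 4 * s2 - 2 * s1 for s1, s2, c in zip(r1, r2, r3)]
combo_b = b_formula[2] - 4 * b_formula[1] - 2 * b_formula[0]
```

Here `combo` is [0, 0, 1, -1, 0] with a right-hand side of zero, which is the first nullspace row up to sign and therefore expresses the same constraint (equal Al on both sites).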

The composition space of a solution model

“Simple” solution models are defined as models where every point in site-occupancy space corresponds to a unique bulk composition. This is true of [Na\(^{+}\), K\(^{+}\)][Cl\(^{-}\), F\(^{-}\), I\(^{-}\)] halide, for example. Not all solution models are “simple”; in many models it is possible to redistribute site-species over sites without changing the bulk composition. In such solutions, which we call order-disorder solutions (also sometimes called “complex” solutions), the composition and site-occupancy spaces of the solution are not equivalent. For example, the following two instances of CFMS clinopyroxene have exactly the same bulk composition:

$$\begin{aligned}&{[\mathrm{Ca}_{0.6}\mathrm{Fe}_{0.1}\mathrm{Mg}_{0.3}][\mathrm{Fe}_{0.25}\mathrm{Mg}_{0.75}]\mathrm{Si}_{2}\mathrm{O}_{6}} \\&{[\mathrm{Ca}_{0.6}\mathrm{Fe}_{0.2}\mathrm{Mg}_{0.2}][\mathrm{Fe}_{0.15}\mathrm{Mg}_{0.85}]\mathrm{Si}_{2}\mathrm{O}_{6}} \end{aligned}$$
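That these two distributions share a bulk composition can be confirmed by summing each element over both sites. This is a minimal sketch with a hypothetical helper `bulk_composition`; both sites are assumed to have a multiplicity of one, as in the formula above.

```python
from fractions import Fraction as F

def bulk_composition(sites):
    """Sum element amounts over sites; each site is {element: occupancy}."""
    totals = {}
    for site in sites:
        for element, amount in site.items():
            totals[element] = totals.get(element, 0) + amount
    return totals

first = [
    {"Ca": F(6, 10), "Fe": F(1, 10), "Mg": F(3, 10)},  # Site A
    {"Fe": F(25, 100), "Mg": F(75, 100)},              # Site B
]
second = [
    {"Ca": F(6, 10), "Fe": F(2, 10), "Mg": F(2, 10)},  # Site A
    {"Fe": F(15, 100), "Mg": F(85, 100)},              # Site B
]

composition_first = bulk_composition(first)
composition_second = bulk_composition(second)
```

Both instances contain 0.6 Ca, 0.35 Fe and 1.05 Mg per formula unit; they differ only in how Fe and Mg are partitioned between the two sites.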

Site-species redistributions which satisfy the solution model constraints can be described by one or more isochemical reactions, bounded by the positivity constraints for each individual site-species (Eq. 2). A basis set of independent reactions can be found by first constructing the matrix of elemental compositions of each of the independent endmembers. We call this the stoichiometric matrix \(\varvec{S}^\text {ind}\). The nullspace of \(\varvec{S}^\text {ind}\) corresponds to the set of isochemical reactions. We define \(\varvec{R}^\text {ind}\) as any independent basis for this nullspace. The matrices \(\varvec{S}^\text {ind}\) and \(\varvec{R}^\text {ind}\) for CFMS clinopyroxene are given in Table 9.

Table 9 Composition and site-occupancy matrices for two-site CFMS clinopyroxene

The reaction vector \(R_\text {1}\) in Table 9 indicates that the bulk composition of the solid solution will not change if we substitute 2 moles of diopside and 1 mole of clinoferrosilite for 2 moles of hedenbergite and 1 mole of clinoenstatite. To see the redistribution of site-species implied by this reaction, we can take the dot product of \((\varvec{E}^\text {ind})^T\) with \(R_\text {1}\). The resulting vector \(R'_\text {1} = [0, -1, 1, 1, -1]\) indicates that the isochemical reaction is equivalent to the site-exchange reaction [Fe\(_{-1}\)Mg\(_{1}\)]\(^\text {A}\)[Fe\(_{1}\)Mg\(_{-1}\)]\(^\text {B}\).
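Both statements can be verified with two small matrix products. The sketch below assumes the endmember order [di, hed, cen, cfs], the element order [Ca, Fe, Mg, Si, O] and the site-species order [Ca\(^\text {A}\), Fe\(^\text {A}\), Mg\(^\text {A}\), Fe\(^\text {B}\), Mg\(^\text {B}\)]. The sign of \(R_\text {1}\) is chosen here so that it reproduces the \(R'_\text {1}\) quoted in the text, and the layout of Table 9 may differ.

```python
# Rows: di, hed, cen, cfs. Columns: [Ca_A, Fe_A, Mg_A, Fe_B, Mg_B]
E_ind = [
    [1, 0, 0, 0, 1],
    [1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
]

# Rows: di, hed, cen, cfs. Columns: [Ca, Fe, Mg, Si, O]
S_ind = [
    [1, 0, 1, 2, 6],  # CaMgSi2O6
    [1, 1, 0, 2, 6],  # CaFeSi2O6
    [0, 0, 2, 2, 6],  # Mg2Si2O6
    [0, 2, 0, 2, 6],  # Fe2Si2O6
]

# Remove 2 di and 1 cfs, add 2 hed and 1 cen
R_1 = [-2, 2, 1, -1]

def left_multiply(vector, matrix):
    """Row vector times matrix, i.e. the transpose of the matrix applied to the vector."""
    return [sum(v * row[j] for v, row in zip(vector, matrix))
            for j in range(len(matrix[0]))]

composition_change = left_multiply(R_1, S_ind)  # zero for an isochemical reaction
R_prime_1 = left_multiply(R_1, E_ind)           # implied change in site occupancies
```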

Many petrological studies choose to consider the compositional space of the solid solution, rather than the site-occupancy space, expressing the state of order via one or more order-parameters. Within the framework we have described, the compositional space can be obtained by projecting the site-species occupancy polytope onto a (hyper)plane perpendicular to the ordering vector(s). We illustrate this graphically for CFMS clinopyroxene in Fig. 3. The site-occupancy space of this solution is a triangular prism (Fig. 3a), and the isochemical ordering vector [Fe\(_{-1}\)Mg\(_{1}\)][Fe\(_{1}\)Mg\(_{-1}\)] is collinear with the cmf-cfm vector. Planes perpendicular to the ordering vector satisfy expressions of the form \(x_\text {FeB} + x_\text {MgA} - x_\text {FeA} - x_\text {MgB} = q_\text {1}\), where \(q_\text {1}\) is a scalar corresponding to the distance along the ordering vector. If we project the polytope onto any one of these planes, we obtain the classic “CFMS quadrilateral”, which defines the composition space of the solid solution (Fig. 3b).

Fig. 3

a The CFMS clinopyroxene polytope. There are six endmembers of this polytope: diopside (di), hedenbergite (hed), clinoenstatite (cen), clinoferrosilite (cfs) and two ordered endmembers (cfm and cmf). The purple ordering vector represents the isochemical exchange reaction Fe\(^\mathrm{B} +\) Mg\(^\mathrm{A} \longrightarrow\) Fe\(^\mathrm{A}\) + Mg\(^\mathrm{B}\). The blue plane is perpendicular to the ordering vector, and points on that plane satisfy the expression \(q_\text {1} = x_\text {FeB} + x_\text {MgA} - x_\text {FeA} - x_\text {MgB} = 0\). The dashed line represents where that plane intersects the bounding faces of the polytope. b Projection of the polytope onto a plane perpendicular to the ordering vector. The lighter triangle represents the bulk compositions for which the equation \(q_\text {1} = 0\) corresponds to a valid distribution of site-species. The contours represent the values of the ordering scalar \(q_\text {1}\) on the surface of maximum entropy, where (Mg:Fe)\(_\text {A}\) = (Mg:Fe)\(_\text {B}\)

Because the isochemical ordering vector \(R'_\text {1}\) is a linear combination of the independent endmembers di, hed, cen and cfs, it is possible to replace any one of those endmembers with the vector to make a new independent basis set (e.g. di, hed, cen, \(R'_\text {1}\)). If this is done, then one can describe any instance of the solid solution in terms of compositionally-independent endmember amounts and order parameter(s) (e.g., \(p_\text {di}\), \(p_\text {hed}\), \(p_\text {cen}\), \(q_\text {1}\)). The composition of the solution is determined by the endmember amounts, which must sum to one. However, the site occupancy space is only uniquely defined using all four parameters. For any bulk composition, the stable distribution of site-species is found by minimizing the Gibbs energy over all valid values of \(q_\text {1}\).

Within any order-disorder solution polytope, there is one surface of particular interest: the surface which maximises the configurational entropy \(\varvec{q}(\varvec{p})\). This surface is of interest for two reasons: it is continuous, differentiable and valid over all of composition space, and therefore represents a suitable starting guess for finding the stable configuration of site-species at fixed composition; and in the disordered (high-temperature) limit, this surface represents the stable configuration of site-species at all bulk compositions. In our clinopyroxene example, the maximum entropy surface corresponds to points within the solution space where the Fe:Mg ratio is equal on both sites. The values of \(q_\text {1}\) corresponding to the maximum entropy surface are shown as contours in Fig. 3b. This surface is given by the formula \(q_\text {1} = 2 n_\text {Ca} (n_\text {Fe} - n_\text {Mg})/((n_\text {Fe} + n_\text {Mg})(n_\text {Fe} + n_\text {Mg} + n_\text {Ca}))\), where \(n_i\) is the number of atoms of element i in the solution.
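The closed-form expression for \(q_\text {1}\) can be checked against a direct construction of the maximum entropy distribution. In this sketch the function names are hypothetical, the site-species order is [Ca\(^\text {A}\), Fe\(^\text {A}\), Mg\(^\text {A}\), Fe\(^\text {B}\), Mg\(^\text {B}\)], and the element amounts are assumed to be normalised to two cations per formula unit with all Ca on Site A.

```python
from fractions import Fraction as F

def q1(x):
    """Order parameter for occupancies [Ca_A, Fe_A, Mg_A, Fe_B, Mg_B]."""
    ca_a, fe_a, mg_a, fe_b, mg_b = x
    return fe_b + mg_a - fe_a - mg_b

def max_entropy_occupancies(n_ca, n_fe, n_mg):
    """Occupancies with equal Fe:Mg on both sites, for n_ca + n_fe + n_mg = 2."""
    f = n_fe / (n_fe + n_mg)  # Fe/(Fe + Mg), identical on both sites
    return [n_ca, (1 - n_ca) * f, (1 - n_ca) * (1 - f), f, 1 - f]

def q1_formula(n_ca, n_fe, n_mg):
    """Closed-form q1 on the maximum entropy surface, as given in the text."""
    return (2 * n_ca * (n_fe - n_mg)
            / ((n_fe + n_mg) * (n_fe + n_mg + n_ca)))

# Bulk composition of the clinopyroxene instances given earlier
n_ca, n_fe, n_mg = F(3, 5), F(7, 20), F(21, 20)

x_max = max_entropy_occupancies(n_ca, n_fe, n_mg)
q_direct = q1(x_max)
q_closed = q1_formula(n_ca, n_fe, n_mg)
```

For this bulk composition the maximum entropy distribution is [Ca\(_{0.6}\)Fe\(_{0.1}\)Mg\(_{0.3}\)][Fe\(_{0.25}\)Mg\(_{0.75}\)], the first of the two instances listed earlier, and both routes give \(q_\text {1} = -0.3\).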

An illustrative example of the energetics of order-disorder in Bragg–Williams-type solid solutions

In order-disorder solution models, the stable arrangement of site species at any given bulk composition is determined by minimizing the Gibbs energy with respect to the isochemical reaction vectors. Order-disorder solutions can be assigned to one of three groups: (a) those that are completely disordered at all temperatures, (b) those that become completely disordered at high temperatures (convergent ordering), and (c) those that approach, but do not reach, a state of complete disorder at high temperatures (non-convergent ordering) (Thompson and James 1969). We illustrate these cases using a two-site FMS clinopyroxene model [Fe, Mg][Fe, Mg]Si\(_{2}\)O\(_{6}\) where the Gibbs energy \(\mathcal {G}\) of mixing is represented as a regular solution (Wohl 1946; Wood and Nicholls 1978; Powell and Holland 1993) between the endmembers clinoenstatite (cen), clinoferrosilite (cfs) and ordered Fe-Mg clinopyroxene (cfm). This energy of mixing is split into a non-configurational part (\(\mathcal {G}^*\)) and a configurational part (\(T S_\text {conf}\)):

$$\begin{aligned} \mathcal {G}&= \mathcal {G}^* - T S_\text {conf} \text {, where} \end{aligned}$$
(12)
$$\begin{aligned} S_\text {conf}&= -R \left( x_\text {FeA} \ln x_\text {FeA} + x_\text {MgA} \ln x_\text {MgA} \right. \nonumber \\&\quad + \left. x_\text {FeB} \ln x_\text {FeB} + x_\text {MgB} \ln x_\text {MgB} \right) \end{aligned}$$
(13)

T is the temperature, \(S_\text {conf}\) is the configurational entropy and R is the gas constant. The expression for \(S_\text {conf}\) ignores any contribution from short-range ordering. \(\mathcal {G}^*\) depends on interaction energies between endmembers (\(W_{ij}\)) and the Gibbs energy required to form 1 mole of endmember cfm from a mechanical mixture of clinoenstatite and clinoferrosilite (\({0.5 \text {cen} + 0.5 \text {cfs} \longrightarrow \text {cfm}}; \mathcal {G}^{*}_\text {cfm}\)):

$$\begin{aligned} \mathcal {G}^* = \sum _{i<j} W_{ij} p_i p_j + p_\text {cfm} \mathcal {G}^*_\text {cfm} \end{aligned}$$
(14)

\(\mathcal {G}^*_\text {cfm}\) accounts for the volume and vibrational entropy change of the reaction, but does not account for the change in configurational entropy, which in this case is zero because the cen, cfs and cfm endmembers all have only one species occupying each site.
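Eqs. 12–14 can be evaluated and minimised with a few lines of code. The sketch below makes several assumptions that are not taken from the paper: cfm is taken to be [Fe]\(^\text {A}\)[Mg]\(^\text {B}\), so that \(p_\text {cen} = x_\text {MgA}\), \(p_\text {cfs} = x_\text {FeB}\) and \(p_\text {cfm} = x_\text {FeA} - x_\text {FeB}\); the interaction energies and \(\mathcal {G}^*_\text {cfm}\) are hypothetical values, not those of Table 10; and a coarse grid search stands in for a proper minimiser.

```python
import math

R = 8.31446  # gas constant, J/K/mol

def xlnx(x):
    """x ln x, taking the limit 0 ln 0 = 0."""
    return x * math.log(x) if x > 0 else 0.0

def s_conf(fe_a, fe_b):
    """Eq. 13 for [Fe,Mg][Fe,Mg]; the Mg occupancies follow from Eq. 3."""
    return -R * (xlnx(fe_a) + xlnx(1 - fe_a) + xlnx(fe_b) + xlnx(1 - fe_b))

def g_star(fe_a, fe_b, W, g_cfm):
    """Eq. 14, assuming cfm = [Fe]^A[Mg]^B (an assumption of this sketch)."""
    p = {"cen": 1 - fe_a, "cfs": fe_b, "cfm": fe_a - fe_b}
    interaction = sum(w * p[i] * p[j] for (i, j), w in W.items())
    return interaction + p["cfm"] * g_cfm

def gibbs(fe_a, fe_b, T, W, g_cfm):
    """Eq. 12: non-configurational energy minus T times configurational entropy."""
    return g_star(fe_a, fe_b, W, g_cfm) - T * s_conf(fe_a, fe_b)

def equilibrium_order(x_fe, T, W, g_cfm, steps=2001):
    """Grid search for x_FeA - x_FeB at a fixed bulk Fe/(Fe + Mg) of x_fe."""
    half_range = min(x_fe, 1 - x_fe)  # keeps both occupancies within [0, 1]
    best = None
    for k in range(steps):
        d = -half_range + 2 * half_range * k / (steps - 1)
        g = gibbs(x_fe + d, x_fe - d, T, W, g_cfm)
        if best is None or g < best[0]:
            best = (g, 2 * d)
    return best[1]

# Hypothetical parameters (J/mol), chosen only for illustration
W = {("cen", "cfs"): 2000.0, ("cen", "cfm"): 1000.0, ("cfs", "cfm"): 1000.0}
g_cfm = -3000.0

order_400 = equilibrium_order(0.5, 400.0, W, g_cfm)
order_1500 = equilibrium_order(0.5, 1500.0, W, g_cfm)
```

With these illustrative parameters the degree of order decreases with increasing temperature but stays non-zero, because the entropy term increasingly outweighs the energy gained by ordering.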

Parameters which result in disordered, convergent and non-convergent models are provided in Table 10, and the corresponding non-configurational Gibbs energy surfaces are shown in Fig. 4. In the disordered model (Fig. 4a), the ordered endmembers cfm and cmf have higher non-configurational Gibbs energies than their disordered counterparts. The symmetry of the non-configurational Gibbs energy surface and the configurational entropy about the line of complete disorder means that the solution remains perfectly disordered at all temperatures. Maintaining the symmetry about the line of complete disorder, but making \(\mathcal {G}^*_\text {cfm}\) negative, means that ordering is favoured at low temperatures. As temperature increases, the configurational entropy term in Eq. 12 causes progressive disordering; the dashed and dotted lines in Fig. 4b correspond to the equilibrium distributions of species at 400 and 500 K respectively. At high temperatures, the configurational entropy term dominates, resulting in complete disorder.

In reality, the two sites in clinopyroxene are not identical (Grover and Orville 1969; Holland et al. 2018), and therefore the excess energy is not symmetric about the line of disorder (Fig. 4c). The degree of disorder increases with increasing temperature, but partial ordering remains at all finite temperatures.

In the examples given here, the progress from order to disorder is monotonic with temperature at all bulk compositions. Furthermore, the value of the order vector [Fe\(_{-1}\)Mg\(_{1}\)][Fe\(_{1}\)Mg\(_{-1}\)] corresponding to the most stable arrangement of site-species always lies on the same side of the maximum entropy surface (the most stable configuration of species always lies in the upper left triangles in Fig. 4, regardless of bulk composition). Not all solutions behave in this way. For ordering reactions in which \(\mathcal {G}^*\) is a function of pressure and temperature, changes in conditions can potentially lead to inversions in the sign of the ordering vector (Redfern et al. 2000), and large non-ideal interaction energies can lead to islands of stability on both sides of the line of complete disorder/maximum entropy.

Table 10 Three mixing models for an FMS clinopyroxene, corresponding to those plotted in Fig. 4
Fig. 4

Excess non-configurational Gibbs energy (\(\mathcal {G}\)*, J/mol) and configurational entropy (\(S_\text {conf}\), J/K/mol) for a two-site, two-species order-disorder clinopyroxene solution ([Fe, Mg][Fe, Mg]) modelled with three different symmetric solution models. Dotted black lines represent lines of constant composition. The solid black line represents equal amounts of Fe and Mg on both sites. Dashed orange and dotted red lines represent the equilibrium distributions of site-species at 400 and 500 K for each of the three models. a A model where the ordered phases [Fe][Mg] and [Mg][Fe] are destabilised by a large positive energy of ordering. Both sites have identical properties. b A model where the ordered phases are stabilised at low temperatures. Both sites have identical properties, so there is symmetry in the energetics of mixing on either side of the line of complete disorder. c A model where the ordered phases are again stabilised at low temperatures, but the two sites are now distinct, with an increased energy associated with Mg occupying Site 1. As a result, species Mg will always favour Site 2 (ordering is non-convergent). d Configurational entropy for all cases. See main text for further description

Constructing thermodynamically consistent solution models

If a solid solution model has fewer independent endmembers than required by Eq. 7, then there is at least one constraint that is not related to charge-balance. Sometimes, these choices are made consciously, with the aim of improving computational efficiency by reducing the number of order parameters. However, reducing the number of endmembers also restricts the valid site-occupancy space of the solution, which in turn tends to artificially destabilise the solution. We advocate always using the full site-occupancy space.

As an illustration of how difficult it can be to assess the validity of order-disorder solution models without using the mathematical tools described here, we present an independent endmember basis set for clinoamphibole in the NCKFMASHTO system. This solution includes 18 site-species distributed over six sites (Green et al. 2016; Holland et al. 2018), with the general formula

$$\begin{aligned}&[\mathrm{v,Na,K}][\mathrm{Mg,Fe}]_{3}[\mathrm{Mg,Fe,Al,Fe}^{3+},\mathrm{Ti}]_{2} \\&[\mathrm{Ca,Mg,Fe,Na}]_{2}[\mathrm{Si,Al}]_{4}[\mathrm{OH,O}]_{2}\mathrm{Si}_{4}\mathrm{O}_{22} \end{aligned}$$

Using Eq. 7, a complete model in this system has 12 independent endmembers (Table 11, computed using the code accompanying this paper). The published model in this system (Green et al. 2016) includes only the first 11 endmembers in this set. Without using Eq. 7, it would be extremely difficult to know that the published solution model was incomplete. Even armed with the tools in this paper, it is difficult to quantify the effect that completing the basis set of endmembers has on the properties of the solution. Certainly, the smaller model fails to span the entirety of site-occupancy space; the full model has 436 dependent endmembers, while the published model has only 156. One difference is that in the subsystem

$$\begin{aligned} {[\mathrm{K,Na}][\mathrm{Mg,Fe}]_{3}[\mathrm{(Mg,Fe)}_{\frac{1+2x}{4}}\mathrm{Ti}_{\frac{3-2x}{4}}]_{2}[\mathrm{Na}]_{2} [\mathrm{Si}]_{4}[\mathrm{O}_{1-x}(\mathrm{OH})_{x}]_{2}\mathrm{Si}_{4}\mathrm{O}_{22}} \end{aligned}$$

the full model allows \(0 \le x \le 1\), whereas the partial model only allows \(x=0.5\). The partial model also excludes hydrogen-free, highly aluminous endmembers and hydrogen-free, ferric-iron bearing endmembers.

Table 11 Independent endmember site-occupancy matrix \(\varvec{E}^{\text {ind}}\) for a six-site clinoamphibole solution model including full order-disorder

We mention this clinoamphibole model because it is particularly complex, but it is hardly unique among published models. The KFMASHTO biotite model published in Tajčmanová et al. (2009) is another solution that contains one fewer independent endmember than the number suggested by Eq. 7. The general formula is

$$\begin{aligned} {\mathrm{K[Mg,Fe,Al,Fe}^{3+}][\mathrm{Mg,Fe,Ti}]_{2}[\mathrm{Al,Si}]_{2}[\mathrm{OH,O}]_{2}\mathrm{Si}_{2}\mathrm{O}_{10}} \end{aligned}$$

whose site occupancy space can be spanned by seven endmembers (Table 12).

Table 12 Independent endmember site-occupancy matrix \(\varvec{E}^{\text {ind}}\) for a four-site biotite solution model including full order-disorder

As with the clinoamphibole example, the published biotite model couples titanium substitution with deficits in hydrogen content. In the subsystem

$$\begin{aligned} {\mathrm{K[Mg,Fe}][\mathrm{(Mg,Fe)}_{\frac{1+2x}{4}}\mathrm{Ti}_{\frac{3-2x}{4}}]_{2}[\mathrm{Al}_{1-x}\mathrm{Si}_{x}]_{2} [\mathrm{O}]_{2}\mathrm{Si}_{2}\mathrm{O}_{10}} \end{aligned}$$

the full model allows \(0 \le x \le 1\), whereas the partial model only accepts \(x=0.5\). The partial model also excludes all 12 titanium-bearing endmembers in the subsystem

$$\begin{aligned} {\mathrm{K[Mg,Fe}][\mathrm{(Mg,Fe)}_{\frac{3-2(y-x)}{4}}\mathrm{Ti}_{\frac{1+2(y-x)}{4}}]_{2}[\mathrm{Al}_{1-y}\mathrm{Si}_{y}]_{2} [\mathrm{O}_{1-x}\mathrm{OH}_{x}]_{2}\mathrm{Si}_{2}\mathrm{O}_{10}}, \end{aligned}$$

and also the eight titanium-free endmembers in the subsystems

$$\begin{aligned} {\mathrm{K[Al,Fe}^{3+}][\mathrm{Mg,Fe}]_{2}[\mathrm{Si}]_{2}[\mathrm{O}]_{2}\mathrm{Si}_{2}\mathrm{O}_{10}}, \end{aligned}$$

and

$$\begin{aligned} {\mathrm{K[Mg,Fe}][\mathrm{Mg,Fe}]_{2}[\mathrm{Si}]_{2}[\mathrm{O}_{\frac{1}{2}}\mathrm{OH}_{\frac{1}{2}}]_{2}\mathrm{Si}_{2}\mathrm{O}_{10}}. \end{aligned}$$

Manipulation of solid solution models

Changing independent endmember bases

In the second part of this paper, we show how to convert endmember and interaction energies between sets (or bases) of independent endmembers. This conversion has two primary purposes:

  • We often wish to solve thermodynamic problems in restricted compositional spaces. For example, we might want to model the almandine-skiagite binary in garnets ([Fe\(^{2+}\)]\(_{3}\)[Fe\(^{3+}\), Al]\(_{2}\) Si\(_{3}\)O\(_{12}\)) (Woodland and O’Neill 1993) using a solution model where almandine ([Fe\(^{2+}\)]\(_{3}\)[Al]\(_{2}\) Si\(_{3}\)O\(_{12}\)), grossular ([Ca]\(_{3}\)[Al]\(_{2}\) Si\(_{3}\)O\(_{12}\)) and andradite ([Ca]\(_{3}\)[Fe\(^{3+}\)]\(_{2}\) Si\(_{3}\)O\(_{12}\)) are the independent endmembers.

  • We can use the same mathematics to convert interaction energies on the atomic scale (for example, the interaction between Fe\(^{2+}\) on one site and Fe\(^{3+}\) on another) into interactions between endmembers (for example, the interaction energy between almandine and andradite, see “Converting microscopic interactions into endmember interactions”).

Throughout this section, the matrix \(\varvec{A}\) is used to transform an independent endmember basis to a new basis. Each element \(A_{ij}\) corresponds to the number of moles of original endmember i contained within the new endmember j. The proportions of the original endmember set \(p_i\) in terms of the new endmember set \(p'_l\) are thus given by:

$$\begin{aligned} p_i = A_{il}p'_l \end{aligned}$$
(15)
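As a hedged illustration of Eq. 15, the sketch below builds \(\varvec{A}\) for the garnet example given above, with almandine, grossular and andradite as the original basis and almandine, grossular and skiagite as one possible new basis. The choice of new basis, the ordering of site-species in the occupancy vectors and the variable names are assumptions made for illustration only. Each column of \(\varvec{A}\) is obtained by solving for the combination of original endmembers that reproduces the site occupancies of a new endmember.

```python
import numpy as np

# Site-occupancy rows: [Fe2+ on X, Ca on X, Al on Y, Fe3+ on Y]
E_old = np.array([[1, 0, 1, 0],    # almandine
                  [0, 1, 1, 0],    # grossular
                  [0, 1, 0, 1]],   # andradite
                 dtype=float)
E_new = np.array([[1, 0, 1, 0],    # almandine
                  [0, 1, 1, 0],    # grossular
                  [1, 0, 0, 1]],   # skiagite
                 dtype=float)

# Solve E_old^T A = E_new^T, so that column j of A lists the moles of
# each original endmember contained in new endmember j.
A, *_ = np.linalg.lstsq(E_old.T, E_new.T, rcond=None)
A = np.round(A, 12) + 0.0  # tidy rounding noise and negative zeros

# Eq. 15: a garnet with 40 % almandine and 60 % skiagite
p_new = np.array([0.4, 0.0, 0.6])
p_old = A @ p_new
```

The third column of `A` expresses skiagite as almandine minus grossular plus andradite, so `p_old` contains a negative grossular proportion even though all site occupancies remain between zero and one.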

The following subsections outline the mathematics to convert endmember bases within the “modified van Laar” (Holland and Powell 2003) and “subregular” (Helffrich and Wood 1989) mixing model formulations. “Ideal” and “symmetric/regular” solution models can be viewed as special cases of these more general formalisms.

The modified van Laar model

The van Laar model (van Laar 1906) was reformulated by Holland and Powell (2003) for use as a generalised macroscopic asymmetric model. The general idea behind this formalism is that independent endmembers are associated with a parameter \(\alpha\) that skews non-ideal energies toward or away from the endmember. Greater values skew the energies increasingly toward the endmember.

The non-configurational Gibbs energy at any given distribution of site species is a function of the proportions of the independent set of endmembers \(p_i\):

$$\begin{aligned} G^*= &\, p_i G^{\text {mbr}}_i + \alpha _i p_i \alpha _j p_j W_{ij}^{\text {binary}}/f \end{aligned}$$
(16)
$$\begin{aligned} f= &\, \alpha _k p_k \end{aligned}$$
(17)

where \(G^{\text {mbr}}_i\) is the Gibbs energy of endmember i, \(\alpha _i\) is the van Laar (asymmetry) parameter for that endmember, the components of \(W_{ij}^{\text {binary}}\) are equal to \(2 W_{ij} / (\alpha _i + \alpha _j)\), and \(\varvec{W}\) is an upper triangular matrix containing the binary endmember interaction parameters.

Changing the set of independent endmembers for the asymmetric model has been described in Diener et al. (2007). Here, we provide an alternative derivation using Einstein summation convention. Repeated indices are summed over unless they appear on both sides of the equation.

To change the set of independent endmembers, first combine the endmember and interaction terms into a single matrix \(\varvec{W}^{\text {C}}\):

$$\begin{aligned} G^*= &\, \alpha _i p_i \alpha _j p_j W^{\text {C}}_{ij} / f \end{aligned}$$
(18)
$$\begin{aligned} W^{\text {C}}_{ij}= &\, W_{ij}^{\text {binary}} + ((G^{\text {mbr}}/\alpha )_i 1_j + 1_i (G^{\text {mbr}}/\alpha )_j)/2 \end{aligned}$$
(19)

This equation can be transformed into an expression involving the new endmember set by substituting p with \(p'\) (Eq. 15):

$$\begin{aligned} G^*= &\, \alpha _i A_{il}p'_l \alpha _j A_{jm}p'_m W_{ij}^{\text {C}}/f \end{aligned}$$
(20)
$$\begin{aligned} f= &\, \alpha _k A_{kn}p'_n \end{aligned}$$
(21)

We now seek to express this equation in the same form as Eq. 16. First, define new asymmetry parameters \(\varvec{\alpha }'\) and matrices \(\varvec{B}\) and \(\varvec{C}\):

$$\begin{aligned} \alpha '_l= &\, \alpha _i A_{il} \end{aligned}$$
(22)
$$\begin{aligned} B_{il} \alpha '_l= &\, C_{il} = \alpha _i A_{il} \end{aligned}$$
(23)

such that

$$\begin{aligned} B_{il} = \alpha _i A_{il} (1/\alpha ')_l \end{aligned}$$
(24)

Substituting these expressions into Eqs. 20 and 21 yields

$$\begin{aligned} G^*= &\, B_{il} \alpha '_l p'_l B_{jm} \alpha '_m p'_m W_{ij}^{\text {C}}/f' \end{aligned}$$
(25)
$$\begin{aligned} f'= &\, \alpha '_n p'_n \end{aligned}$$
(26)

The interaction matrix can now be transformed to yield an expression in the form of Eq. 16:

$$\begin{aligned} G^*= &\, \alpha '_l p'_l \alpha '_m p'_m W'^{\text {C}}_{lm}/{f'} \end{aligned}$$
(27)
$$\begin{aligned} W'^{\text {C}}_{lm}= &\, B_{il} B_{jm} W^{\text {C}}_{ij} \end{aligned}$$
(28)

The transformed endmember properties are now removed from \(\varvec{W}'^{\text {C}}\). This is the reverse of the operation in Eq. 19:

$$\begin{aligned} G'^{\text {mbr}}_l= &\, \alpha '_l W'^{\text {C}}_{ll} \end{aligned}$$
(29)
$$\begin{aligned} D_{lm}= &\, W'^{\text {C}}_{lm} - \left( (G'^{\text {mbr}}/\alpha ')_l 1_m + 1_l (G'^{\text {mbr}}/\alpha ')_m\right) /2 \end{aligned}$$
(30)

Finally, the binary interaction matrix can be converted back to upper triangular form by summing the upper and lower triangular components of \(\varvec{D}\):

$$\begin{aligned} W'^{\text {binary}}_{lm} = {\left\{ \begin{array}{ll} D_{lm} + D_{ml},&{} \text {if } l < m\\ 0, &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
(31)

where the components of \(W'^{\text {binary}}_{lm}\) are equal to \(2 W'_{lm} / (\alpha '_l + \alpha '_m)\).
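The steps in Eqs. 19 to 31 can be sketched in a few lines of numpy. This is a minimal illustration written for this section rather than the implementation accompanying the paper; the function names are hypothetical, \(\varvec{W}\) is assumed to be upper triangular with a zero diagonal, and all transformed asymmetry parameters \(\alpha '_l\) are assumed to be non-zero.

```python
import numpy as np


def gibbs_van_laar(p, G, alpha, W):
    """Non-configurational Gibbs energy, Eqs. 16 and 17."""
    Wb = 2.0 * W / (alpha[:, None] + alpha[None, :])  # W^binary
    ap = alpha * p
    return p @ G + ap @ Wb @ ap / (alpha @ p)


def transform_van_laar(A, G, alpha, W):
    """Change endmember basis. A[i, l] is the number of moles of
    original endmember i in new endmember l (Eq. 15)."""
    Wb = 2.0 * W / (alpha[:, None] + alpha[None, :])
    g = G / alpha
    WC = Wb + 0.5 * (g[:, None] + g[None, :])        # Eq. 19
    alpha_new = alpha @ A                            # Eq. 22
    B = alpha[:, None] * A / alpha_new[None, :]      # Eq. 24
    WC_new = B.T @ WC @ B                            # Eq. 28
    G_new = alpha_new * np.diag(WC_new)              # Eq. 29
    g_new = G_new / alpha_new
    D = WC_new - 0.5 * (g_new[:, None] + g_new[None, :])  # Eq. 30
    Wb_new = np.triu(D + D.T, k=1)                   # Eq. 31
    # Undo the scaling W^binary = 2 W / (alpha_l + alpha_m)
    W_new = Wb_new * (alpha_new[:, None] + alpha_new[None, :]) / 2.0
    return G_new, alpha_new, W_new


# Arbitrary illustrative parameters for endmembers (AC, BC, BD)
G = np.array([1.0, 2.0, 3.0])
alpha = np.array([1.0, 1.5, 2.0])
W = np.array([[0.0, 4.0, 5.0],
              [0.0, 0.0, 6.0],
              [0.0, 0.0, 0.0]])
# New basis (AC, BC, AD), with AD = AC - BC + BD
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, -1.0],
              [0.0, 0.0, 1.0]])
G_new, alpha_new, W_new = transform_van_laar(A, G, alpha, W)

p_new = np.array([0.2, 0.5, 0.3])
g_old_basis = gibbs_van_laar(A @ p_new, G, alpha, W)
g_new_basis = gibbs_van_laar(p_new, G_new, alpha_new, W_new)
```

A useful consistency check is that `g_old_basis` and `g_new_basis` agree for any valid composition, because the transformation changes only the description of the solution and not its energy.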

The subregular model

The subregular model (Andersen and Lindsley 1981) is another popular form of asymmetric solution model. Instead of defining asymmetries with a small number of parameters assigned to the endmembers, the subregular model is simply an extension of the regular solution model to cubic terms in the independent endmembers.

In the subregular model, the non-configurational Gibbs energy includes unary \(\varvec{G}^{\text {mbr}}\), binary \(\varvec{W}^{\text {binary}}\) and ternary \(\varvec{W}^{\text {ternary}}\) contributions (Helffrich and Wood 1989):

$$\begin{aligned} G^* = p_i G^{\text {mbr}}_i + p_i p_j W_{ij}^{\text {binary}} (1 + p_j - p_i)/2 + p_i p_j p_k W_{ijk}^{\text {ternary}}, \end{aligned}$$
(32)

where \(W_{ij}^{\text {binary}}\) has no diagonal elements and \(W_{ijk}^{\text {ternary}}\) only has non-zero elements for \(i<j<k\). For a simple binary solution, the excess energies are asymmetric if and only if \(W_{12}^{\text {binary}}\) is not equal to \(W_{21}^{\text {binary}}\).

To convert Eq. 32 into an expression involving a new independent set of endmembers, we first combine the various parameters into a single term involving an interaction matrix \(\varvec{W}^{\text {C}}\):

$$\begin{aligned} G^* = p_i p_j p_k W^{\text {C}}_{ijk}. \end{aligned}$$
(33)

The conversion is non-unique; one way to construct \(W^{\text {C}}_{ijk}\) is as follows:

$$\begin{aligned} \begin{aligned} W^{\text {C}}_{ijk}&= \left( G^{\text {mbr}}_i 1_j 1_k + G^{\text {mbr}}_j 1_i 1_k + G^{\text {mbr}}_k 1_i 1_j\right) /3 \\&\quad + \left( 1_iW_{jk}^{\text {binary}} + W_{ij}^{\text {binary}} I_{jk} - W_{ij}^{\text {binary}} I_{ik}\right) /2 + W_{ijk}^{\text {ternary}}. \end{aligned} \end{aligned}$$
(34)

Equation 33 can be transformed into an expression involving the new endmember proportions using Eq. 15:

$$\begin{aligned} G^*= &\, p'_l p'_m p'_n W'^{\text {C}}_{lmn} \end{aligned}$$
(35)
$$\begin{aligned} W'^{\text {C}}_{lmn}= &\, A_{il}A_{jm}A_{kn}W_{ijk}^{\text {C}}. \end{aligned}$$
(36)

To convert this back into the form of Eq. 32, we first remove the transformed endmember excesses from \(\varvec{W}'^{\text {C}}\):

$$\begin{aligned}&G'^{\text {mbr}}_l = W'^{\text {C}}_{lll} \end{aligned}$$
(37)
$$\begin{aligned}&B_{lmn} = W'^{\text {C}}_{lmn} - \left( G'^{\text {mbr}}_l 1_m 1_n + 1_l G'^{\text {mbr}}_m 1_n + 1_l 1_m G'^{\text {mbr}}_n\right) /3. \end{aligned}$$
(38)

The transformed binary matrix \(\varvec{W}'^{\text {binary}}\) is then given by

$$\begin{aligned} W'^{\text {binary}}_{lm} = (B_{lmn} + B_{mln} + B_{mnl}) I_{mn}. \end{aligned}$$
(39)

Removing the contribution of \(\varvec{W}'^{\text {binary}}\) from \(\varvec{B}\) leaves us with matrix \(\varvec{C}\):

$$\begin{aligned} C_{lmn} = B_{lmn}\left( 1 - I_{lm} - I_{mn} - I_{ln}\right) - \frac{W'^{\text {binary}}_{lm} \left( 1_n - I_{ln} - I_{mn}\right) }{2}. \end{aligned}$$
(40)

The transformed ternary components \(\varvec{W}'^{\text {ternary}}\) can then be found by summing the six contributing terms in \(\varvec{C}\):

$$\begin{aligned} W'^{\text {ternary}}_{lmn} = {\left\{ \begin{array}{ll} C_{lmn} + C_{mnl} + C_{nlm} + C_{nml} + C_{mln} + C_{lnm},&{} \text {if } l< m < n \\ 0, &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
(41)

In multi-site subregular models, asymmetric and ternary terms are both contributors to asymmetry in the energetics of mixing. Moreover, changing the independent endmember set can result in the appearance of non-zero ternary terms where none were present in the original set of interactions. The simplest example where this is the case is a two site model [A, B][C, D]. Let us arbitrarily consider AC, BC and BD to be the original endmember set, and replace BD with AD in the new endmember set, such that:

$$\begin{aligned} \varvec{A} = \begin{pmatrix} 1 &{} 0 &{} 1 \\ 0 &{} 1 &{} -1 \\ 0 &{} 0 &{} 1 \end{pmatrix} \end{aligned}$$
(42)

Let us say that we have information only on the AC-BC and BC-BD binaries, and that the AC-BC binary appears to be nearly ideal, while the BC-BD binary appears moderately asymmetric. We make the reasonable assumption that the two sites do not interact, so that \(W^{\text {binary}}_\mathrm{AC,BD} = W^{\text {binary}}_\mathrm{BC,BD}\) and \(W^{\text {binary}}_\mathrm{BD,AC} = W^{\text {binary}}_\mathrm{BD,BC}\). We naively assume that we can ignore ternary terms, and parameterise the model as follows:

$$\begin{aligned} \varvec{G}^{\text {mbr}}= &\, \left( G_\text {AC}, G_\text {BC}, G_\text {BD} \right) ,\nonumber \\ \varvec{W}^{\text {binary}}= &\, \begin{pmatrix} 0 &{} 0 &{} 2 \\ 0 &{} 0 &{} 2 \\ 4 &{} 4 &{} 0 \end{pmatrix}, W^{\text {ternary}}_\mathrm{AC,BC,BD} = 0 \end{aligned}$$
(43)

Transforming into our new endmember set [AC, BC, AD] yields the following

$$\begin{aligned} \varvec{G}'^{\text {mbr}}= &\, \left( G_\text {AC}, G_\text {BC}, G_\text {AC} - G_\text {BC} + G_\text {BD} + 2 \right) \text {, } \end{aligned}$$
(44)
$$\begin{aligned} \varvec{W}'^{\text {binary}}= &\, \begin{pmatrix} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} -4 \\ 2 &{} 2 &{} 0 \end{pmatrix} \text {, } W'^{\text {ternary}}_\mathrm{AC,BC,AD} = 2 \end{aligned}$$
(45)
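The transformed parameters in Eqs. 44 and 45 can be reproduced with a short numpy sketch of Eqs. 34 to 41. This is an illustration written for this example, not the code accompanying the paper, and the function name is hypothetical. The matrix \(\varvec{A}\) is built following the convention of Eq. 15, so that column l lists the moles of each original endmember (AC, BC, BD) in new endmember l (AC, BC, AD = AC − BC + BD). The endmember energies are set to zero so that only the excess contributions appear.

```python
import itertools
import numpy as np


def transform_subregular(A, G, Wb, Wt):
    """Change endmember basis for the subregular model (Eqs. 34-41)."""
    n = len(G)
    one, I = np.ones(n), np.eye(n)
    # Eq. 34: fold unary, binary and ternary terms into one array
    WC = (np.einsum('i,j,k->ijk', G, one, one)
          + np.einsum('i,j,k->ijk', one, G, one)
          + np.einsum('i,j,k->ijk', one, one, G)) / 3.0
    WC += (np.einsum('i,jk->ijk', one, Wb)
           + np.einsum('ij,jk->ijk', Wb, I)
           - np.einsum('ij,ik->ijk', Wb, I)) / 2.0
    WC += Wt
    # Eq. 36: transform to the new basis
    WCn = np.einsum('il,jm,kn,ijk->lmn', A, A, A, WC)
    # Eq. 37: new endmember energies
    Gn = np.einsum('lll->l', WCn).copy()
    # Eq. 38: remove endmember contributions
    B = WCn - (Gn[:, None, None] + Gn[None, :, None]
               + Gn[None, None, :]) / 3.0
    # Eq. 39: new binary terms
    Wbn = (np.einsum('lmm->lm', B) + np.einsum('mlm->lm', B)
           + np.einsum('mml->lm', B))
    # Eq. 40: remove binary contributions
    off = 1.0 - I[:, :, None] - I[None, :, :] - I[:, None, :]
    C = B * off - 0.5 * Wbn[:, :, None] * (
        1.0 - I[:, None, :] - I[None, :, :])
    # Eq. 41: sum the six permutations for l < m < n
    S = sum(np.transpose(C, perm)
            for perm in itertools.permutations(range(3)))
    Wtn = np.zeros_like(S)
    for l, m, k in itertools.combinations(range(n), 3):
        Wtn[l, m, k] = S[l, m, k]
    return Gn, Wbn, Wtn


A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, -1.0],
              [0.0, 0.0, 1.0]])
G = np.zeros(3)                      # G_AC = G_BC = G_BD = 0
Wb = np.array([[0.0, 0.0, 2.0],      # Eq. 43
               [0.0, 0.0, 2.0],
               [4.0, 4.0, 0.0]])
Wt = np.zeros((3, 3, 3))
G_new, Wb_new, Wt_new = transform_subregular(A, G, Wb, Wt)
```

With these inputs, `G_new` carries the excess of 2 on the AD endmember, `Wb_new` matches the matrix in Eq. 45, and `Wt_new[0, 1, 2]` equals 2, in the same units as the input interaction parameters.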

The emergence of a non-zero ternary term in the transformed solution implies that there was no real justification for setting the ternary term equal to zero in the original formulation. Experimental and theoretical arguments for incorporating non-zero ternary terms have been discussed previously (Helffrich and Wood 1989; Cheng and Ganguly 1994), and our observation reinforces these arguments. However, to our knowledge, the implications of imposing a value of zero on ternary terms in multisite solid solutions have not been discussed. We can visualise the implications for our simple model by contouring the excess energy as a function of composition (Fig. 5). In the left hand panel of Fig. 5, we show the excess energy of mixing from the subregular model described above. As expected, the binary along the bottom (BC-AC) is ideal, and the binary along the left hand side (BC-BD) is asymmetric, with the energy skewed toward the BC endmember. However, the system as a whole is not symmetric. This might be surprising, given the assumptions we put into the model. The right hand panel of Fig. 5 illustrates the effect of increasing the ternary term \(W^{\text {ternary}}_\mathrm{AC,BC,BD}\) from 0 to 2 kJ/mol. This modification yields what we might have expected from the original model; mixing on the first site is ideal, while mixing on the second site is non-ideal and mildly asymmetric.

Fig. 5

Excess energy of mixing of the subregular solution models as described in “The subregular model”. Parameters for the left hand figure are as given in Eq. 43. The model parameters for the right hand figure are the same, except that the ternary term \(W^{\text {ternary}}_{AC,BC,BD}\) has been changed from 0 to 2 kJ/mol

For models such as this, it may not seem obvious how to choose appropriate ternary terms given the microscopically-motivated model assumptions. The solution to this problem is to express the Gibbs energy of mixing in terms of microscopic interactions, rather than interactions between endmembers. For the example above, we set the microscopic interaction parameters \(w^{\text {binary}}_{CD,\text {Site 2}} = 2\) kJ/mol and \(w^{\text {binary}}_{DC,\text {Site 2}} = 4\) kJ/mol, with all other interactions set to zero. The microscopic interaction parameters can then be converted into their macroscopic equivalents using the conversion procedure described in the following section.

Converting microscopic interactions into endmember interactions

The models presented so far are macroscopic models; they deal with the energetics of interaction between endmembers. Solution models can also be formulated in terms of atomic interactions (Sack and Ghiorso 1991, 1994; Pingfang et al. 1994). Microscopic models can be described by interaction matrices of the same form as their macroscopic counterparts, with the dimensions of the interaction matrices equal to the number of site-species, rather than the number of independent endmembers. The elements of the matrices correspond to the site-species interactions.

There are benefits to describing the properties of a solid solution in terms of microscopic interactions. Microscopic descriptions provide a much more direct link to the physics of the interactions (for example, order-disorder, Al-Al avoidance). They therefore neatly sidestep the problems encountered in the previous section. In addition, Powell et al. (2014) argue that well-constrained values of microscopic interactions in one mineral can be used as informed guesses for other minerals. For example, the Fe-Mg exchange interaction has a value of around 5 kJ per mole of sites in many silicate minerals. This concept, termed micro-\(\phi\) (Powell et al. 2014), may be useful for constructing complicated models where there is insufficient data to fully constrain all interactions.

Despite the benefits of parameterising solutions using microscopic models, it is often practical to convert these to macroscopic models (Powell and Holland 1993). Macroscopic models require fewer parameters, automatically maintain charge balance, and the set of independent endmembers can be chosen to improve computational efficiency. The linear relationship between site-species proportions and endmember proportions (Eq. 9) has the same form as the relationship between sets of independent endmembers (Eq. 15). Therefore, the conversion from a microscopic to a macroscopic formalism requires only the mathematics of “Changing independent endmember bases”. This is true for both the asymmetric and subregular models, although previous work on micro-\(\phi\) focused only on symmetric interactions (Powell et al. 2014).

There are a couple of substitutions that need to be made to the mathematics of “Changing independent endmember bases” to convert microscopic interactions to macroscopic interactions. First, the untransformed endmember proportions \(\varvec{p}\) in Eq. 15 must be replaced by site-species proportions \(\varvec{x}\), and the matrix \(\varvec{A}\) must be replaced by \(\varvec{E}^{\text {ind}T}\). The endmember energies \(\varvec{G}'^{\text {mbr}}\) and (macroscopic) interaction matrices calculated using the asymmetric formalism equations of “The modified van Laar model” must be normalised to one mole of endmembers by multiplying them by \(n_{\text {sites}}\). Finally, the subregular equations require only that the \(1_i W^{\text {binary}}_{jk}\) term in Eq. 34 be divided by \(n_\text {sites}\).

The following subsections describe how to populate the binary and ternary terms in the microscopic interaction matrices. For worked examples, the reader is referred to the microphi package (see Appendix A).

Binary interaction parameters

The symmetric, modified van Laar and subregular formalisms all involve parameters describing the interactions between pairs of site-species. Two types of interactions are considered: simple mixing on a single site (e.g. Mg\(^{2+}\) and Al\(^{3+}\) on Site X), and two-site combinations of species (e.g. Mg\(^{2+}\) on Site X and Ca\(^{2+}\) on Site Y). Each entry in the binary interaction matrix corresponds to a Gibbs energy of formation of a cluster of sites from their constituents:

$$\begin{aligned} {[\mathrm{A}]^\mathrm{X} + [\mathrm{C}]^\mathrm{Y} \longrightarrow [\mathrm{A}]^\mathrm{X}[\mathrm{C}]^\mathrm{Y}}. \end{aligned}$$
(46)

For single-site mixing, the reaction can be rewritten as

$$\begin{aligned} {[\mathrm{A}]^\mathrm{X} + [\mathrm{B}]^\mathrm{X} \longrightarrow 2[\mathrm{A}_{0.5}\mathrm{B}_{0.5}]^\mathrm{X}}. \end{aligned}$$
(47)

Two-site (XY) energies (\(\epsilon _{[A]^X[C]^Y}\)) cannot be uniquely determined from experimental analyses. Instead, the energies corresponding to cross-site reactions are used to populate the matrix. For example, the reaction

$$\begin{aligned} {[\mathrm{A}]^\mathrm{X}[\mathrm{C}]^\mathrm{Y} + [\mathrm{B}]^\mathrm{X}[\mathrm{D}]^\mathrm{Y}\longrightarrow [\mathrm{B}]^\mathrm{X}[\mathrm{C}]^\mathrm{Y} + [\mathrm{A}]^\mathrm{X}[\mathrm{D}]^\mathrm{Y}} \end{aligned}$$
(48)

is associated with the cross-site interaction energy \(w_\mathrm{ACBD,XY}\), which is a function of the two-site energies (e.g. \(\epsilon _{[\mathrm{A}]^\mathrm{X}[\mathrm{C}]^\mathrm{Y}}\), abbreviated as \(\epsilon _\mathrm{AC}\)):

$$\begin{aligned} w_\mathrm{ACBD,XY} = (\epsilon _\mathrm{BC} + \epsilon _\mathrm{AD}) - (\epsilon _\mathrm{AC} + \epsilon _\mathrm{BD}). \end{aligned}$$
(49)

In each XY block of the microscopic interaction matrix, one component can be set to zero. For the example above, let us arbitrarily choose that element to be [B]\(^\text {X}\)[D]\(^\text {Y}\). All the remaining elements (such as [A]\(^\text {X}\)[C]\(^\text {Y}\)) of that block are filled with values corresponding to reaction with the excluded component \(-w_\mathrm{ACBD,XY}\). Note that interaction energies involving repeated site-species (e.g. \(w_\mathrm{BCBD,XY}\) and \(w_\mathrm{ADBD,XY}\)) are also set equal to zero, because the corresponding reactions have the same products and reactants (Powell and Holland 1993). A fully-worked and corrected symmetric example corresponding to the model discussed in Powell et al. (2014) is provided in the python package accompanying this paper (Appendix A), along with notes in Appendix B. A subregular example corresponding to the model described at the end of “The subregular model” (and in Fig. 5) is also provided.
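The filling rule for an XY block can be sketched as follows. This is a minimal illustration and is distinct from the worked examples in the package accompanying this paper; the function name `fill_xy_block`, the array layout and the numerical two-site energies are all hypothetical. Rows index the species on Site X, columns index the species on Site Y, and each entry is set to \(-w\) from Eq. 49 for the reaction involving the excluded component.

```python
import numpy as np


def fill_xy_block(eps, special):
    """Return one XY block of the microscopic binary matrix.

    eps[a, c] is the two-site energy of the cluster [a]^X[c]^Y and
    special = (b, d) indexes the component that is set to zero.
    Entry (a, c) is -w = (eps_ac + eps_bd) - (eps_bc + eps_ad).
    """
    b, d = special
    return eps + eps[b, d] - eps[b, :][None, :] - eps[:, d][:, None]


# Hypothetical energies: rows (A, B) on Site X, columns (C, D) on Site Y
eps = np.array([[1.0, 4.0],
                [2.5, 3.0]])
block = fill_xy_block(eps, special=(1, 1))   # excluded component [B][D]

# Adding arbitrary single-site contributions leaves the block unchanged
u = np.array([7.0, -2.0])    # hypothetical Site X terms
v = np.array([0.5, 11.0])    # hypothetical Site Y terms
block_shifted = fill_xy_block(eps + u[:, None] + v[None, :], (1, 1))
```

Entries that share a site-species with the excluded component vanish automatically, and the block depends only on the cross-site reaction energies, which is consistent with the statement that the individual two-site energies cannot be uniquely determined.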

Ternary interactions

The subregular model has ternary interaction parameters in addition to binary parameters. To construct a microscopic ternary matrix \(\varvec{w}^{\text {ternary}}\), we consider each component of the matrix to correspond to the energy of formation of a cluster of sites \(\epsilon _{[\mathrm{A}]^\mathrm{X}[\mathrm{C}]^\mathrm{Y}[\mathrm{E}]^\mathrm{Z}}\) from their constituents:

$$\begin{aligned} {[\mathrm{A}]^\mathrm{X} + [\mathrm{C}]^\mathrm{Y} + [\mathrm{E}]^\mathrm{Z} \longrightarrow [\mathrm{A}]^\mathrm{X}[\mathrm{C}]^\mathrm{Y}[\mathrm{E}]^\mathrm{Z}}. \end{aligned}$$
(50)

Mixing of species on the same site can be rewritten:

$$\begin{aligned} {[\mathrm{A}]^\mathrm{X} + [\mathrm{C}]^\mathrm{X} + [\mathrm{E}]^\mathrm{X} \longrightarrow 3[\mathrm{A}_{\frac{1}{3}}\mathrm{C}_{\frac{1}{3}}\mathrm{E}_{\frac{1}{3}}]^\mathrm{X}}. \end{aligned}$$
(51)

When mixing three site-species on two distinct sites, we have

$$\begin{aligned} {[\mathrm{A}]^\mathrm{X} + [\mathrm{C}]^\mathrm{X} + 2[\mathrm{E}]^\mathrm{Y} \longrightarrow 2[\mathrm{A}_{\frac{1}{2}}\mathrm{C}_{\frac{1}{2}}]^\mathrm{X}[\mathrm{E}]^\mathrm{Y}}. \end{aligned}$$
(52)

The energy of formation of two- and three-site complexes includes site-bonding terms that cannot be uniquely determined by experimental means. Similar to the case of binary two-site exchange, we can arbitrarily select a single “special” component in each XXY or XYZ block of the ternary matrix to be equal to zero (e.g. [A]\(^\text {X}\)[C]\(^\text {Y}\)[E]\(^\text {Z}\)), and populate the other components in that block using reactions involving the special component. Components are also set to zero if all but one of the corresponding site-species is the same as for the special component. Components sharing a single site-species with the special component are assigned values based on exchange reactions involving four clusters. For example, the component corresponding to the [B]\(^\text {X}\)[D]\(^\text {Y}\)[E]\(^\text {Z}\) cluster is assigned the negative of the following reaction energy:

$$\begin{aligned}{}[\mathrm{A}]^\mathrm{X}[\mathrm{C}]^\mathrm{Y}[\mathrm{E}]^\mathrm{Z} + [\mathrm{B}]^\mathrm{X}[\mathrm{D}]^\mathrm{Y}[\mathrm{E}]^\mathrm{Z} \longrightarrow [\mathrm{B}]^\mathrm{X}[\mathrm{C}]^\mathrm{Y}[\mathrm{E}]^\mathrm{Z} + [\mathrm{A}]^\mathrm{X}[\mathrm{D}]^\mathrm{Y}[\mathrm{E}]^\mathrm{Z}. \end{aligned}$$
(53)

Finally, components sharing no site-species with the special component are assigned values based on exchange reactions involving five clusters:

$$\begin{aligned} 2[\mathrm{A}]^\mathrm{X}[\mathrm{C}]^\mathrm{Y}[\mathrm{E}]^\mathrm{Z} + [\mathrm{B}]^\mathrm{X}[\mathrm{D}]^\mathrm{Y}[\mathrm{F}]^\mathrm{Z} \longrightarrow [\mathrm{B}]^\mathrm{X}[\mathrm{C}]^\mathrm{Y}[\mathrm{E}]^\mathrm{Z} + [\mathrm{A}]^\mathrm{X}[\mathrm{D}]^\mathrm{Y}[\mathrm{E}]^\mathrm{Z} + [\mathrm{A}]^\mathrm{X}[\mathrm{C}]^\mathrm{Y}[\mathrm{F}]^\mathrm{Z}. \end{aligned}$$
(54)

The number of non-zero (and independent) components in the XYZ blocks of the ternary matrix is therefore \(mno - m - n - o + 2\), where m, n and o are the number of potential site-species on the X, Y and Z sites. The XXY blocks have fewer independent endmembers (because [A\(_{0.5}\)B\(_{0.5}\)]\(^\text {X}\)[E]\(^\text {Y}\) is the same as [B\(_{0.5}\)A\(_{0.5}\)]\(^\text {X}\)[E]\(^\text {Y}\)), but are otherwise constructed in the same way.
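The count of non-zero components in an XYZ block can be checked by direct enumeration. The sketch below is illustrative only and the function name is hypothetical: it lists every cluster in a block, takes the first species on each site as the special component, and counts the clusters that differ from it on at least two sites.

```python
import itertools


def count_nonzero_xyz(m, n, o):
    """Count clusters in an XYZ block that are not set to zero.

    A component is zero if it is the special component or if it
    differs from the special component on only one site.
    """
    special = (0, 0, 0)
    count = 0
    for cluster in itertools.product(range(m), range(n), range(o)):
        n_different = sum(c != s for c, s in zip(cluster, special))
        if n_different >= 2:
            count += 1
    return count


def formula(m, n, o):
    """Closed-form count given in the text."""
    return m * n * o - m - n - o + 2


sizes = [(2, 2, 2), (2, 3, 4), (3, 3, 3)]
counts = [count_nonzero_xyz(*s) for s in sizes]
```

For two species on each of three sites, for example, four of the eight clusters carry independent, non-zero values.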

Discussion

A rationalization for using asymmetric interaction terms in solid solution models

At the microscopic scale, asymmetry in the energetics of mixing in solid solutions implies that the excess bonding energy associated with dissolution of a small amount of endmember B into endmember A is lower than the excess bonding energy associated with dissolution of a small amount of A into B. Such energetic contributions are inherently local in nature, and local interactions are not explicitly treated by Bragg–Williams-type models.

So, to what extent can asymmetric interaction parameters approximate energetic effects due to local interactions? While short-range effects cannot be modelled effectively by Bragg–Williams models, we know that they can approximate long-range order by splitting the site on which ordering takes place (“An illustrative example of the energetics of order-disorder in Bragg–Williams-type solid solutions”). It turns out that a solution model that has more than two identical sites and symmetric interaction parameters can be reduced to a subregular model with a single site at high temperatures.

Consider as an example [Mg,Ca]\(_{3}\)Al\(_{2}\)Si\(_{3}\)O\(_{12}\) garnet, a solution that is believed to exhibit larger excess energies of mixing at Ca-rich compositions (Ganguly et al. 1996; White et al. 2014). If we split the Mg,Ca site into three sites, we can represent this solution with the general formula

$$\begin{aligned} {[\mathrm{Mg,Ca}]^\mathrm{A}[\mathrm{Mg,Ca}]^\mathrm{B}[\mathrm{Mg,Ca}]^\mathrm{C} \mathrm{Al}_{2} \mathrm{Si}_{3}\mathrm{O}_{12}}. \end{aligned}$$

This solution has eight distinct endmembers (the solution polytope is a cube). One could create an expression for the non-configurational Gibbs energy of mixing for this solution that only involves symmetric terms:

$$\begin{aligned} \begin{aligned} \mathcal {G}^* =&W_\text {Ca,Mg} (x_\text {CaA} + x_\text {CaB} + x_\text {CaC})(x_\text {MgA} + x_\text {MgB} + x_\text {MgC}) \\&+ \sum _{i\in S^A} \sum _{j\in S^B} \sum _{k\in S^C} W_\text {ijk} x_\text {i}x_\text {j}x_\text {k} \end{aligned} \end{aligned}$$
(55)

where \(S^X\) is the set of species {Ca, Mg} on site X. Making the assumption that all three sites have identical mixing properties means that \(W_\text {CaMgMg} = W_\text {MgCaMg} = W_\text {MgMgCa}\) and \(W_\text {MgCaCa} = W_\text {CaMgCa} = W_\text {CaCaMg}\). At high temperature, this solution will become completely disordered, such that the free energy can be described in terms of the bulk proportions of Ca and Mg:

$$\begin{aligned} \mathcal {G}^* = 9W_\text {Ca,Mg} x_\text {Ca} x_\text {Mg} + 3 W_\text {MgCaCa} x_\text {Mg}x_\text {Ca}^2 + 3 W_\text {MgMgCa} x_\text {Ca}x_\text {Mg}^2 \end{aligned}$$
(56)

Note that this parameterisation is equivalent to a subregular mixing model; the model is asymmetric when \(W_\text {MgCaCa}\) is not equal to \(W_\text {CaMgMg}\). Using a one-site asymmetric model is therefore as valid as using the three-site convergent order-disorder model, as long as we are only interested in phase relations at temperatures above the disappearance of long range order (“An illustrative example of the energetics of order-disorder in Bragg–Williams-type solid solutions”).
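The reduction from Eq. 55 to Eq. 56 can be verified numerically. The sketch below is illustrative, with hypothetical function names and arbitrary parameter values. It assumes that the three sites have identical mixing properties, so that \(W_{ijk}\) depends only on the number of Ca atoms in the cluster, and that the terms for the pure clusters (\(W_\text {MgMgMg}\) and \(W_\text {CaCaCa}\)) are zero, as implied by their absence from Eq. 56.

```python
import itertools


def g_three_site(x_ca_sites, W2, W_MgMgCa, W_MgCaCa):
    """Eq. 55 for Ca fractions (x_CaA, x_CaB, x_CaC) on the three sites."""
    x = [{'Ca': xc, 'Mg': 1.0 - xc} for xc in x_ca_sites]
    pair = W2 * sum(s['Ca'] for s in x) * sum(s['Mg'] for s in x)
    # Identical sites: W_ijk depends only on the number of Ca atoms.
    # Pure-cluster terms are assumed to be zero.
    W3_by_n_ca = {0: 0.0, 1: W_MgMgCa, 2: W_MgCaCa, 3: 0.0}
    triple = 0.0
    for i, j, k in itertools.product(('Ca', 'Mg'), repeat=3):
        n_ca = (i, j, k).count('Ca')
        triple += W3_by_n_ca[n_ca] * x[0][i] * x[1][j] * x[2][k]
    return pair + triple


def g_one_site(x_ca, W2, W_MgMgCa, W_MgCaCa):
    """Eq. 56: complete disorder, bulk proportions only."""
    x_mg = 1.0 - x_ca
    return (9.0 * W2 * x_ca * x_mg
            + 3.0 * W_MgCaCa * x_mg * x_ca ** 2
            + 3.0 * W_MgMgCa * x_ca * x_mg ** 2)


# Arbitrary illustrative parameters
params = dict(W2=1.0, W_MgMgCa=2.0, W_MgCaCa=5.0)
differences = [
    g_three_site((x, x, x), **params) - g_one_site(x, **params)
    for x in (0.1, 0.5, 0.8)
]
```

When all three sites have the same occupancy, the differences vanish to rounding error, whereas an ordered distribution with the same bulk composition generally gives a different energy from Eq. 56.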

Other solution model formalisms

Although we have restricted the discussion in this paper to the simplest of the asymmetric Bragg–Williams-type solution models, similar procedures can be applied to other formalisms. For example, the energetics of mixing in the compound energy formalism (Hillert 2001) can also be reformulated as a polynomial in site-occupancy space, and therefore expressed as a macroscopic model.