A derivation of the Grand Canonical Partition Function for systems with a finite number of binding sites using a Markov chain model for the dynamics of single molecules
Abstract
We use a Markov chain to model the ligand binding dynamics of a single molecule and show that its stationary distribution coincides with the laws of the Grand Canonical Ensemble. This way of deriving the equilibrium laws has the following advantages: Firstly, the derivation is short and does not require the knowledge of the Microcanonical, Canonical or Grand Canonical Ensemble. Secondly, it provides a descriptive interpretation of the factors that contribute to the probability of a microstate. In this regard, it also shows that the chemical activity, which cannot be regarded as a probability (since it is not necessarily bounded by one), can be interpreted as the ratio of two probabilities. Thirdly, our approach allows modeling how the system reaches equilibrium. This can be a useful tool for the study of nonequilibrium states.
Keywords
Decoupled sites representation · Ligand binding · Binding polynomial · Grand Canonical Partition Function · Binding energy · Markov chain · Binding dynamics
1 Introduction
We consider the following situation: A target molecule M has n binding sites for substance L. A certain amount of both substances is dissolved in a liquid, with a much higher concentration of ligand L than of M, and the number of free ligand molecules can be measured. Thus, the difference between the total number and the number of free ligand molecules allows us to determine the average number of ligand molecules L bound to a single target molecule M as a function of the concentration (or chemical activity) of L. Experiments of this kind are a classical procedure in chemistry and produce titration curves that characterize the overall binding of L to M. Titration curves for protons binding to amino acids can be found in nearly every biochemistry textbook and have been studied for 100 years [2, 4, 7, 8, 13]. The mathematical model for titration curves is based on the binding polynomial (bp). It is a function of the chemical activity of the ligand and is derived as a special case of the Grand Canonical Partition Function (GCPF), if molecule M is regarded as a system that can take up a finite number n of particles [1, 12, 14]. Its origin in statistical mechanics emphasizes that it characterizes stochastic properties of a system: It defines a family of distributions over the number of bound ligands, which is parameterized by the chemical activity of the ligand (the temperature is fixed). The titration curve, which is the result of the previously described experiments, is derived by applying the expectation operator to the parameterized family of distributions. However, the GCPF describes only the thermodynamic equilibrium, a steady state of a system consisting of a large number of molecules, in which every single molecule follows its own dynamics of releasing and binding ligands. Thus, it seems natural that another approach to deriving the well-known laws of equilibrium might be based on modeling the ligand binding dynamics of a single molecule.
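The relation between the family of distributions and the titration curve can be illustrated numerically. The following is a minimal sketch, assuming the standard weight \(\varLambda ^{|k|} \exp (-\beta G(k))\) for a microstate \(k\) with \(|k|\) bound ligands; the function name `titration_point` and the energy values are hypothetical and chosen only for illustration.

```python
import math

def titration_point(energies, activity, beta=1.0):
    """Return (binding polynomial, mean number of bound ligands) at one activity.

    energies maps each microstate k (a tuple of 0/1 entries) to its energy G(k).
    """
    weights = {k: activity ** sum(k) * math.exp(-beta * g) for k, g in energies.items()}
    z = sum(weights.values())  # binding polynomial evaluated at this activity
    mean_bound = sum(sum(k) * w for k, w in weights.items()) / z  # expectation of |k|
    return z, mean_bound

# Hypothetical energies G(k) for a molecule with n = 2 sites (illustrative values only).
energies = {(0, 0): 0.0, (1, 0): -1.0, (0, 1): -0.5, (1, 1): -1.2}

for activity in (0.01, 0.1, 1.0, 10.0, 100.0):
    z, mean_bound = titration_point(energies, activity)
    print(f"activity={activity:7.2f}  mean bound ligands={mean_bound:.3f}")
```

Evaluating the expectation over a range of activities traces out one titration curve; the curve rises from 0 towards n as the activity increases.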
In this work, we derive the GCPF for a system with a finite number of binding sites, starting from a model of the binding dynamics. We use a Markov chain model in discrete time and some reasonable assumptions about the binding dynamics of the molecule to deduce the transition probabilities. This approach facilitates the understanding of the equilibrium distribution, especially the composition of the probabilities of the microstates, and suggests how the chemical activity (which is not necessarily bounded by 1) can be interpreted from a stochastic point of view. Moreover, it also allows us to model how the system approaches equilibrium.
2 Binding dynamics of a single molecule as a Markov chain
[A1] The time between step \(m\) and \(m+1\) is so short that the binding state of only one site can change. Using the \(\ell _1\)-norm
$$\begin{aligned} |k|:=\sum \limits _{i=1}^n |k_i| \end{aligned}$$
this means \(|M_{1,m}-M_{1,m+1}| \le 1\), where, as usual, the difference of the tuples is understood componentwise.

[A2] For \(k,l \in K\) with \(|k-l| = 1\), the probability of a transition \(k \mapsto l\) is composed of three factors:

[A21] the random choice of a binding site that may change its binding state,

[A22] the probability that the environment provides a ligand molecule or takes it up (depending on the state of the chosen site) and

[A23] the probability barrier given by the difference of the energies of microstates \(k,l\) of the target molecule.

[A3] Since the concentration of L is much higher than that of M, we assume that the binding of the ligand to the individual target molecules occurs stochastically independently. This means that the molecules of type M do not interact, and a small reduction of the number of free ligand molecules, due to an uptake by molecules M, does not affect the probability of [A22].

[A21] Since this probability factor describes the choice of a site, there is no need to discriminate between the sites at this point. Consequently, we assume a uniform distribution which means the first factor equals \(\frac{1}{n}\).

[A22] If the chosen site is not occupied, the second factor is given by probability \(\theta _1 \ne 0\), which can be interpreted as the “availability” of the ligand. It incorporates the spatial availability, geometric orientation of the ligand to the binding site and how “costly” it is to decouple the ligand from its environment (e.g. the energy required to remove hydrogen bonds between the ligand and the solvent molecules). In the case of a chosen site being occupied, probability \(\theta _2\) characterizes the barrier of releasing the ligand molecule. In “most” cases \(\theta _2 \ne 0\) can be considered as being equal to \(1\). However, e.g. in supersaturated solutions or due to weak solubility of the ligand, the release of a ligand molecule might be of energetic disadvantage for the environment. Both factors \(\theta _1\) and \(\theta _2\) depend on the ligand concentration and describe the energetic state of the environment.
[A23] The third and last component \(p_{k,l}\) models the probability barrier given by the energy difference of the target molecule, when a ligand is released or taken up. In contrast to [A22], this factor is not assumed to depend on the environment, i.e. on the energy state of the solution. We will derive a suitable function that depends on the energy levels of the states \(k\) and \(l\): Let \(G(k), G(l)\) denote the energy levels of the states. We are looking for a function \(p_{k,l}:=p(G(k),G(l)) \longrightarrow [0,1]\) with \(p_{k,l}=1\) if \(G(l) \le G(k)\). This means if the energy level is the same, or is reduced by the transition, there will not be an energy barrier that impedes the transition (expressed as a probability). However, if energy is required, i.e. \(G(l) > G(k)\), then \(p_{k,l} < 1\). Since \(p_{k,l}\) is a probability, it can be represented by
$$\begin{aligned} p_{k,l}= \min \left( 1,f(G(l) - G(k))\right) \end{aligned}$$ (1)
for an appropriate nonnegative function \(f(x)\) which is different from the zero function and which depends only on the energy differences. Some properties of \(f\) are reasonable to assume:
$$\begin{aligned}&f(x+y)=f(x) f(y), \end{aligned}$$ (2)
$$\begin{aligned}&f(x) \in (0,1) \text{ if and only if } x \in (0,\infty ) \end{aligned}$$ (3)
The first property models that an additional energy barrier represents a second factor: The probability of overcoming a barrier \(x+y\) shall be equal to the probability of overcoming \(x\) and subsequently \(y\). This characteristic of function \(f\) is also required for consistency with possible extensions of this model by incorporating intermediate states. The existence of intermediate states leads to a splitting of the energy barriers. The second property expresses that only a transition that requires energy poses a probability barrier. Monotonicity is reasonable, too:
$$\begin{aligned}&f \text{ is monotone} \end{aligned}$$ (4)
Lemma 1
(a) A \(\beta \in {\mathbb {R}}^{+}\) exists such that \(f(x)=\exp (-\beta x)\).
(b) \(p_{k,l} < 1 \Longrightarrow p_{l,k}=1\).
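The two statements of Lemma 1 can be checked numerically for concrete values. The following is a minimal sketch, assuming the exponential form with a decaying exponent (as required by property (3)); the helper names `f` and `barrier_probability` and the chosen numbers are hypothetical.

```python
import math

def f(x, beta=1.0):
    """Candidate barrier function f(x) = exp(-beta * x) with beta > 0."""
    return math.exp(-beta * x)

def barrier_probability(g_from, g_to, beta=1.0):
    """p_{k,l} = min(1, f(G(l) - G(k))) for a transition from energy g_from to g_to."""
    return min(1.0, f(g_to - g_from, beta))

# Property (2): a barrier x + y costs as much as x followed by y.
x, y = 0.7, 1.9
print(math.isclose(f(x + y), f(x) * f(y)))

# Lemma 1(b): if the uphill direction is penalized, the downhill direction is not.
p_up = barrier_probability(0.0, 1.5)    # transition that requires energy
p_down = barrier_probability(1.5, 0.0)  # reverse transition, which releases energy
print(p_up, p_down)
```

Only one of the two directions between neighbouring microstates is ever penalized, which is what makes the ratio of forward and backward probabilities an exponential of the energy difference.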
3 The transition probabilities
Example 1
4 Aperiodicity, connectivity and detailed balance
We know that the Markov chain with these transition probabilities is aperiodic and connected. The first property can clearly be seen because the system can return to its initial state within one time step, which means it remains in this state, or in two time steps by going there and back. The latter property is also obvious since every state can be reached. Consequently, the Markov chain has a unique stationary distribution \(\pi \) to which the system’s distribution will converge and which we will characterize. If the matrix fulfills the detailed balance condition, we will be able to calculate the stationary distribution quickly, according to the procedure described in the following lemma.
Lemma 2

Choose a reference state \(k\), and define \( \pi _k = 1\).

Calculate the ratios \(\frac{\pi _i}{\pi _k}\) of all pairs \(\{i,k\}\) with \(q_{i,k}\ne 0\) by \(\frac{\pi _i}{\pi _k}= \frac{q_{k,i}}{q_{i,k}}\).

If \(q_{i,k} = 0\) choose any path \((i,\ldots ,k)\) with probability greater than zero and calculate the pairwise ratio.

Normalize the distribution.
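The four steps above can be sketched in code. This is a minimal illustration, assuming a transition matrix `q` given as a list of rows that already satisfies detailed balance; the function name `stationary_from_detailed_balance` and the three-state example chain are hypothetical and not taken from the text. Paths to states that are not direct neighbours of the reference state are found by a breadth-first search.

```python
from collections import deque

def stationary_from_detailed_balance(q, reference=0):
    """Compute the stationary distribution of a detailed-balanced chain q."""
    size = len(q)
    pi = [None] * size
    pi[reference] = 1.0  # step 1: unnormalized weight of the reference state
    queue = deque([reference])
    while queue:
        k = queue.popleft()
        for i in range(size):
            # steps 2 and 3: extend along any path with nonzero probability
            if pi[i] is None and q[i][k] > 0 and q[k][i] > 0:
                pi[i] = pi[k] * q[k][i] / q[i][k]
                queue.append(i)
    if any(p is None for p in pi):
        raise ValueError("not every state is reachable from the reference state")
    total = sum(pi)  # step 4: normalize
    return [p / total for p in pi]

# Hypothetical three-state chain in which only neighbouring states communicate.
q = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
pi = stationary_from_detailed_balance(q)
print(pi)
```

In this example states 0 and 2 do not communicate directly, so the weight of state 2 is obtained along the path through state 1.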
Proof
In other words, Lemma 2 states that for a given reference state, the ratios of the stationary probabilities are identical to the ratios of the expected fluxes between two states (pairwise) along any path. This statement is actually one direction of Kolmogorov’s criterion [5]. Even though it is not obvious that the matrix of Example 1 satisfies the detailed balance equation, we will use the procedure of Lemma 2 and show that the obtained distribution is stationary, for the special case of two binding sites.
Example 2
Lemma 3
Proof
Proposition 1
For every number of binding sites n, the matrix of transition probabilities defined by Eqs. (5)–(7) is detailed balanced with respect to its stationary distribution.
Proof
Remark 1
In our model, the probability of a transition from \(k\) to \(l\) with \(|k-l|=1\) is composed of a uniform proposal distribution on the states of the “neighborhood” and of an acceptance rate given by \(\theta _1 p_{k,l}\) or \(\theta _2 p_{k,l}\), depending on the state of the chosen site. Even though this structure resembles the Metropolis–Hastings algorithm [3, 10], our model does not coincide with this algorithm: The factor \(\theta _i\) is not part of the proposal distribution, since otherwise the proposal probabilities would not sum to one. Consequently, the acceptance probability differs from the one commonly used, since it is bounded by \(\theta _i\), and not by one.
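The proposal-and-acceptance structure described in Remark 1 can be made concrete for two binding sites. The following is a minimal sketch, assuming the transition probability is the product of the three factors [A21]–[A23] with \(f(x)=\exp (-\beta x)\), and that the remaining probability mass stays on the diagonal; it is an illustrative construction and not necessarily identical to Eqs. (5)–(7). The energies and the values of `theta1` and `theta2` are hypothetical. The sketch iterates the chain and compares the result with the weights \((\theta _1/\theta _2)^{|k|}\exp (-\beta G(k))\) that detailed balance yields for this construction.

```python
import math

def transition_matrix(energies, theta1, theta2, beta=1.0):
    """Build q[k][l] as (uniform site choice) * (environment factor) * (energy barrier)."""
    states = sorted(energies)
    n = len(states[0])
    index = {s: i for i, s in enumerate(states)}
    q = [[0.0] * len(states) for _ in states]
    for k in states:
        row = q[index[k]]
        for site in range(n):
            l = k[:site] + (1 - k[site],) + k[site + 1:]       # flip one site
            proposal = 1.0 / n                                  # [A21]
            environment = theta1 if k[site] == 0 else theta2    # [A22]
            barrier = min(1.0, math.exp(-beta * (energies[l] - energies[k])))  # [A23]
            row[index[l]] = proposal * environment * barrier
        row[index[k]] = 1.0 - sum(row)                          # probability of staying
    return states, q

def step(dist, q):
    """Advance a distribution over the states by one time step."""
    size = len(dist)
    return [sum(dist[i] * q[i][j] for i in range(size)) for j in range(size)]

# Hypothetical energies and environment probabilities (illustrative values only).
energies = {(0, 0): 0.0, (0, 1): -0.5, (1, 0): -1.0, (1, 1): -1.2}
theta1, theta2 = 0.3, 0.9
states, q = transition_matrix(energies, theta1, theta2)

dist = [1.0 if s == (0, 0) else 0.0 for s in states]  # start with all sites empty
for _ in range(500):
    dist = step(dist, q)

weights = [(theta1 / theta2) ** sum(s) * math.exp(-energies[s]) for s in states]
closed_form = [w / sum(weights) for w in weights]
for s, d, c in zip(states, dist, closed_form):
    print(s, round(d, 6), round(c, 6))
```

Because the acceptance factor is bounded by \(\theta _1\) or \(\theta _2\) rather than by one, the ratio \(\theta _1/\theta _2\) appears in the limiting weights once per bound ligand.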
5 The stationary distribution
Proposition 2
Proof
We know that the Markov chain fulfills the detailed balance condition. Using Lemma 2 with the reference state \(\{0\}^n\), we obtain Eq. (10). \(\square \)
Since we assumed the molecules to bind ligands independently [A3], the distribution of the states within the solution in equilibrium will be close to the stationary distribution of a single molecule, due to the law of large numbers, if the number of molecules is sufficiently large.
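This law-of-large-numbers argument can be illustrated by simulating many independent molecules. The following is a minimal sketch, assuming each step follows the three factors [A21]–[A23] with \(f(x)=\exp (-\beta x)\); the function name `simulate_molecule`, the energies, the values of `theta1` and `theta2`, the number of molecules and the number of steps are all hypothetical choices. The empirical frequencies are compared with the weights \((\theta _1/\theta _2)^{|k|}\exp (-\beta G(k))\) implied by detailed balance for this construction.

```python
import math
import random
from collections import Counter

def simulate_molecule(energies, theta1, theta2, steps, rng, beta=1.0):
    """Run one molecule for a number of steps, starting with all sites empty."""
    n = len(next(iter(energies)))
    state = (0,) * n
    for _ in range(steps):
        site = rng.randrange(n)                                  # [A21] uniform site choice
        flipped = state[:site] + (1 - state[site],) + state[site + 1:]
        environment = theta1 if state[site] == 0 else theta2     # [A22]
        barrier = min(1.0, math.exp(-beta * (energies[flipped] - energies[state])))  # [A23]
        if rng.random() < environment * barrier:
            state = flipped
    return state

# Hypothetical parameters (illustrative values only).
energies = {(0, 0): 0.0, (0, 1): -0.5, (1, 0): -1.0, (1, 1): -1.2}
theta1, theta2 = 0.3, 0.9
rng = random.Random(0)
molecules = 4000

counts = Counter(
    simulate_molecule(energies, theta1, theta2, 60, rng) for _ in range(molecules)
)
empirical = {s: counts[s] / molecules for s in energies}

weights = {s: (theta1 / theta2) ** sum(s) * math.exp(-energies[s]) for s in energies}
total = sum(weights.values())
expected = {s: w / total for s, w in weights.items()}

for s in sorted(energies):
    print(s, round(empirical[s], 3), round(expected[s], 3))
```

Increasing the number of molecules should shrink the deviation between the two columns, while increasing the number of steps removes the influence of the common starting state.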
6 Activation energies
7 Comparison to the Grand Canonical Partition Function
8 Decoupled sites
9 Summary and outlook
We presented a derivation of the Grand Canonical Partition Function for a system with a finite number of binding sites, from a model of stochastic ligand binding dynamics of a single molecule. Some assumptions about the process of ligand binding led to a Markov chain model with a matrix of transition probabilities that satisfies the detailed balance condition. The corresponding stationary distribution coincides with the Grand Canonical Partition Function if we identify the chemical activity \(\varLambda \) with a ratio of two probabilities. The model directly offers the possibility to investigate how the system approaches equilibrium.
Notes
Acknowledgments
We would like to thank Alexander Malinowski for helpful discussions.
References
 1. C.R. Cantor, P.R. Schimmel, Biophysical Chemistry. Part III. The Behavior of Biological Macromolecules, 1st edn. (W. H. Freeman, San Francisco, CA, 1980)
 2. K. Hasselbalch, Die Berechnung der Wasserstoffzahl des Blutes aus der freien und gebundenen Kohlensäure desselben, und die Sauerstoffbindung des Blutes als Funktion der Wasserstoffzahl (Julius Springer, 1916)
 3. W.K. Hastings, Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57(1), 97–109 (1970)
 4. L.J. Henderson, The Fitness of the Environment (Macmillan Company, New York, 1913)
 5. F.P. Kelly, Reversibility and Stochastic Networks (Cambridge University Press, Cambridge, MA, 2011)
 6. J.W.R. Martini, M. Schlather, G.M. Ullmann, The meaning of the decoupled sites representation in terms of statistical mechanics and stochastics. MATCH Commun. Math. Comput. Chem. 70(3), 829–850 (2013a)
 7. J.W.R. Martini, M. Schlather, G.M. Ullmann, On the interaction of different types of ligands binding to the same molecule part ii: systems with n to 2 and n to 3 binding sites. J. Math. Chem. 51(2), 696–714 (2013b)
 8. J.W.R. Martini, M. Schlather, G.M. Ullmann, On the interaction of two different types of ligands binding to the same molecule part i: basics and the transfer of the decoupled sites representation to systems with n and one binding sites. J. Math. Chem. 51(2), 672–695 (2013c)
 9. J.W.R. Martini, G.M. Ullmann, A mathematical view on the decoupled sites representation. J. Math. Biol. 66(3), 477–503 (2013)
 10. N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, E. Teller, Equation of state calculations by fast computing machines. J. Chem. Phys. 21(6), 1087–1092 (1953)
 11. A. Onufriev, D.A. Case, G.M. Ullmann, A novel view of pH titration in biomolecules. Biochemistry 40(12), 3413–3419 (2001)
 12. J.A. Schellman, Macromolecular binding. Biopolymers 14, 999–1018 (1975)
 13. C. Tanford, J.G. Kirkwood, Theory of protein titration curves. I. General equations for impenetrable spheres. J. Am. Chem. Soc. 79(20), 5333–5339 (1957)
 14. J. Wyman, S.J. Gill, Binding and Linkage: Functional Chemistry of Biological Macromolecules (University Science Books, Mill Valley, 1990)
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.