Evolution of Ideas on Entropy

Generalized Statistical Thermodynamics

Abstract

Our stated goal is to develop a general theory of thermodynamics that we may apply to stochastic processes, but what is thermodynamics? At the mathematical level, thermodynamics is the calculus of entropy. The inequality of the second law forms the starting point for a large number of mathematical relationships among the primary variables (entropy, energy, volume, and number of particles) and a number of functions defined from this primary set. The mathematical framework of classical thermodynamics is established as soon as the second law is formulated in mathematical form. The subsequent development of statistical mechanics left this framework intact, while making the connection to the microscopic structure of matter. Even as the basic mathematical framework did not change, the revolution led by Gibbs introduced a statistical view of entropy that opened an entirely new viewpoint. Starting with Shannon’s work on information theory and Jaynes’s formulation of maximum entropy, entropy has now escaped from the realm of physics to invade other fields. In this chapter we review the evolution of ideas on entropy, from the classical view to Gibbs, Shannon, and Jaynes, and cover the background that gives rise to our development in the chapters that follow.

Notes

  1. We write \(S(E, V, N_k)\) as a shorthand for \(S(E, V, N_1, \cdots)\). Summations over \(k\) are understood to go over the total number of components.

  2. The index \(a=1,2,\cdots, \mathcal N\) counts the subsystems into which the larger system is divided. The number \(\mathcal {N}\) of partitions may be arbitrarily large as long as each subsystem contains enough molecules to be treated as a continuum.

  3. Equations (1.4) and (1.5) taken together are equivalent to the condition that entropy is homogeneous in \(E\), \(V\), and \(N_k\) with degree 1.

  4. The entropy in Eq. (1.7) refers to an ensemble of microstates under fixed macroscopic state and is expressed in intensive form per microstate.

  5. Gibbs derives the variational properties of entropy in a series of theorems in Chap. XI of his book (Gibbs 1902).

  6. This derivation is discussed in more detail in Hill (1987). Hill calls this the generalized ensemble and uses it to obtain the familiar ensembles (microcanonical, canonical, and grand canonical) as special cases.

  7. A brief review of functionals and variational calculus is given in the appendix of Chap. 7.

  8. Shannon used the symbol \(H\) to denote the uncertainty of a probability distribution. We use \(S\) to indicate that we are discussing the same functional as in Gibbs’s treatment, though from a fairly different point of view.

  9. As quoted in Tribus and McIrvine (1971).

  10. By a priori we mean probabilities that are assigned before we even toss the die to study its outcomes, based on our total knowledge up to that point.

  11. This title suggests that entropy maximization is perhaps encoded in the human brain. This is hyperbole, but Jaynes was mischievously fond of making strong pronouncements. The manuscript, its review, and the author’s response in Jaynes’s inimitable style can be found at https://bayes.wustl.edu/etj/node1.html.

  12. See Kapur (1989) for a discussion of many common distributions, discrete and continuous, and how they can be obtained by the maximum entropy method under suitable constraints.

  13. Recall the following results from statistical mechanics:

    $$\displaystyle \begin{aligned} \bar E = -\left(\displaystyle\frac{\partial \log Q}{\partial \beta} \right)_{V,N_k},\quad \bar N_k = -\left(\displaystyle\frac{\partial \log \Xi}{\partial \gamma_k} \right)_{\beta,V,\gamma_{j\neq k}} , \end{aligned}$$

    where \(Q(\beta, V, N_k)\) is the canonical partition function, \(\Xi(\beta, V, \gamma_k)\) is the grand canonical partition function, \(\beta = 1/k_B T\), and \(\gamma_k = -\mu_k/k_B T\). Similar expressions can be written for \(\bar E\) in the grand canonical ensemble, \(\bar V\) in the \((E, p, N_k)\) ensemble, and so on.

  14. With \({\bar x} = \bar E\), \(\beta = 1/k_B T\), this reverts to \( S = \beta \bar E + \log Q\), a familiar result from thermodynamics.

  15. We use a bivariate function to demonstrate homogeneity and Euler’s theorem. The extension to any number of variables is straightforward.

  16. In general, if \(f\) is homogeneous in \(x\) with degree \(\nu\), its \(k\)th derivative with respect to \(x\) is homogeneous with degree \(\nu - k\). The proof is left as an exercise.

  17. In solution thermodynamics the name Gibbs-Duhem is specifically associated with Eq. (1.56), which is a special case of the above result.

  18. We are using the notation \(F^{(i, j, \cdots)}\) to indicate the Legendre transformation of \(f(x_1, x_2, \cdots)\) with respect to \(x_i, x_j, \cdots\).

  19. Boltzmann’s constant on the left-hand side of Eq. (1.49) gives entropy the dimensions of energy over temperature. It is a historical accident that temperature (and heat) was given its own units rather than the same units as energy, which would have set \(k_B = 1\) and would have made entropy dimensionless. While the dimensions of heat were eventually corrected to match those of energy, the same was never done for temperature.

Appendix: The Mathematical Calculus of Entropy

The mathematical relationships we recognize as thermodynamics are based on three key concepts: curvature (concavity/convexity), homogeneity, and Legendre transformations. These are briefly reviewed here and then applied to obtain certain key results in thermodynamics.

1.1.1 Curvature

A function \(f(x_1, x_2, \cdots)\) is concave with respect to \(x_1\) if

$$\displaystyle \begin{aligned} f(\lambda_1 x_1+\lambda^{\prime}_1x^{\prime}_1,x_2\cdots) \geq \lambda_1f(x_1,x_2\cdots) + \lambda^{\prime}_1 f(x^{\prime}_1,x_2\cdots) \end{aligned}$$

for all \(\lambda_1 \geq 0\), \(\lambda ^{\prime }_1=1-\lambda _1 \geq 0\), at fixed \(x_2, \cdots\). The second derivative of a twice-differentiable concave function is non-positive:

$$\displaystyle \begin{aligned} \frac{\partial^2 f}{\partial x_1^2} \leq 0 . \end{aligned} $$
(1.32)

For a convex function these inequalities are inverted. If \(f\) is concave with respect to several independent variables, the concave inequality applies to one variable at a time, keeping all other variables constant. A multivariate function may have different curvatures with respect to different variables. For example, if \(f(x_1, x_2)\) is concave with respect to \(x_1\) and convex with respect to \(x_2\), then

$$\displaystyle \begin{aligned} f(\lambda x_1+(1-\lambda) x_1^{\prime},x_2) \geq \lambda f(x_1,x_2)+(1-\lambda) f(x_1^{\prime},x_2) \end{aligned} $$
(1.33)

and

$$\displaystyle \begin{aligned} f(x_1,\lambda x_2+(1-\lambda) x^{\prime}_2) \leq \lambda f(x_1,x_2)+(1-\lambda) f(x_1,x^{\prime}_2) . \end{aligned} $$
(1.34)

It is an elementary property that f and (−f) have opposite curvatures: if f is concave, then (−f) is convex, and vice versa.
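The concave inequality and the sign of the second derivative can be checked numerically. The following is a minimal sketch, assuming the single-variable test function f(x) = ln x (chosen here for illustration, not taken from the text); the helper names `concave_gap` and `second_derivative` are hypothetical.

```python
import math

def concave_gap(f, x, x_prime, lam):
    """Return f(lam*x + (1-lam)*x') - [lam*f(x) + (1-lam)*f(x')].

    A non-negative gap for every lam in [0, 1] is the concave inequality;
    a non-positive gap is the convex inequality.
    """
    mix = lam * x + (1.0 - lam) * x_prime
    return f(mix) - (lam * f(x) + (1.0 - lam) * f(x_prime))

def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def f(x):
    """Assumed test function: ln x is concave for x > 0."""
    return math.log(x)

def neg_f(x):
    """(-f) has the opposite curvature, so it is convex."""
    return -math.log(x)

lams = [i / 10 for i in range(11)]
gaps_f = [concave_gap(f, 1.0, 4.0, lam) for lam in lams]
gaps_neg_f = [concave_gap(neg_f, 1.0, 4.0, lam) for lam in lams]

print(min(gaps_f) >= 0.0, max(gaps_neg_f) <= 0.0)
print(second_derivative(f, 2.0), second_derivative(neg_f, 2.0))
```

The gap vanishes at the end points of the interval and is strictly positive in between for the concave function; the signs flip for (−f).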

1.1.2 Homogeneity

A multivariate function \(f(x_1, x_2)\) (see Note 15) is homogeneous in \(x_1\) and \(x_2\) with degree \(\nu\) if

$$\displaystyle \begin{aligned} f(\lambda x_1,\lambda x_2) = \lambda^\nu f(x_1,x_2) \end{aligned} $$
(1.35)

for all \(\lambda > 0\). Differentiating both sides with respect to \(\lambda\) we obtain

$$\displaystyle \begin{aligned} x_1 \frac{\partial f}{\partial (\lambda x_1)} + x_2 \frac{\partial f}{\partial (\lambda x_2)} = \nu\lambda^{\nu-1} f, \end{aligned} $$
(1.36)

then setting λ = 1,

$$\displaystyle \begin{aligned} x_1 f_1 + x_2 f_2 = \nu f , \end{aligned} $$
(1.37)

where \(f_i\) is shorthand for the partial derivative with respect to \(x_i\). This is Euler’s theorem for homogeneous functions of degree \(\nu\). For degree \(\nu = 1\) we obtain

$$\displaystyle \begin{aligned} f = x_1 f_1 + x_2 f_2 . \end{aligned} $$
(1.38)

The derivatives \(f_1\) and \(f_2\) are homogeneous in \(x_1\) and \(x_2\) with degree 0 (see Note 16), i.e.,

$$\displaystyle \begin{aligned} f_i(\lambda x_1,\lambda x_2) = f_i(x_1,x_2). \end{aligned} $$
(1.39)

Taking the differential of \(f\) in Eq. (1.38) with respect to all \(x_i\) and \(f_i\) we have

$$\displaystyle \begin{aligned} d f = f_1 d x_1+ f_2 d x_2 + x_1 d f_1 + x_2 d f_2, \end{aligned} $$
(1.40)

and since \(df = f_1 dx_1 + f_2 dx_2\),

$$\displaystyle \begin{aligned} x_1 d f_1 + x_2 d f_2 = 0 . \end{aligned} $$
(1.41)

This is the Gibbs-Duhem equation associated with the Euler form in Eq. (1.38) (see Note 17). It expresses the fact that homogeneity, by virtue of being a special condition on \(f\), imposes the constraint that the variations of the partial derivatives are not all independent.

If \(f(x_1, \cdots, x_m)\) is homogeneous with degree 1 only with respect to \(x_1, \cdots, x_k\), then the Euler and Gibbs-Duhem equations apply to these variables,

$$\displaystyle \begin{aligned} f = \sum_{i=1}^k x_i f_i,\end{aligned} $$
(1.42)
$$\displaystyle \begin{aligned} \sum_{i=1}^k x_i d f_i = 0,\end{aligned} $$
(1.43)

with the understanding that the remaining variables \(x_{k+1}, \cdots, x_m\) are held constant.
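Homogeneity, Euler’s theorem, and the Gibbs-Duhem condition can be verified numerically for a specific function. The sketch below assumes the degree-1 test function f(x1, x2) = x1^a x2^(1−a) with a = 0.3, which is an illustrative choice and not part of the text; derivatives are estimated by finite differences, so the residuals are small but not exactly zero.

```python
def f(x1, x2, a=0.3):
    """Assumed degree-1 homogeneous test function: x1**a * x2**(1-a)."""
    return x1**a * x2**(1.0 - a)

def partials(x1, x2, h=1e-6):
    """Central finite-difference estimates of f_1 and f_2."""
    f1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2.0 * h)
    f2 = (f(x1, x2 + h) - f(x1, x2 - h)) / (2.0 * h)
    return f1, f2

x1, x2, lam = 2.0, 5.0, 3.0

# Homogeneity with degree nu = 1: f(lam*x1, lam*x2) = lam * f(x1, x2).
homogeneity_error = f(lam * x1, lam * x2) - lam * f(x1, x2)

# Euler's theorem for degree 1: f = x1*f_1 + x2*f_2.
f1, f2 = partials(x1, x2)
euler_error = x1 * f1 + x2 * f2 - f(x1, x2)

# The derivatives are homogeneous with degree 0.
g1, g2 = partials(lam * x1, lam * x2)
degree0_error = max(abs(g1 - f1), abs(g2 - f2))

# Gibbs-Duhem: x1*df_1 + x2*df_2 = 0 to first order in a small displacement.
dx1, dx2 = 1e-3, -2e-3
h1, h2 = partials(x1 + dx1, x2 + dx2)
gibbs_duhem_residual = x1 * (h1 - f1) + x2 * (h2 - f2)

print(homogeneity_error, euler_error, degree0_error, gibbs_duhem_residual)
```

The two terms of the Gibbs-Duhem sum are each of first order in the displacement and cancel, leaving a residual of second order.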

1.1.3 Legendre Transformations

Given a monotonic function \(f(x_1, x_2, \cdots)\) with partial derivatives \(f_i\), \(i = 1, 2, \cdots\), its Legendre transformation with respect to variable \(x_1\) is (see Note 18)

$$\displaystyle \begin{aligned} F^{(1)}(f_1,x_2) = f - x_1 f_1 . \end{aligned} $$
(1.44)

The Legendre transformation takes a function of \((x_1, x_2)\) and turns it into a function of \((f_1, x_2)\), where \(f_1\) is the derivative of the original function with respect to the transformed variable. The usefulness of the Legendre transformation may be appreciated better if we write the differentials of the original and of the transformed functions (the derivations are straightforward and can be found in the standard literature):

$$\displaystyle \begin{aligned} d f &= +f_1 d x_1 + f_2 d x_2 \\ d F^{(1)} &= -x_1 d f_1 + f_2 d x_2 . \end{aligned} $$

The Legendre transformation with respect to \(x_1\) changes the independent variable from \(x_1\) to \(f_1\), and the corresponding partial derivative from \(f_1\) to \(-x_1\), i.e.,

$$\displaystyle \begin{aligned} \left(\displaystyle\frac{\partial f}{\partial x_1} \right)_{x_2,\cdots} = f_1, \qquad \left(\displaystyle\frac{\partial F^{(1)}}{\partial f_1} \right)_{x_2,\cdots} = - x_1 .\end{aligned} $$
(1.45)

These derivatives will usually be written more simply as \(\partial F^{(1)}/\partial f_1\) and \(\partial f/\partial x_1\), with the understanding that they are taken with respect to the proper set of independent variables of each function. If we Legendre-transform \(F^{(1)}\) with respect to \(f_1\) we obtain the original function \(f\). This makes the Legendre transformation an involution: its inverse transformation is itself. The Legendre transformation can be applied to any subset of variables. For example, transforming \(f\) with respect to \(x_1\) and \(x_2\) we obtain

$$\displaystyle \begin{aligned} F^{(1,2)} = f - x_1 f_1 - x_2 f_2,\end{aligned} $$
(1.46)

whose differential is

$$\displaystyle \begin{aligned} d F^{(1,2)} = -x_1 d f_1 - x_2 d f_2 + f_3 d x_3 + \cdots . \end{aligned} $$
(1.47)

The Legendre transformation inverts curvature with respect to the transformed variable and preserves it for all untransformed variables. If \(f(x_1, x_2, \cdots)\) is concave in all \(x_i\), then \(F^{(1)}(f_1, x_2, \cdots)\) is convex in \(f_1\) and concave in \(x_2, x_3, \cdots\).
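The three properties of the Legendre transformation discussed above (the conjugate derivative is −x_1, the transformation is an involution, and curvature is inverted) can be illustrated with a single-variable example. This is a sketch under the assumption f(x) = ln x, for which the slope p = 1/x can be inverted in closed form; the function names are hypothetical.

```python
import math

def f(x):
    """Assumed concave test function: f(x) = ln x, x > 0."""
    return math.log(x)

def f_prime(x):
    """Derivative of f; plays the role of the new variable f_1."""
    return 1.0 / x

def legendre_of_f(p):
    """F(p) = f(x) - x*p evaluated at the x where f'(x) = p, i.e. x = 1/p."""
    x = 1.0 / p
    return f(x) - x * p

def legendre_prime(p, h=1e-6):
    """Central finite-difference estimate of dF/dp."""
    return (legendre_of_f(p + h) - legendre_of_f(p - h)) / (2.0 * h)

def second_derivative(g, t, h=1e-4):
    """Central finite-difference estimate of g''(t)."""
    return (g(t + h) - 2.0 * g(t) + g(t - h)) / (h * h)

x = 2.5
p = f_prime(x)

# The derivative of the transformed function is the negative of the old variable.
slope = legendre_prime(p)

# Involution: transforming F with respect to p recovers the original f.
recovered = legendre_of_f(p) - p * slope

# Curvature is inverted: f is concave in x, F is convex in p.
curv_f = second_derivative(f, x)
curv_F = second_derivative(legendre_of_f, p)

print(slope, recovered, f(x), curv_f, curv_F)
```

For this choice F(p) = −ln p − 1, so the slope is −1/p = −x and the second derivative 1/p² is positive, in line with the statements in the text.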

1.1.4 Thermodynamic Relationships

We assert that the function \(S = S(E, V, N_k)\), defined in the positive orthant of \((E, V, N_k)\) space, has the following properties: it is positive, concave, and homogeneous with degree 1 with respect to all of its independent variables. These conditions reproduce all relationships of classical thermodynamics, as we demonstrate below.

1.1.4.1 Second Law

We apply the concave inequality to entropy. With \(\lambda + \lambda' = 1\) (\(\lambda, \lambda' \geq 0\)), we have

$$\displaystyle \begin{aligned} S(\lambda E+\lambda' E',V,N_k) &\geq \lambda S(E,V,N_k) + \lambda' S(E',V,N_k) \\ &= S(\lambda E,\lambda V,\lambda N_k) + S(\lambda' E',\lambda' V,\lambda' N_k) . \end{aligned} $$

Setting

$$\displaystyle \begin{aligned} \hspace{60pt} &E_1 =\lambda E, &&V_1 =\lambda V, &&N_{k,1} =\lambda N_k,\\ &E_2 =\lambda' E', &&V_2 =\lambda' V, &&N_{k,2} =\lambda' N_k, \hspace{60pt}\end{aligned} $$

the concave inequality becomes

$$\displaystyle \begin{aligned} S(E_1+E_2, V_1+V_2, N_{k,1}+N_{k,2}) \geq S(E_1, V_1, N_{k,1}) + S(E_2, V_2, N_{k,2}) ,\end{aligned} $$
(1.48)

which is the inequality of the second law. It is always possible to obtain an infinite set of positive \(V_1\), \(V_2\), \(N_{k,1}\), \(N_{k,2}\) that satisfy these equations for any positive \(V\), \(N_k\), and any \(\lambda\), \(\lambda' = 1 - \lambda\), such that \(1 \geq \lambda \geq 0\). The inequality in Eq. (1.48) therefore applies for any positive \(E_1\), \(E_2\), \(V_1\), \(V_2\), \(N_{k,1}\), and \(N_{k,2}\).
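The inequality in Eq. (1.48) can be tested numerically on a model entropy. The sketch below assumes a one-component, ideal-gas-like form in units of k_B, S = N[ln(V/N) + 1.5 ln(E/N) + c]; this form and the value of c are illustrative assumptions, chosen only because the function is concave and homogeneous with degree 1 (its sign is not guaranteed over the sampled range).

```python
import math
import random

def entropy(E, V, N, c=5.0):
    """Assumed model entropy in units of k_B: N [ln(V/N) + 1.5 ln(E/N) + c]."""
    return N * (math.log(V / N) + 1.5 * math.log(E / N) + c)

def second_law_gap(state1, state2):
    """S(combined) - [S(1) + S(2)], which Eq. (1.48) requires to be >= 0."""
    combined = tuple(a + b for a, b in zip(state1, state2))
    return entropy(*combined) - (entropy(*state1) + entropy(*state2))

rng = random.Random(0)
gaps = []
for _ in range(1000):
    s1 = tuple(rng.uniform(0.1, 10.0) for _ in range(3))  # (E1, V1, N1)
    s2 = tuple(rng.uniform(0.1, 10.0) for _ in range(3))  # (E2, V2, N2)
    gaps.append(second_law_gap(s1, s2))

# Two subsystems in the same intensive state: the gap vanishes by homogeneity.
equal_gap = second_law_gap((2.0, 3.0, 1.0), (4.0, 6.0, 2.0))

print(min(gaps) >= -1e-9, abs(equal_gap) < 1e-9)
```

The gap is zero only when the two parts share the same E/N and V/N, that is, when combining them changes nothing; otherwise the combined system has the larger entropy.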

1.1.4.2 Entropy Equation

By Euler’s theorem (see Note 19),

$$\displaystyle \begin{aligned} \frac{S(E,V,N_k)}{k_B} &= \frac{1}{k_B}\left[ E \left(\displaystyle\frac{\partial S}{\partial E} \right)_{V,N_k} + V \left(\displaystyle\frac{\partial S}{\partial V} \right)_{E,N_k} + \sum_k N_k \left(\displaystyle\frac{\partial S}{\partial N_k} \right)_{E,V,N_{j\neq k}} \right] \\ &= \beta E + \alpha V + \sum_k \gamma_k N_k , \end{aligned} $$
(1.49)

where \(\beta\), \(\alpha\), and \(\gamma_k\) are the derivatives of \(S/k_B\) with respect to \(E\), \(V\), and \(N_k\), respectively:

$$\displaystyle \begin{aligned} \beta = \frac{1}{k_B T},\quad \alpha = \frac{p}{k_B T},\quad \gamma_k = -\frac{\mu_k}{k_B T} . \end{aligned} $$
(1.50)

These derivatives are homogeneous with degree 0, i.e., they are intensive functions of the state.
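
As a worked illustration of Eqs. (1.49) and (1.50), consider a one-component entropy of the monatomic ideal-gas form, where \(c\) is a constant independent of \(E\), \(V\), and \(N\). This model is an assumption introduced only for this example; the index \(k\) is dropped because there is a single component.

$$\displaystyle \begin{aligned} \frac{S}{k_B} &= N\left[\log\frac{V}{N} + \frac{3}{2}\log\frac{E}{N} + c\right], \\ \beta &= \frac{1}{k_B}\left(\frac{\partial S}{\partial E}\right)_{V,N} = \frac{3N}{2E}, \qquad \alpha = \frac{1}{k_B}\left(\frac{\partial S}{\partial V}\right)_{E,N} = \frac{N}{V}, \\ \gamma &= \frac{1}{k_B}\left(\frac{\partial S}{\partial N}\right)_{E,V} = \log\frac{V}{N} + \frac{3}{2}\log\frac{E}{N} + c - \frac{5}{2}, \\ \beta E + \alpha V + \gamma N &= \frac{3N}{2} + N + N\left[\log\frac{V}{N} + \frac{3}{2}\log\frac{E}{N} + c\right] - \frac{5N}{2} = \frac{S}{k_B} . \end{aligned} $$

All three derivatives depend only on the ratios \(E/N\) and \(V/N\), so they are intensive. With \(\beta = 1/k_B T\) and \(\alpha = p/k_B T\) they reduce to \(E = \tfrac{3}{2} N k_B T\) and \(pV = N k_B T\).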

1.1.4.3 Thermodynamic Potentials

A family of thermodynamic potentials is obtained by Legendre transforming entropy with respect to various combinations of its independent variables. We give one example: the Gibbs function G, which is commonly defined as

$$\displaystyle \begin{aligned} G = E - TS + pV = E-\frac{S/k_B}{\beta} + \frac{\alpha V}{\beta}. \end{aligned} $$
(1.51)

We express this in the equivalent form

$$\displaystyle \begin{aligned} -\beta G = \frac{S^{(1,2)}(\beta,\alpha,N_k)}{k_B} \end{aligned} $$
(1.52)

which identifies the product \(-\beta G\) as the Legendre transformation of \(S(E, V, N_k)/k_B\) with respect to \(E\) and \(V\). The product \(-\beta G\) is a function of \(\beta\), \(\alpha\), and \(N_k\) (as is \(G\)), and its differential is written immediately by application of Eq. (1.47):

$$\displaystyle \begin{aligned} -d\left(\beta G\right) = -E d\beta - V d\alpha + \sum_k \gamma_k d N_k. \end{aligned} $$
(1.53)

Substituting Eq. (1.50) for β, α, and γ k we obtain the more familiar result

$$\displaystyle \begin{aligned} d G = -S d T + V d p + \sum_k \mu_k d N_k . \end{aligned} $$
(1.54)
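
The intermediate algebra between Eqs. (1.53) and (1.54) can be sketched as follows. The steps use only the product rule and the Gibbs function written in the variables \(\beta\) and \(\alpha\), namely \(G = E - TS + pV\) with \(T = 1/k_B\beta\) and \(p = \alpha/\beta\); the arrangement of the steps is ours.

$$\displaystyle \begin{aligned} -d\left(\beta G\right) &= -\beta\, dG - G\, d\beta, \qquad G = E - \frac{S/k_B}{\beta} + \frac{\alpha V}{\beta}, \\ \beta\, dG &= (E - G)\, d\beta + V d\alpha - \sum_k \gamma_k d N_k = \left(\frac{S/k_B}{\beta} - \frac{\alpha V}{\beta}\right) d\beta + V d\alpha - \sum_k \gamma_k d N_k, \\ dG &= S\, \frac{d\beta}{k_B \beta^2} + V \left(\frac{d\alpha}{\beta} - \frac{\alpha\, d\beta}{\beta^2}\right) - \sum_k \frac{\gamma_k}{\beta}\, d N_k . \end{aligned} $$

From Eq. (1.50), \(dT = -d\beta/(k_B\beta^2)\), \(dp = d\alpha/\beta - \alpha\, d\beta/\beta^2\), and \(\mu_k = -\gamma_k/\beta\), so the three terms are \(-S\,dT\), \(V\,dp\), and \(\sum_k \mu_k\, dN_k\), respectively.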

At fixed \(T\) and \(p\), the Gibbs energy is homogeneous in \(N_k\) with degree 1. From Euler’s theorem for a mixed set of extensive and intensive variables, given in Eq. (1.42), we obtain

$$\displaystyle \begin{aligned} G = \sum_k \mu_k N_k \quad \text{(const. }p, T). \end{aligned} $$
(1.55)

Applying the Gibbs-Duhem equation with respect to all extensive variables while keeping the intensive variables constant we have

$$\displaystyle \begin{aligned} 0 = \sum_k N_k d\mu_k \quad \text{(const. }p, T). \end{aligned} $$
(1.56)

In thermodynamics the Gibbs-Duhem equation is specifically associated with this result. We use Gibbs-Duhem in a more general sense to refer to a condition on the simultaneous variation of intensive properties as a companion of the Euler equation, which governs the variation of the extensive variables.

1.1.4.4 Stability Criteria

The concave condition on \(S\) implies that the second derivatives of entropy are non-positive:

$$\displaystyle \begin{aligned} \left(\frac{\partial^2 S}{\partial E^2}\right)_{V,N_k} \leq 0,\quad \left(\frac{\partial^2 S}{\partial V^2}\right)_{E,N_k} \leq 0,\quad \left(\frac{\partial^2 S}{\partial N_k^2}\right)_{E,V,N_{j\neq k}} \leq 0 . \end{aligned}$$

All stability criteria are obtained by manipulation of these derivatives. To develop stability criteria for the Gibbs energy, we first note that the product \(-\beta G\) (see Eq. (1.52)) is concave with respect to the untransformed variables \(N_k\), as is entropy. Accordingly, at fixed \(\beta\) and \(\alpha\) the Gibbs energy is convex in \(N_k\) (the negative sign flips the curvature). Thus the equilibrium Gibbs energy is at a minimum with respect to partitioning of components at fixed \(\beta\) and \(\alpha\), or equivalently, at fixed \(T\) and \(p\):

$$\displaystyle \begin{aligned} G(T,p,N_k) \leq G(T,p,\lambda_k N_k) + G(T,p,(1-\lambda_k)N_k). \end{aligned} $$
(1.57)
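
Equation (1.57) can be illustrated with a model Gibbs energy. The sketch below assumes a two-component ideal mixture at fixed T and p, G = Σ N_k [μ0_k + kT ln(N_k/N)]; the ideal-mixture form and the values of `mu0` and `kT` are assumptions made for this example and are not taken from the text.

```python
import math

def gibbs(N, mu0=(1.0, 2.0), kT=1.0):
    """Assumed ideal-mixture Gibbs energy at fixed T and p.

    G = sum_k N_k [mu0_k + kT ln(N_k / N_total)].
    """
    total = sum(N)
    return sum(n * (m + kT * math.log(n / total)) for n, m in zip(N, mu0))

def partition_gap(N, lams):
    """G(part 1) + G(part 2) - G(whole); part 1 holds lam_k * N_k of component k."""
    part1 = [lam * n for lam, n in zip(lams, N)]
    part2 = [(1.0 - lam) * n for lam, n in zip(lams, N)]
    return gibbs(part1) + gibbs(part2) - gibbs(N)

N = (3.0, 2.0)
uneven_gap = partition_gap(N, (0.8, 0.3))  # the two parts differ in composition
even_gap = partition_gap(N, (0.4, 0.4))    # both parts keep the overall composition

print(uneven_gap > 0.0, abs(even_gap) < 1e-9)
```

Splitting the mixture into two parts of different composition raises the total Gibbs energy, while a split that preserves the composition leaves it unchanged, as expected for a function that is convex and homogeneous with degree 1 in N_k.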

Similar expressions can be written for the other thermodynamic potentials (energy, free energy, and enthalpy). We will not write them down as our intention is not to reproduce the complete set of thermodynamic relationships but rather to demonstrate how to obtain these results by combining curvature, homogeneity, and Legendre transformations.

Copyright information

© 2018 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Matsoukas, T. (2018). Evolution of Ideas on Entropy. In: Generalized Statistical Thermodynamics. Understanding Complex Systems. Springer, Cham. https://doi.org/10.1007/978-3-030-04149-6_1
