1 Introduction

There are several widely-accepted formulations of the Second Law in classical thermodynamics, some invoking notions of temperature and entropy, and at least one invoking neither of these explicitly.

In particular, the so-called Kelvin–Planck Second Law is an elemental stricture on the nature of heat receipt by a body during the course of any cyclic process the body might experience. It says, in effect, that during the course of a cyclic process the body cannot merely receive heat from its exterior without also emitting heat to it (in a manner qualitatively distinguishable from that of the heat receiptFootnote 1). The First Law then implies that, over the course of a cyclic process, the heat received by the body cannot be converted entirely into work; there must be some heat emission as well. In the Kelvin–Planck Second Law there is no explicit mention of entropy or of temperature, much less of a thermodynamic temperature scale.

As we indicated, other invocations of the Second Law are explicit in their use of a thermodynamic temperature scale and of an entropy. Indeed, the opening paragraph of Gibbs’s “On the Equilibrium of Heterogeneous Substances” [16] invokes an inequality of the form

$$\begin{aligned} \begin{bmatrix} \text {The total entropy}\\ \text {of the body at the}\\ \text {end of the process} \end{bmatrix} \,-\, \begin{bmatrix} \text {The total entropy of the}\\ \text {body at the beginning}\\ \text {of the process} \end{bmatrix} \quad \geqq \quad {\int \frac{\text {d}q}{T}}\ \biggr |_{\;\text {process}} \end{aligned}$$
(1.1)

“dq denoting the element of heat received from external sources and T denoting the temperature of the part of the system receiving it.” (This interpretation of the right side of (1.1) is taken from that same Gibbs paragraph.)

Much of modern classical thermodynamics takes as its starting point a Second Law of the form (1.1), usually called the Clausius–Duhem inequality, deemed to obtain for any body suffering any process, even processes in which there is rapid heating or cooling, in which there are sharp temperature gradients, and in which there is rapid and severe deformation.Footnote 2 Neither at the start of the process nor at its end need the body be in equilibrium.

This raises some historical and, more importantly, conceptual questions. Entropy and a thermodynamic temperature scale are generally regarded to be derived entities, deduced from more fundamental statements of the Second Law (such as the Kelvin–Planck version) by means of brilliant arguments posited by the early thermodynamics pioneers. Those arguments, however, often invoke idealized slow reversible processes (for example, Carnot cycles) in which the body suffering the process is always in (or arbitrarily close to) a condition of equilibrium.Footnote 3 Because the classically derived notions of entropy and thermodynamic temperature rest upon arguments in which the only body-states visited are ones at or very close to equilibrium, it is reasonable to question whether these notions actually have rigorous logical extensions to non-equilibrium domains.

Gibbs seemed willing to embrace such extensions. A reading of Gibbs’s interpretation of the right side of (1.1) indicates that he had no reluctance to invoke a thermodynamic temperature scale in bodies having different local temperatures in different parts, and in an earlier, less read article [15], Gibbs clearly felt free to attribute an entropy to a body that is not in equilibrium:

When the body is not in a state of thermodynamic equilibrium, its state is not one of those which are represented by our surface. The body, however, as a whole has a certain volume, entropy, and energy, which are equal to the sums of the volumes, etc., of its parts.

Note, in particular, that Gibbs was not reluctant to assert the existence of a local entropy within an un-equilibrated body, its total entropy coming from a summing process.

Yet it is not easy to trace a clear path from the equilibrium arguments for entropy and thermodynamic temperature posited by the early pioneers to the non-equilibrium entropy and temperature invoked by Gibbs. Even less evident is a precise line of argument that begins with the pioneers and terminates with the free-wheeling modern use of local entropy and thermodynamic temperature in the Clausius–Duhem inequality, in particular when it is applied to bodies experiencing rapid, non-uniform heat transfer and deformation.

Our aim is to connect, in a precise way, an elemental Kelvin–Planck statement of the Second Law to the existence and properties of a thermodynamic temperature scale and an entropy scale, both viewed as functions of the local material state, that together satisfy the requirements of the Clausius–Duhem inequality (as it is invoked in modern classical physicsFootnote 4) for all processes that material bodies under consideration are deemed to admit. The mathematical ideas we use, principally from functional analysis, were not available to the earliest pioneers of classical thermodynamics, nor were they available to Gibbs.Footnote 5 Our primary working tool is the Hahn–Banach Theorem, in particular a version that ensures that two non-empty disjoint closed convex sets in a locally convex topological vector space, at least one of them compact, can be strictly separated by a hyperplane [2, 4, 24, 33]. Along the way, the Hahn–Banach Theorem will have the additional benefit of imparting to thermodynamics an intuitive geometric flavor, different in substance and setting from the geometric one pioneered by Gibbs [14, 15].

2 Some Background

This article and its companion [11] constitute a major amplification of two much earlier ones by us, both drawing on the Hahn–Banach Theorem heavily. The first [9], published in 1983, was an extensive discussion of how the Hahn–Banach Theorem serves to connect a suitably formulated version of the Kelvin–Planck Second Law to the existence and properties (including uniqueness) of a thermodynamic temperature scale that conforms to the so-called Clausius inequality—that is, to the Clausius–Duhem inequality restricted to cyclic processes. For cyclic processes, the left side of (1.1) reduces to zero, so there was no involvement of entropy.

The second article [10] was published originally in 1984 as an appendix in [34] and soon after in a collection [28] of short essays about the foundations of thermodynamics. That article indicated how, beginning with a slightly stronger version of a Kelvin–Planck Second Law, the Hahn–Banach Theorem delivers simultaneously both a thermodynamic temperature and an entropy satisfying the requirements of the Clausius–Duhem inequality, not restricted to cyclic processes.

Although [10] contained a Hahn–Banach proof of the equivalence of the Kelvin–Planck Second Law with the existence of thermodynamic-temperature and entropy functions of state suited to the Clausius–Duhem inequality, several other theorems (including two about uniqueness) were merely stated without proofs. Those proofs we said would be forthcoming in a fuller article. Moreover, we promised a more compelling presentation of the existence argument, in which certain presumptions about the structure of the putative set of thermodynamical processes would be substantially weakened. This article and its companion are intended to fulfill those promises. The weakened, and more natural, assumptions about the structure of the process-set have required a deeper analysis, much of it deferred to the appendix of this article.

Remark 2.1

The 1986 volume [28], in which [10] appears, contains a wealth of chapters by different authors devoted to the study of the mathematical foundations of non-equilibrium classical thermodynamics. The same is true of [34]. Even beyond those, there are many schools of thought about how classical thermodynamics might be extended to non-equilibrium settings. These are surveyed amply and critically in the 2008 book by Lebon et al. [19] (although some work contained in [28, 34] and related articles escaped the book’s notice). Readers of this article might want to see very different work by Lieb and Yngvason [20], which in 1999 began as an exploration of the construction of classical entropy for bodies in equilibrium and then turned in 2013 to questions about the extent to which the same could be done for un-equilibrated bodies [21]. For a recent summary of some of their work see [35]; see also an article by Kammerlander and Renner [17].

Remark 2.2

To a great extent, discussions with James Serrin in the late 1970s and early 1980s, in particular his formulations of the Second Law in terms of a heat accumulation function, provided inspiration for our work (although not our reliance on the Hahn–Banach Theorem). Serrin’s views at the time are captured in [25, 26, 27, 29].

Remark 2.3

As in our earlier articles, we want to call particular attention to work [30, 31] by Miroslav Šilhavý,Footnote 6 who realized independently and at about the same time that Hahn–Banach separation theorems, taken with the Kelvin–Planck Second Law, might provide a basis for existence of a thermodynamic temperature scale consistent with the cyclic-process Clausius inequality. In [31] Šilhavý viewed the thermodynamic temperature scale to be a function having as its domain a pre-supposed empirical temperature scale. The most apt comparison to our work is with some preliminary notes [8] we wrote in 1978 for James Serrin. There, we also viewed a Clausius-inequality temperature scale to be a function having as its domain a pre-supposed empirical temperature scale, and we too used Hahn–Banach separation theorem arguments to demonstrate how the existence of such a Clausius-inequality temperature scale derives immediately from, and is equivalent to, the Kelvin–Planck Second Law.

Our subsequent published article [9] on the Clausius inequality was much more ambitious. There, we chose not to pre-suppose an empirical temperature scale, carrying a pre-ordained notion of “hotness” and “hotter than.” Rather, we regarded the desired Clausius-inequality temperature scale to be a “function of state,” the state domain depending on the material under consideration.Footnote 7 In this way, we could not only establish, via the Hahn–Banach Theorem, the equivalence of the Kelvin–Planck Second Law with the existence of a temperature scale satisfying the Clausius inequality, we could also tie relative values of that temperature to a “hotter than” relation on the set of states, a relation deriving solely from processes the material is deemed to admit. This is the position taken in [10] and here, where the entropy density, like the thermodynamic temperature scale, is a Hahn–Banach-derived function of the local material state.Footnote 8

3 Thermodynamical Theories

To a great extent modern classical thermodynamics manifests itself as a collection of thermodynamical theories tailored to particular materials, these various theories sharing common premises and common methodologies. There are, for example, thermodynamical theories of elastic materials, of gases, of viscous fluids, of diffusive reacting mixtures, and so on. Each such theory presumably carries with it versions of the First and Second Laws, rendered concrete and precise within the context of the specific class of materials under study.

With this viewpoint in mind, we regard the theorems contained in this article and its companion to provide something like a “meta-thermodynamics” that sheds an overarching light on the structure of specific thermodynamical theories. In particular, almost all of the theorems contained here assert that a theory has Property A (usually a statement about the nature of heat transfer between bodies and their exteriors in processes the theory admits) if and only if it has Property B (usually a statement about entropy and thermodynamic temperature). The deeper and more difficult of those implications always derives from the Hahn–Banach Theorem.

We will regard a thermodynamical theory to be a mathematical object consisting of two sets: (i) a state space \(\Sigma \) that characterizes the set of (local) states that might be exhibited within a material body embraced by the theory and (ii) a set \(\mathscr {P}\)  of processes that abstracts the essential features of physical processes that such bodies are deemed to admit. Taken together, these two sets will, for us, serve to constitute an instance \((\Sigma ,\mathscr {P})\) of a thermodynamical theory.

In this section and the next we will use terms such as body, material, material point, and physical process, but only in an informal way to guide thinking about the two sets \(\Sigma \) and \(\mathscr {P}\) that constitute a thermodynamical theory or to provide justification for the structure these sets are presumed to possess. Again, though, a thermodynamical theory \((\Sigma ,\mathscr {P})\) is a purely mathematical object suited to precisely stated questions and theorems. In particular, we will be in a position to say what we mean by a Kelvin–Planck theory—that is, a thermodynamical theory that complies with a precisely stated version of the Kelvin–Planck Second Law. And we will be in a position to ask about circumstances under which a particular thermodynamical theory \((\Sigma ,\mathscr {P})\) admits two functions of state—a specific-entropy \(\eta :\Sigma \rightarrow \mathbb {R}\) and a thermodynamic temperature scale \(T:\Sigma \rightarrow \mathbb {R}_+\)—that together comply with the Clausius–Duhem inequality for all processes in the set \(\mathscr {P}\) that the theory contains.

Remark 3.1

The mathematical objects and theorems contained here lend themselves to a variety of physical interpretations. At least at the outset, it will be helpful for the reader to think of a thermodynamical theory \((\Sigma ,\mathscr {P})\) as a description of a particular material (for example, carbon dioxide, water, rubber, a metal alloy, a diffusive reacting mixture). In this context, a specific-entropy function \(\eta :\Sigma \rightarrow \mathbb {R}\) will have an interpretation as an attribute of a particular material—in the parlance of continuum physics, a “constitutive function” for that material. Nevertheless, we intend the abstract idea of a thermodynamical theory to be broadly adaptable to a variety of circumstances and instances.

3.1 State Spaces

Central to virtually all classical theories of material body behavior is the idea of “functions of state” that serve to compute local values of certain material attributes. Indeed, one of our aims is to establish, from the Kelvin–Planck Second Law, the existence of specific-entropy and thermodynamic-temperature functions, suited to the Clausius–Duhem inequality, that permit the calculation of the local specific entropy (entropy per mass) and the local thermodynamic temperature once the local material “state” is specified.

Just how the “state of a material point” is specified will vary from one thermodynamical theory to another.Footnote 9 For a theory of a gas of fixed composition it might be supposed that the local state is captured completely by specification of the pair (p, v), where p is the local pressure and v is the local specific volume (the reciprocal of the density). For an elastic material, it might be supposed that the local state is captured by the pair (u, F), where u is the local specific internal energy (internal energy per mass) and F is the local deformation gradient. For a reacting and diffusive mixture having n chemical species, the local state might be described by the vector \([c_1,c_2,\dots ,c_n,\theta ] \in \mathbb {R}^{n+1}\), where \(c_i\) is the local molar concentration of the \(i^{th}\) species and \(\theta \) is the local temperature in degrees Fahrenheit.

In any case, we shall take for granted that a thermodynamical theory has associated with it a state space \(\Sigma \), understood to be the set of local states that might be exhibited within a material body during processes the theory purports to describe. It will be presumed that \(\Sigma \) carries with it a Hausdorff topology.

In fact, we will go further by supposing hereafter that \(\Sigma \) is compact. This supposition will simplify the mathematics greatly, and in most instances it will be physically apt: A well-grounded theory would suffer no loss from exclusion of processes that visit material states which are physically unreasonable. Excluded from consideration, for example, might be processes involving mass-densities so high as to be realized only in black holes or so low as to be inconsistent with the tenets of continuum models.

Remark 3.2

When the state space is merely presumed to be locally compact, realization of the objectives of this paper becomes more technically delicate, and certain theorems here become false without modification. In Appendix E of [9] we showed how this might proceed when attention is restricted solely to cyclic processes, with the aim of producing a thermodynamic temperature scale consistent with the Clausius inequality.

3.2 Processes

A process experienced by a particular body can be described in a variety of ways, some highly picturesque, involving pulleys and pistons. For our purposes, however, there will be only two aspects of the process that need be considered: (i) the change of condition of the body from the beginning of the process to its end and (ii) the heating measure for the process, which is an overall accounting of the nature of heat receipt the body experiences during the course of the process. We will describe each of these separately. For us, a process will be identified with specifications of both its change of condition and its heating measure.

3.2.1 The Change of Condition for a Process

Recall that members of \(\Sigma \) are understood to be local state descriptions—that is, candidates for describing the state of a material point within a body. If we consider a body at a fixed instant, its material points will be exhibited in various states of \(\Sigma \). Although there might be just one state exhibited throughout the body (in which case the body is thermodynamically uniform), the distribution of states over the body could be far more diffuse. In any case, we shall need a device to describe that distribution for a particular body at a fixed instant.

By the (instantaneous) condition of the body we mean a positive regular Borel measure on \(\Sigma \), denoted here by \(\mathcal {m}\), interpreted in the following way: For each Borel set \(\Lambda \subset \Sigma \), \(\mathcal {m}(\Lambda )\) is the mass of that part of the body consisting of all material points in states contained in \(\Lambda \). More colloquially, we can think of \(\mathcal {m}(\Lambda )\) to be determined by excising from the body only material in states contained within \(\Lambda \) and weighing that part of the body so removed. Note that \(\mathcal {m}(\Sigma )\) is the mass of the entire body. Note also that if a body of mass M is thermodynamically uniform, with all material in state \(\sigma \), then the body’s condition is \(M\delta _{\sigma }\), where \(\delta _{\sigma }\) is the Dirac measure concentrated at \(\sigma \).Footnote 10
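As a small illustration (the two-part body and the symbols \(M_1\), \(M_2\), \(\sigma _1\), \(\sigma _2\) are assumptions made here for concreteness, not objects taken from the text), suppose a body consists of two thermodynamically uniform parts, one of mass \(M_1\) in state \(\sigma _1\) and the other of mass \(M_2\) in a different state \(\sigma _2\). Under that assumption its condition would be a sum of two Dirac measures:

$$\begin{aligned} \mathcal {m} = M_1\delta _{\sigma _1} + M_2\delta _{\sigma _2}, \qquad \mathcal {m}(\{\sigma _1\}) = M_1, \qquad \mathcal {m}(\{\sigma _2\}) = M_2, \qquad \mathcal {m}(\Sigma ) = M_1 + M_2. \end{aligned}$$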

Now consider a physical process suffered by a particular body, with both the body and the process presumably embraced by the thermodynamical theory under consideration. During the process, the body might experience rapid deformation and heat transfer, so that each material point within the body might present itself in a great variety of states as the process ensues. In particular, the body’s final condition \(\mathcal {m}_f\) might be very different from the body’s initial condition \(\mathcal {m}_i\). We associate with the process a change of condition, \(\Delta \mathcal {m}\), defined by

$$\begin{aligned} \Delta \mathcal {m}:= \mathcal {m}_f - \mathcal {m}_i. \end{aligned}$$
(3.1)

Here \(\Delta \mathcal {m}\) is understood to be a signed regular Borel measure on \(\Sigma \), which is to say that \(\Delta \mathcal {m}\) might take positive values on some Borel sets and negative values on others.Footnote 11 Note, however, that we always have

$$\begin{aligned} \Delta \mathcal {m}(\Sigma ) = \mathcal {m}_f(\Sigma ) - \mathcal {m}_i(\Sigma ) = 0, \end{aligned}$$
(3.2)

since each term on the right is the (conserved) total mass of the body suffering the process.
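To make (3.1) and (3.2) concrete, consider a sketch under assumptions of our own choosing: a thermodynamically uniform body of mass \(M > 0\) that begins a process entirely in a state \(\sigma _a\) and ends it entirely in a different state \(\sigma _b\) (both symbols are hypothetical labels). Then

$$\begin{aligned} \Delta \mathcal {m} = M\delta _{\sigma _b} - M\delta _{\sigma _a}, \qquad \Delta \mathcal {m}(\{\sigma _b\}) = M> 0, \qquad \Delta \mathcal {m}(\{\sigma _a\}) = -M < 0, \qquad \Delta \mathcal {m}(\Sigma ) = M - M = 0. \end{aligned}$$

The change of condition takes values of both signs, yet its total over \(\Sigma \) vanishes, in agreement with (3.2).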

3.2.2 The Heating Measure for a Process

During the course of the physical process under consideration, the body suffering the process might experience deformation and nonuniform transfer of heat to and from its exterior. Indeed, at a given instant there might be heat receipt in some parts of the body and heat removal in other parts. It should be kept in mind that each material point can be expected to visit a variety of states in \(\Sigma \) as time progresses.

With the process we associate a heating measure \(\mathcal {q}\), which is a signed regular Borel measure on \(\Sigma \) with the following interpretation: for each Borel set \(\Lambda \subset \Sigma \), \(\mathcal {q}(\Lambda )\) is the net amount of heat received over the course of the entire process (from the exterior of the body suffering the process) by material in states contained within \(\Lambda \) at the time of heat receipt. In colloquial terms, imagine viewing the evolving process through glasses that filter out material not in states contained in \(\Lambda \); some material might disappear and then reappear. The net heat received, over the entire process, by the visible material (from the exterior of the entire body) is \(\mathcal {q}(\Lambda )\).
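A simple instance may help fix the interpretation. Assume, purely for illustration, that over the whole process the body receives a positive amount of heat \(Q_1\) only while its material is in a state \(\sigma _1\) and emits a positive amount of heat \(Q_2\) only while its material is in a different state \(\sigma _2\) (the symbols \(Q_1\), \(Q_2\), \(\sigma _1\), \(\sigma _2\) are hypothetical). The heating measure would then be

$$\begin{aligned} \mathcal {q} = Q_1\delta _{\sigma _1} - Q_2\delta _{\sigma _2}, \qquad \mathcal {q}(\{\sigma _1\}) = Q_1, \qquad \mathcal {q}(\{\sigma _2\}) = -Q_2, \qquad \mathcal {q}(\Sigma ) = Q_1 - Q_2, \end{aligned}$$

so that \(\mathcal {q}(\Sigma )\) is the net heat received by the body over the entire process.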

3.2.3 Example: Change of Condition and Heating Measure Derived from a More Concrete Process Description

Because the abstract idea of a process’s change of condition and heating measure will be important hereafter,Footnote 12 we will indicate how these can be calculated from a somewhat more tangible description of a process. With the process (having a compact metric space as the state space \(\Sigma \)) we associate:

  (i) a body \(\mathscr {B}\) that experiences the process. Here we regard \(\mathscr {B}\) to be a set (of material points), taken with a \(\sigma \)-algebra of subsets of \(\mathscr {B}\), called the parts of \(\mathscr {B}\). We presume that \(\mathscr {B}\) comes equipped with a positive mass measure \(\mu \) defined on its parts: for each part \(P \subset \mathscr {B}\), \(\mu (P)\) is the mass of part P.

  (ii) a closed interval of the real line \(\mathscr {I}:= [t_i,t_f]\), identified with the time interval over which the process transpires.

  (iii) a measurableFootnote 13 function \({\hat{\sigma }}: \mathscr {B}\times \mathscr {I}\rightarrow \Sigma \), with \({\hat{\sigma }}(X,t)\) interpreted as the state of material point X at instant t.

  (iv) a real-valued signed measure h on \(\mathscr {B}\times \mathscr {I}\), interpreted as follows: For each part \(P \subset \mathscr {B}\) and each Lebesgue-measurable set \(J \subset \mathscr {I}\), \(h(P \times J)\) is the net amount of heat received by part P from the exterior of the body during instants contained in J.

For a process described this way, construction of the heating measure \(\mathcal {q}\) proceeds as follows: for each Borel set \(\Lambda \subset \Sigma \),

$$\begin{aligned} \mathcal {q}(\Lambda ):= h({\hat{\sigma }}^{-1}(\Lambda )). \end{aligned}$$
(3.3)

To construct the change of condition for the process we begin by defining the initial and final state assignments to material points:

$$\begin{aligned} {\hat{\sigma }}_i(\cdot ):= {\hat{\sigma }}(\cdot ,t_i) \quad \text {and} \quad {\hat{\sigma }}_f(\cdot ):= {\hat{\sigma }}(\cdot ,t_f). \end{aligned}$$
(3.4)

The initial condition and final condition of body \(\mathscr {B}\) are then defined by the requirement that, for each Borel set \(\Lambda \subset \Sigma \),

$$\begin{aligned} \mathcal {m}_i(\Lambda ) = \mu ({\hat{\sigma }}_i^{-1}(\Lambda )) \quad \text {and}\quad \mathcal {m}_f(\Lambda ) = \mu ({\hat{\sigma }}_f^{-1}(\Lambda )). \end{aligned}$$
(3.5)

The change of condition for the process is then given by

$$\begin{aligned} \Delta \mathcal {m}:= \mathcal {m}_f - \mathcal {m}_i. \end{aligned}$$
(3.6)
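The construction can be traced through in a deliberately simple case. The following is a sketch under assumptions that are ours, not the text's: the body has mass \(M:= \mu (\mathscr {B}) > 0\) and is thermodynamically uniform at every instant, so that \({\hat{\sigma }}(X,t) = s(t)\) for some measurable \(s: \mathscr {I}\rightarrow \Sigma \); and heat is received in proportion to mass at an integrable rate \(r(t)\), so that h is determined on product sets by \(h(P \times J) = (\mu (P)/M)\int _J r(t)\,\text {d}t\). Under these assumptions, for each Borel set \(\Lambda \subset \Sigma \),

$$\begin{aligned} {\hat{\sigma }}^{-1}(\Lambda )&= \mathscr {B}\times s^{-1}(\Lambda ), \\ \mathcal {q}(\Lambda )&= h(\mathscr {B}\times s^{-1}(\Lambda )) = \int _{s^{-1}(\Lambda )} r(t)\,\text {d}t, \\ \mathcal {m}_i&= M\delta _{s(t_i)}, \qquad \mathcal {m}_f = M\delta _{s(t_f)}, \qquad \Delta \mathcal {m} = M\,(\delta _{s(t_f)} - \delta _{s(t_i)}). \end{aligned}$$

Thus \(\mathcal {q}(\Lambda )\) accumulates the heating rate over exactly those instants at which the body's state lies in \(\Lambda \), and \(\Delta \mathcal {m}\) vanishes when \(s(t_f) = s(t_i)\).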

3.2.4 The Set of Processes and Some of Its Properties

In a theory with state space \(\Sigma \), a process will be regarded to be a pair \((\Delta \mathcal {m},\mathcal {q})\), where \(\Delta \mathcal {m}\) is the change of condition for the process and \(\mathcal {q}\) is its heating measure. We can regard both of these as members of \(\mathscr {M}(\Sigma )\), the vector space of signed regular Borel measures on \(\Sigma \). In fact, from (3.2) it follows that \(\Delta \mathcal {m}\) is always a member of the linear subspace \(\mathscr {M}^{\circ }(\Sigma )\subset \mathscr {M}(\Sigma )\) defined by

$$\begin{aligned} \mathscr {M}^{\circ }(\Sigma ):= \{\nu \in \mathscr {M}(\Sigma ):\ \nu (\Sigma ) = 0\}. \end{aligned}$$
(3.7)

Thus we can regard a process \(\mathcal {p}= (\Delta \mathcal {m},\mathcal {q})\) to be a member of the vector space

$$\begin{aligned} \mathscr {V}(\Sigma ):= \mathscr {M}^{\circ }(\Sigma )\oplus \mathscr {M}(\Sigma ). \end{aligned}$$
(3.8)

Hereafter it will be understood that \(\mathscr {M}(\Sigma )\) carries the weak-star topology,Footnote 14 that \(\mathscr {M}^{\circ }(\Sigma )\) carries the topology it inherits as a subset of \(\mathscr {M}(\Sigma )\), and that \(\mathscr {V}(\Sigma )\) carries the resulting product topology. For a set \(X \subset \mathscr {V}(\Sigma )\) we denote by cl (X) its closure.

For a thermodynamical theory with state space \(\Sigma \), the set of processes, \(\mathscr {P}\ \subset \mathscr {V}(\Sigma )\), will be understood to consist of members of \(\mathscr {V}(\Sigma )\) that correspond to physical processes deemed to be admitted by material bodies in circumstances the theory purports to embrace. Physical considerations suggest that, for any reasonable theory, the set \(\mathscr {P}\) should carry a certain structure, in particular that it should share at least some of the attributes of a convex cone in \(\mathscr {V}(\Sigma )\). Recall that \(\mathscr {P}\) would be a convex cone were it to have both of the following properties:

  (i) For each \(\mathcal {p}\) in \(\mathscr {P}\) and each non-negative number \(\alpha \), \(\alpha \mathcal {p}\) is a member of \(\mathscr {P}\).

  (ii) For all \(\mathcal {p}\) and \(\mathcal {p}^*\) in \(\mathscr {P}\), \(\mathcal {p}+ \mathcal {p}^*\) is a member of \(\mathscr {P}\).
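For completeness, here is the short standard argument that a set with properties (i) and (ii) is convex (the symbol \(\lambda \) is introduced here only for this check). For any \(\mathcal {p}\) and \(\mathcal {p}^*\) in \(\mathscr {P}\) and any \(\lambda \in [0,1]\),

$$\begin{aligned} \lambda \mathcal {p}\in \mathscr {P}\ \text {and}\ (1-\lambda )\mathcal {p}^* \in \mathscr {P}\ \text {by (i)}, \qquad \text {hence} \qquad \lambda \mathcal {p}+ (1-\lambda )\mathcal {p}^* \in \mathscr {P}\ \text {by (ii)}. \end{aligned}$$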

With respect to (i), it is not difficult to argue on physical grounds that the inclusion will be satisfied so long as \(\alpha \) is a non-negative integer: If \(\mathcal {p}= (\Delta \mathcal {m},\mathcal {q})\) is a physical process suffered by a body \(\mathscr {B}\), then for any positive integer n, we can simultaneously execute the same process on n copies of \(\mathscr {B}\), copies that are not in thermal communication. The n bodies, viewed as a single body, will have suffered a physical process for which the change of condition is \(n\Delta \mathcal {m}\) and the heating measure is \(n\mathcal {q}\). Thus, \(n\mathcal {p}= (n\Delta \mathcal {m}, n\mathcal {q})\) is a member of \(\mathscr {P}\), corresponding to the physical n-body process described.

Similarly, we can expect on physical grounds that the inclusion in (ii) will be satisfied so long as \(\mathcal {p}\) = \((\Delta \mathcal {m},\mathcal {q})\) and \(\mathcal {p}^* =(\Delta \mathcal {m}^*,\mathcal {q}^*)\) correspond to two physical processes having the same temporal duration: If these physical processes are suffered by bodies \(\mathscr {B}\) and \(\mathscr {B}^*\), then the two processes can be executed simultaneously, with \(\mathscr {B}\) and \(\mathscr {B}^*\) thermally isolated from one another, perhaps by large physical distance. This simultaneous execution can be viewed to be another physical process, suffered by the body composed of \(\mathscr {B}\) and \(\mathscr {B}^*\), having change of condition \(\Delta \mathcal {m}+ \Delta \mathcal {m}^*\) and heating measure \(\mathcal {q}+ \mathcal {q}^*\). In this case, the new physical process would have a representation in \(\mathscr {V}(\Sigma )\) (and in \(\mathscr {P}\)) given by \(\mathcal {p}+ \mathcal {p}^*\).

These considerations tell us that, in a reasonable theory, the process set \(\mathscr {P}\) can be expected to have some natural structure, including features that are suggestive of a convex cone in \(\mathscr {V}(\Sigma )\). In fact, in [10] we assumed that \(\mathscr {P}\) is a convex cone. Here we make no such assumption.

We defer to the Appendix a far more nuanced discussion of the structure that we will suppose \(\mathscr {P}\) possesses. By \(\textrm{Cone} \,(\mathscr {P})\) we mean the set in \(\mathscr {V}(\Sigma )\) defined by

$$\begin{aligned} \textrm{Cone} \,(\mathscr {P}):= \{\alpha \mathcal {p}\in \mathscr {V}(\Sigma ):\ \mathcal {p}\in \mathscr {P}, \alpha \geqq 0\}. \end{aligned}$$
(3.9)

Based on a few plausible physical assumptions, we argue in the Appendix that, in a reasonable theory, the set

$$\begin{aligned} {\hat{\mathscr {P}}}:= \text {cl}\,(\textrm{Cone} \,(\mathscr {P})) \end{aligned}$$
(3.10)

should not only be a closed cone in \(\mathscr {V}(\Sigma )\), it should also be convex. This we will take for granted hereafter.

3.3 Definition of a Thermodynamical Theory

For the record, we posit the following definition:

Definition 3.3

A thermodynamical theory consists of a (compact) Hausdorff space \(\Sigma \), called the state space of the theory, and a set \(\mathscr {P}\subset \mathscr {V}(\Sigma )\) such that

$$\begin{aligned} {\hat{\mathscr {P}}}:= \text {cl}\,(\textrm{Cone} \,(\mathscr {P})) \end{aligned}$$
(3.11)

is convex. Elements of \(\mathscr {P}\) are the processes of the theory.

Remark 3.4

The definition is formulated in such a way as to remind the reader of our presumption that \(\Sigma \) is compact. Recall Remark 3.2.

4 Kelvin–Planck Theories

In this section we will make precise what we mean by a Kelvin–Planck theory—that is, a thermodynamical theory that respects a form of the Kelvin–Planck Second Law. We want to capture the following idea: In every cyclic process in which the body suffering the process experiences a heat absorption from the body’s exterior, there must also be heat emission to the exterior, the emission being qualitatively different from the absorption. If there were no heat emission, the process would be perfectly efficient, for by the First Law the heat absorbed would be converted entirely into work.

By a cyclic process in the thermodynamical theory \((\Sigma ,\mathscr {P})\) we will mean a process in which the condition of the body at the end of the process is the same as it was at its beginning. That is, a cyclic process \(\mathcal {p}= (\Delta \mathcal {m},\mathcal {q})\) is a process such that the change of condition \(\Delta \mathcal {m}\) is 0.

Consider a cyclic process \(\mathcal {p}^*:= (0,\mathcal {q}^*)\) with \(\mathcal {q}^* \ne 0\). Recall that if \(\Lambda \subset \Sigma \) is a Borel set of states, then \(\mathcal {q}^*(\Lambda )\) is interpreted to be the net amount of heat absorbed during the course of the entire process by material while in states contained in \(\Lambda \). If \(\mathcal {q}^*\) is a non-negative Borel measure (that is, one that takes non-negative values on every Borel set), then there is no Borel set of states that, for the process, can be associated with net heat emission. Moreover, by supposition \(\mathcal {q}^*\) is not the zero measure, so there is at least one Borel set on which \(\mathcal {q}^*\) is positive, corresponding to heat absorption.

For these reasons, when \(\mathcal {q}^* \ne 0\) is a non-negative measure, we will regard the cyclic process \(\mathcal {p}^*:= (0,\mathcal {q}^*)\) to be inconsistent with the spirit of the Kelvin–Planck Second Law. For the thermodynamical theory \((\Sigma ,\mathscr {P})\) we denote by \(\mathscr {M}_+(\Sigma )\) the set of non-negative regular Borel measures on \(\Sigma \), and we also let

$$\begin{aligned} (0,\mathscr {M}_+(\Sigma )):= \{ (0,\mathcal {v}) \in \mathscr {V}(\Sigma ): \mathcal {v}\in \mathscr {M}_+(\Sigma )\}. \end{aligned}$$

Thus, for a thermodynamical theory \((\Sigma ,\mathscr {P})\) we might regard the requirement

$$\begin{aligned} \mathscr {P}\, \cap \, (0,\mathscr {M}_+(\Sigma ))\; \text {is at most}\; (0,0) \end{aligned}$$
(4.1)

to be a full embodiment of the Kelvin–Planck Second Law. Or, if we want to assert that a nonzero element of \((0,\mathscr {M}_+(\Sigma ))\) cannot even be approximated by the theory’s processes, then we might strengthen (4.1) by requiring that

$$\begin{aligned} \text {cl}\,(\mathscr {P})\, \cap \, (0,\mathscr {M}_+(\Sigma ))\; \text {is at most}\; (0,0). \end{aligned}$$
(4.2)

However, two examples will reveal a sense in which even (4.2) falls a little short of capturing the Kelvin–Planck stricture against an approach to perfect conversion of heat into work in cyclic processes. The examples will indicate why we prefer to express the Kelvin–Planck Second Law in terms of a requirement that is somewhat stronger than (4.2).

Each example will be in the form of a toy thermodynamical theory in which the state space \(\Sigma \) is identified with the real interval [0, 1]. Recall that, for \(x \in \Sigma \), \(\delta _x\) denotes the Dirac measure at x. That is, if \(\Lambda \subset \Sigma \) is a Borel set then \(\delta _x(\Lambda ) = 1\) if x is in \(\Lambda \) and is zero otherwise.
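As a small worked illustration (the particular measure \(\mathcal {q}\) below is hypothetical and chosen only for concreteness), integration against a Dirac measure simply evaluates the integrand, so a finite combination of Dirac measures is easy to read as a heating measure. For every continuous \(\varphi : \Sigma \rightarrow \mathbb {R}\),

$$\begin{aligned} \int _{\Sigma }\varphi \, d\delta _x = \varphi (x), \qquad \text {and} \qquad \mathcal {q}:= 3\delta _1 - \delta _0 \ \Longrightarrow \ \mathcal {q}(\{1\}) = 3,\quad \mathcal {q}(\{0\}) = -1,\quad \mathcal {q}(\Sigma ) = 2. \end{aligned}$$

Read as a heating measure, this \(\mathcal {q}\) would describe absorption of 3 units of heat by material in state 1, emission of 1 unit by material in state 0, and a net heat receipt of 2.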

Example 4.1

(A sequence of cyclic processes with small fixed heat emission but unbounded heat receipt) Consider a thermodynamical theory \((\Sigma ,\mathscr {P})\), in which \(\mathscr {P}\) contains the sequence of cyclic processes

$$\begin{aligned} \{(0,n\delta _1 - \delta _0): n = 1,2,...\}. \end{aligned}$$
(4.3)

In each process of the sequence there is heat absorbed (by material in state 1) and heat emitted (by material in state 0). Thus, no process of the sequence is a member of the forbidden set \((0,\mathscr {M}_{+}(\Sigma ))\), nor does the sequence converge to any nonzero member of the forbidden set. For this reason, a putative assertion of the Kelvin–Planck Second Law in the form (4.2) would not preclude for the theory \((\Sigma ,\mathscr {P})\) the presence of the sequence (4.3) in \(\mathscr {P}\).
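The claim that the sequence (4.3) has no limit can be checked directly. What follows is a sketch, assuming that convergence of measures is understood in the weak-star sense used in this article, that is, as convergence of integrals against each continuous function \(\varphi \) on \(\Sigma \) (the symbol \(\varphi \) is introduced here only for illustration):

$$\begin{aligned} \int _{\Sigma }\varphi \, d(n\delta _1 - \delta _0) = n\,\varphi (1) - \varphi (0), \qquad \text {so that, for}\ \varphi \equiv 1, \quad \int _{\Sigma }1 \, d(n\delta _1 - \delta _0) = n - 1 \ \rightarrow \ \infty . \end{aligned}$$

Because the integrals against \(\varphi \equiv 1\) are unbounded, the heating measures \(n\delta _1 - \delta _0\) cannot converge to any measure on \(\Sigma \), and in particular not to a member of \(\mathscr {M}_{+}(\Sigma )\).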

Nevertheless, the sequence contains cyclic processes that come arbitrarily close to having perfect efficiency as n increases: In each process, the heat absorbed (all at state 1) is n, while the work done (equal, in a cyclic process, to the net amount of heat received) is \(n-1\). The efficiency, then, is \(\frac{n-1}{n}\), which approaches 1 as n gets large. Although members of the sequence (4.3) do not converge to a member of the forbidden set, they do come close to aligning in the vector space \(\mathscr {V}(\Sigma )\) with the forbidden element \((0,\delta _1)\).

Such an arbitrarily close approach to perfect efficiency would seem to violate the spirit of the Kelvin–Planck Second Law. The example reveals a sense in which the condition expressed by (4.2) is not a fully suitable reflection of that spirit.

Example 4.2

(A sequence of almost-cyclic processes, each with heat receipt but no heat emission) Consider a thermodynamical theory \((\Sigma ,\mathscr {P})\), in which \(\mathscr {P}\) contains the sequence of processes

$$\begin{aligned} \{(\delta _{1/n} - \delta _0,n\delta _{1/2}): n = 1,2,...\}. \end{aligned}$$
(4.4)

In each process of the sequence, the heating measure indicates no heat emission, only (unbounded) heat absorption, entirely at state \(\frac{1}{2}\). Still, no process of the sequence constitutes a violation of a Kelvin–Planck-type Second Law, as no process is cyclic. Nevertheless, as n increases the change of condition approaches 0 while the heat absorption becomes unbounded. Although the sequence does not converge to any member of the forbidden set \((0,\mathscr {M}_{+}(\Sigma ))\), its processes nevertheless violate the Kelvin–Planck spirit, for as n increases they increasingly resemble cyclic processes with (large) heat absorption but no heat emission.
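Both assertions about the limiting behavior of (4.4) can be verified by a short computation. This is a sketch under the assumption that convergence of measures is meant in the weak-star sense, tested against an arbitrary continuous function \(\varphi \) on \(\Sigma \) (a symbol introduced here for illustration):

$$\begin{aligned} \int _{\Sigma }\varphi \, d(\delta _{1/n} - \delta _0) = \varphi (1/n) - \varphi (0) \ \rightarrow \ 0, \qquad \int _{\Sigma }1 \, d(n\delta _{1/2}) = n \ \rightarrow \ \infty . \end{aligned}$$

The first limit follows from the continuity of \(\varphi \) at 0 and shows that the change of condition tends to 0. The second shows that the heating measures are unbounded and therefore have no limit.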

Here, as in Example 4.1, a codification of the Kelvin–Planck Second Law in the form (4.2) does not suffice to preclude the presence in \(\mathscr {P}\) of a troubling process sequence, in this case (4.4).

Stated informally, the difficulty in both examples is that, while neither sequence converges to an element of the forbidden set \((0,\mathscr {M}_+(\Sigma ))\), members of each sequence come arbitrarily close to pointing along a “forbidden direction” in the vector space \(\mathscr {V}(\Sigma )\). For a thermodynamical theory \((\Sigma ,\mathscr {P})\), we will identify the direction of a process \(\mathcal {p}\in \mathscr {P}\) with the half-line

$$\begin{aligned} \{\alpha \mathcal {p}\in \mathscr {V}(\Sigma ): \alpha \geqq 0\}. \end{aligned}$$
(4.5)

Note that \(\textrm{Cone} \,(\mathscr {P})\), given as before by

$$\begin{aligned} \textrm{Cone} \,(\mathscr {P}):= \{\alpha \mathcal {p}\in \mathscr {V}(\Sigma ):\ \mathcal {p}\in \mathscr {P}, \alpha \geqq 0\}, \end{aligned}$$
(4.6)

is the set of all directions generated by members of \(\mathscr {P}\). The condition

$$\begin{aligned} \text {cl}\,(\textrm{Cone} \,(\mathscr {P}))\ \cap \ (0,\mathscr {M}_{+}(\Sigma )) = (0,0) \end{aligned}$$
(4.7)

then says in effect that no nonzero element of the forbidden set \((0,\mathscr {M}_{+}(\Sigma ))\) can be approximated by vectors of \(\mathscr {V}(\Sigma )\) having directions associated with members of \(\mathscr {P}\).

Remark 4.3

(Examples 4.1 and 4.2 reconsidered) Although the problematic thermodynamical theories considered in Examples 4.1 and 4.2 were not precluded by the putative Kelvin–Planck Second Law in the form (4.2), they are precluded by the strengthened condition (4.7). In the case of Example 4.1 the sequence in \(\textrm{Cone} \,(\mathscr {P})\)

$$\begin{aligned} \{(0,\delta _1 - \frac{1}{n}\delta _0): n = 1,2,...\} \end{aligned}$$

converges to \((0,\delta _1)\). In the case of Example 4.2 the sequence in \(\textrm{Cone} \,(\mathscr {P})\)

$$\begin{aligned} \{(\frac{1}{n}[\delta _{1/n} - \delta _0],\delta _{1/2}): n = 1,2,...\} \end{aligned}$$

converges to \((0,\delta _{1/2})\).
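Both convergence claims can be confirmed by scaling the \(n\)-th process of each example by \(\frac{1}{n}\) and testing against arbitrary continuous functions \(\varphi \) and \(\psi \) on \(\Sigma \). This is a sketch, with \(\varphi \) and \(\psi \) introduced here for illustration and with convergence assumed to be componentwise in the weak-star sense. For Example 4.1,

$$\begin{aligned} \tfrac{1}{n}\,(0,\,n\delta _1 - \delta _0) = (0,\,\delta _1 - \tfrac{1}{n}\delta _0), \qquad \int _{\Sigma }\psi \, d(\delta _1 - \tfrac{1}{n}\delta _0) = \psi (1) - \tfrac{1}{n}\psi (0) \ \rightarrow \ \psi (1) = \int _{\Sigma }\psi \, d\delta _1. \end{aligned}$$

For Example 4.2 the second component is \(\delta _{1/2}\) for every \(n\), while the first component satisfies

$$\begin{aligned} \left| \int _{\Sigma }\varphi \, d\left( \tfrac{1}{n}[\delta _{1/n} - \delta _0]\right) \right| = \tfrac{1}{n}\,\left| \varphi (1/n) - \varphi (0)\right| \ \leqq \ \tfrac{2}{n}\,\max _{\Sigma }|\varphi | \ \rightarrow \ 0. \end{aligned}$$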

For these reasons, our preferred codification of the Kelvin–Planck Second Law will take the form (4.7) rather than (4.2). Note that if \(\mathscr {P}\) is itself a cone then there is no difference between (4.7) and (4.2). Recall that in Definition 3.3 (the definition of a thermodynamical theory \((\Sigma ,\mathscr {P})\)) we let

$$\begin{aligned} {\hat{\mathscr {P}}}:= \text {cl}\,(\textrm{Cone} \,(\mathscr {P})). \end{aligned}$$
(4.8)

Definition 4.4

A Kelvin–Planck theory is a thermodynamical theory \((\Sigma ,\mathscr {P})\) such that

$$\begin{aligned} {\hat{\mathscr {P}}}\ \cap \ (0,\mathscr {M}_{+}(\Sigma )) = (0,0). \end{aligned}$$
(4.9)

5 Hahn–Banach Equivalence of the Kelvin–Planck Second Law and the Existence of Entropy-Temperature Functions of State

The following theorem asserts that, for a thermodynamical theory, compliance with the Kelvin–Planck Second Law is equivalent to the existence of two continuous functions of state, a specific-entropy function and a thermodynamic temperature scale that, taken together, satisfy the Clausius–Duhem inequality for all processes the theory contains. Entropy and thermodynamic temperature emerge simultaneously and almost immediately as a direct consequence of the Hahn–Banach theorem. There is no reliance at all on venerable thermodynamic conceptual machinery in the form of reversible processes, Carnot cycles, heat baths, or even the idea of equilibrium.

In the theorem statement \(\text {C}(\Sigma ,\mathbb {R})\) denotes the set of real-valued continuous functions on \(\Sigma \), and \(\text {C}(\Sigma ,\mathbb {R}_+)\) is the set of positive-valued continuous functions. \(\mathbb {R}_+\) denotes the set of strictly positive real numbers.

Theorem 5.1

(Existence of Entropy and Thermodynamic Temperature) For a thermodynamical theory \((\Sigma ,\mathscr {P})\) the following are equivalent:

(i) \((\Sigma ,\mathscr {P})\) is a Kelvin–Planck theory.

(ii) There exist functions \(\eta \in \text {C}(\Sigma ,\mathbb {R})\) and \(T \in \text {C}(\Sigma ,\mathbb {R}_+)\) such that

$$\begin{aligned} \int _{\Sigma }\eta \, d(\Delta \mathcal {m})\ \geqq \ \int _{\Sigma }\frac{d\mathcal {q}}{T}, \quad \forall \ (\Delta \mathcal {m},\mathcal {q}) \in \mathscr {P}. \end{aligned}$$
(5.1)

The proof of Theorem 5.1 will make use of some fairly straightforward adaptations of ideas (see, for example, [4]) in functional analysis that were unavailable to the thermodynamics pioneers: First, \(\mathscr {V}(\Sigma )\) is a locally convex Hausdorff topological vector space. Second, the compactness of \(\Sigma \) ensures that the convex set

$$\begin{aligned} (0,\mathscr {M}_{+}^1(\Sigma )):= \{(0,\mathcal {v}) \in \mathscr {V}(\Sigma ): \mathcal {v}\in \mathscr {M}_{+}(\Sigma ), \mathcal {v}(\Sigma ) = 1\} \end{aligned}$$
(5.2)

is (weak-star) compact. Finally, if \(f:\mathscr {V}(\Sigma )\rightarrow \mathbb {R}\) is a continuous linear function, then there exist functions \(\alpha (\cdot )\) and \(\beta (\cdot )\) in \(\text {C}(\Sigma ,\mathbb {R})\) such that, for every \((\mathcal {v},\mathcal {w}) \in \mathscr {V}(\Sigma )\),

$$\begin{aligned} f(\mathcal {v},\mathcal {w}) = \int _{\Sigma }\alpha \, d\mathcal {v}+ \int _{\Sigma }\beta \, d\mathcal {w}. \end{aligned}$$
(5.3)

What follows is the version of the Hahn–Banach theorem that underlies almost all theorems in this article and its companion article, [11].

Theorem 5.2

(Hahn–Banach) Let V be a Hausdorff locally convex topological vector space, and let A and B be non-empty disjoint closed convex subsets of V, with B compact. There is a continuous linear function \(f: V \rightarrow \, \mathbb {R}\) and a number \(\gamma \in \mathbb {R}\) such that

$$\begin{aligned} f(a)\ <\ \gamma ,\ \forall \ a \in A \end{aligned}$$

and

$$\begin{aligned} f(b)\ >\ \gamma ,\ \forall \ b \in B. \end{aligned}$$

In particular, if A is a cone, then

$$\begin{aligned} f(a)\ \leqq \ 0,\ \forall \ a \in A \end{aligned}$$

and

$$\begin{aligned} f(b)\ >\ 0,\ \forall \ b \in B. \end{aligned}$$

Remark 5.3

For proofs of this version of the Hahn–Banach theorem see Theorem 21.12 in [4], Theorem 1.7 in [2], or Corollary 14.4 in [18]. The last sentence of Theorem 5.2 is not usually stated explicitly, but it is an easy consequence of the preceding one.
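For completeness, here is a sketch of why the last sentence of Theorem 5.2 follows from the preceding one. We assume, as in (4.6), that a cone is a set closed under multiplication by non-negative scalars, so that a non-empty cone A contains 0. Then

$$\begin{aligned} 0 = f(0) < \gamma \quad \Longrightarrow \quad f(b) > \gamma > 0, \quad \forall \ b \in B. \end{aligned}$$

Moreover, if there were \(a_0 \in A\) with \(f(a_0) > 0\), then \(\alpha a_0 \in A\) for every \(\alpha \geqq 0\), and

$$\begin{aligned} f(\alpha a_0) = \alpha f(a_0) \ \geqq \ \gamma \quad \text {whenever}\quad \alpha \ \geqq \ \gamma / f(a_0), \end{aligned}$$

contradicting \(f(a) < \gamma \) on A. Hence \(f(a) \leqq 0\) for all \(a \in A\).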

We are now in a position to prove Theorem 5.1, the central theorem of this article.

Proof of Theorem 5.1

To prove that (i) implies (ii) we first note for the Kelvin–Planck theory \((\Sigma ,\mathscr {P})\) that, in the Hausdorff locally convex topological vector space \(\mathscr {V}(\Sigma )\), the closed convex cone \({\hat{\mathscr {P}}}\) is disjoint from the convex compact set \((0,\mathscr {M}_{+}^1(\Sigma ))\). From the Hahn–Banach theorem, then, there is a continuous linear function \(f: \mathscr {V}(\Sigma )\rightarrow \mathbb {R}\) such that

$$\begin{aligned} f\,(\Delta \mathcal {m},\mathcal {q})\,\leqq \,0, \quad \forall \,(\Delta \mathcal {m},\mathcal {q})\,\in \,{\hat{\mathscr {P}}} \end{aligned}$$
(5.4)

and

$$\begin{aligned} f(0,\mathcal {w})\, > \, 0, \quad \forall \,(0,\mathcal {w})\,\in \,(0,\mathscr {M}_{+}^1(\Sigma )). \end{aligned}$$
(5.5)

Moreover, there are functions \(\eta \,(\cdot )\) and \(\beta \,(\cdot )\) in \(\text {C}(\Sigma ,\mathbb {R})\) such that \(f(\cdot ,\cdot )\) has the representation (see Footnote 15)

$$\begin{aligned} f(\mathcal {v},\mathcal {w}) = \int _{\Sigma }(-\eta )\,d\mathcal {v}+ \int _{\Sigma }\beta \,d\mathcal {w},\quad \forall (\mathcal {v},\mathcal {w}) \in \mathscr {V}(\Sigma ). \end{aligned}$$
(5.6)

Note that for each \(\sigma \in \Sigma \) the Dirac measure \(\delta _{\sigma }\) is a member of \(\mathscr {M}_{+}^1(\Sigma )\). From this, (5.5), and (5.6) it follows that \(\beta (\cdot )\) takes strictly positive values. Letting \(T(\cdot ) = 1/\beta (\cdot )\), we get (5.1) as a consequence of (5.4) and (5.6). This completes the proof that (i) implies (ii).
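The two steps just described can be written out explicitly. Evaluating (5.6) at \((0,\delta _{\sigma })\) and using (5.5) gives, for each \(\sigma \in \Sigma \),

$$\begin{aligned} \beta (\sigma ) = \int _{\Sigma }\beta \, d\delta _{\sigma } = f(0,\delta _{\sigma }) > 0, \end{aligned}$$

so that \(T(\cdot ) = 1/\beta (\cdot )\) is continuous and strictly positive. Then, because \(\mathscr {P}\subset {\hat{\mathscr {P}}}\), (5.4) and (5.6) give, for every \((\Delta \mathcal {m},\mathcal {q}) \in \mathscr {P}\),

$$\begin{aligned} 0 \ \geqq \ f(\Delta \mathcal {m},\mathcal {q}) = -\int _{\Sigma }\eta \, d(\Delta \mathcal {m}) + \int _{\Sigma }\frac{d\mathcal {q}}{T}, \end{aligned}$$

which is (5.1) after rearrangement.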

To prove that (ii) implies (i) we first observe that if the inequality (5.1) is satisfied for a particular \((\Delta \mathcal {m},\mathcal {q})\in \mathscr {P}\), then the inequality is also satisfied by \(\alpha (\Delta \mathcal {m},\mathcal {q})\) for every non-negative number \(\alpha \). For this reason, (ii) implies that the inequality

$$\begin{aligned} \int _{\Sigma }\eta \, d(\mathcal {v})\ \geqq \ \int _{\Sigma }\frac{d\mathcal {w}}{T} \end{aligned}$$
(5.7)

is satisfied for all \((\mathcal {v},\mathcal {w})\) in \(\textrm{Cone} \,(\mathscr {P})\) and therefore for all \((\mathcal {v},\mathcal {w})\) in \({\hat{\mathscr {P}}}:= \text {cl}\,(\textrm{Cone} \,(\mathscr {P}))\). To show that \((\Sigma ,\mathscr {P})\) is a Kelvin–Planck theory we must show that \({\hat{\mathscr {P}}}\) can contain no member of the form \((0,\mathcal {w})\), where \(\mathcal {w}\) is a nonzero member of \(\mathscr {M}_{+}(\Sigma )\). Because \(T(\cdot )\) is positive-valued, such an element could not satisfy (5.7). This completes the proof of Theorem 5.1. \(\square \)
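The two passages in this half of the proof that are stated without detail can be filled in as follows. This is a sketch, and the symbol \(T_{\max }\) is introduced here only for illustration. First, since \(\eta \) and \(1/T\) are continuous, both sides of (5.7) are continuous linear functions of \((\mathcal {v},\mathcal {w})\) of the kind appearing in (5.3), so the inequality persists on passage from \(\textrm{Cone} \,(\mathscr {P})\) to its closure. Second, because \(\Sigma \) is compact and \(T\) is continuous and positive, \(T_{\max }:= \max _{\Sigma }T\) is finite and positive. For a nonzero \(\mathcal {w}\in \mathscr {M}_{+}(\Sigma )\) we have \(\mathcal {w}(\Sigma ) > 0\), whereupon

$$\begin{aligned} \int _{\Sigma }\frac{d\mathcal {w}}{T} \ \geqq \ \frac{\mathcal {w}(\Sigma )}{T_{\max }} \ > \ 0 = \int _{\Sigma }\eta \, d(0). \end{aligned}$$

Thus (5.7) fails at \((0,\mathcal {w})\), so \((0,\mathcal {w})\) cannot belong to \({\hat{\mathscr {P}}}\).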

Remark 5.4

(Interpretation of (ii)) In Theorem 5.1 (ii) we will, of course, regard (5.1) to be an expression of the Clausius–Duhem inequality, with \(\eta (\cdot )\) and \(T(\cdot )\) playing the roles of specific-entropy (entropy per mass) and thermodynamic temperature functions of state that assign to each \(\sigma \in \Sigma \) a specific-entropy \(\eta (\sigma )\) and a value \(T(\sigma )\) of the thermodynamic temperature.

If, for a physical process, \(\mathcal {m}_i\) and \(\mathcal {m}_f\) are the initial and final conditions of the body suffering the process then, with \(\Delta \mathcal {m}=\mathcal {m}_f - \mathcal {m}_i\), we have

$$\begin{aligned} \int _{\Sigma }\eta \, d(\Delta \mathcal {m}) = \int _{\Sigma }\eta \, d \mathcal {m}_f -\int _{\Sigma }\eta \, d \mathcal {m}_i. \end{aligned}$$
(5.8)

In view of (5.8) we can interpret the integral on the left side of (5.1) to be the difference in the entropy of the body suffering the process between the end of the process and its beginning.

In this sense, Theorem 5.1 tells us that for any Kelvin–Planck theory, there is a notion of the entropy of a body (along with a thermodynamic temperature scale) that aligns with the Gibbs version (1.1) of the Clausius–Duhem inequality with which we began. Note, however, that Theorem 5.1 does much more, for it provides, in the spirit of modern classical physics, a local notion of specific entropy (entropy per mass), as a function of the local state within a body.

If a particular process \((\Delta \mathcal {m},\mathcal {q})\) derives from the data specified in the example of Sect. 3.2.3, the inequality in (ii) can be pulled back to a more traditional description of the Clausius–Duhem inequality, in effect an elaboration of the Gibbs version (1.1) suited to modern continuum physics:

$$\begin{aligned} \int _{\mathscr {B}}\eta \,({\hat{\sigma }}_f(X))\,d\mu (X) - \int _{\mathscr {B}}\eta \,({\hat{\sigma }}_i(X))\,d\mu (X) \geqq \int _{\mathscr {B}\times \mathscr {I}}\frac{d\,h(X,t)}{T({\hat{\sigma }}(X,t))}. \end{aligned}$$
(5.9)

Connections of entropy (with existence derived via [10]) to the theory of partial differential equations (in particular the canonical equations of continuum physics) are discussed by Evans [6].
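As a small illustration of the equivalence in Theorem 5.1, consider again the toy theory of Example 4.1, which Remark 4.3 shows is not a Kelvin–Planck theory. One can check directly, as a sketch, that no pair \((\eta ,T)\) of the kind described in (ii) can exist for it. Were there such a pair, then (5.1) applied to the cyclic processes \((0,n\delta _1 - \delta _0)\) would require

$$\begin{aligned} 0 \ \geqq \ \int _{\Sigma }\frac{d(n\delta _1 - \delta _0)}{T} = \frac{n}{T(1)} - \frac{1}{T(0)}, \quad \text {that is,} \quad T(1) \ \geqq \ n\,T(0), \quad n = 1,2,\ldots \end{aligned}$$

Since \(T(0) > 0\), this cannot hold for all n with \(T(1)\) finite, in agreement with the theorem.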

In preparation for our concluding remarks and for the companion article [11], we record the following definition:

Definition 5.5

(Entropy, Thermodynamic Temperature) Let \((\Sigma ,\mathscr {P})\) be a Kelvin–Planck theory. An element \((\eta ,T)\) of \(\text {C}(\Sigma ,\mathbb {R}) \times \text {C}(\Sigma ,\mathbb {R}_+)\) that satisfies (5.1) is a Clausius–Duhem pair for the theory. A function \(T \in \text {C}(\Sigma ,\mathbb {R}_+)\) is a Clausius–Duhem temperature scale for the theory if there exists \(\eta \in \text {C}(\Sigma ,\mathbb {R})\) such that \((\eta ,T)\) is a Clausius–Duhem pair. In that case, \(\eta (\cdot )\) is a specific-entropy function for the theory (corresponding to the Clausius–Duhem temperature scale \(T(\cdot ))\).

Remark 5.6

(Differentiability of the specific-entropy function and the thermodynamic temperature scale) In applications of the Clausius–Duhem inequality, differentiability of the entropy and temperature with respect to state descriptors often plays a role. Here we focused solely on continuity of these functions. When, for a thermodynamical theory \((\Sigma ,\mathscr {P})\), the state space \(\Sigma \) is such that differentiability of real-valued functions on \(\Sigma \) has meaning, Theorem 5.1 remains true with \(\text {C}(\Sigma ,\mathbb {R})\) replaced by \(\text {C}^{\,k}(\Sigma ,\mathbb {R})\), so long as the same replacement is made in the definition of the topology on \(\mathscr {M}(\Sigma )\), given in footnote 14. That revised topology, which is coarser than the weak-star topology, exerts itself in the definition of \({\hat{\mathscr {P}}}:= \text {cl}\,(\textrm{Cone} \,(\mathscr {P}))\). This is discussed more fully, but in a narrower context, in Remark 10.2 of [9]. Similar considerations apply to the theorems of the companion article [11].

6 Concluding Remarks

In any thermodynamical theory that complies with the Kelvin–Planck Second Law, as expressed by (4.9), Theorem 5.1 asserts that there are invariably specific-entropy and thermodynamic-temperature functions (of the local material state) that together satisfy the Clausius–Duhem condition (5.1). Moreover, the two conditions are equivalent, so any theory for which there is a Clausius–Duhem entropy-temperature pair must comply with the form of the Kelvin–Planck Second Law given by (4.9).

Again, the proof that (i) implies (ii) is immediate. It relies only on the Hahn–Banach Theorem and functional analysis infrastructure unavailable to the brilliant founders of classical thermodynamics. It is worth emphasizing again that, with respect to the existence of Clausius–Duhem entropy-temperature pairs, there is no reliance on reversible processes or notions of thermodynamic equilibrium. There is no requirement that the set of processes contain certain ones of a specified kind. To some extent this will change in the companion article [11], where we consider properties (including uniqueness) of specific-entropy and thermodynamic-temperature functions of state, in particular the relation of those properties to the supply of processes.