1 Introduction

The total thermodynamic entropy S, in equilibrium, must be Lorentz invariant. Every statistical mechanical view on thermodynamics agrees on this point. Whether we identify S with the Boltzmann entropy [1, 2], or with the Gibbs/Shannon entropy [3, 4], or with the von Neumann entropy [5, 6], its Lorentz invariance seems inescapable. This fact is also a foundational feature of relativistic fluid dynamics [7, 8] and of thermal quantum field theory [9].

Intuitively, the invariance of the entropy with respect to Lorentz transformations is usually justified by invoking its statistical connection with microscopic probabilities (or numbers of quantum states), which are supposed to have an invariant nature [10, 11]. However, when it comes to proving rigorously, from first principles, that the thermodynamic entropy (namely, the macroscopic state function which is subject to the second law) must necessarily be a scalar, some conceptual problems arise and it is easy to fall into circular reasoning.

The thermodynamic argument for the Lorentz invariance of the entropy that is often repeated in the literature [12, 13] is an oversimplified version of an argument originally proposed by Planck [14]. Consider the following thought experiment: a body X is accelerated from being at rest with respect to an observer A to being at rest with respect to an observer B (in motion with respect to A). If the process is adiabatic, it is reversible, hence the entropy of X measured by A is the same before and after the acceleration: \(S_{i}(A)=S_f(A)\). Now let us assume that during this process the rest-frame properties of the body do not change (hence we may call this process a pure acceleration). It follows that the initial state, as seen by A, is identical to the final state, as seen by B, which implies \(S_i(A)=S_f(B)\) (recall that the entropy is a state function). Thus, A and B agree on the value of the entropy at the end of the process, \(S_f(A)=S_f(B)\), proving the Lorentz invariance of the entropy.

The problem with this argument is that what determines whether a process is reversible or not is the difference in entropy between the initial and the final state (if \(\delta S=0\), the process is reversible). Hence, assuming that pure accelerations are reversible is equivalent to assuming that the entropy does not depend on the velocity of the body, which is exactly what we are trying to prove. To the best of our knowledge, the first author to note this circularity problem was van Kampen [15], who elevated the existence of reversible pure accelerations to the rank of fundamental postulate of relativistic thermodynamics. He showed that no entirely thermodynamic argument can be used to prove ab initio the Lorentz invariance of the entropy, but that, to set the foundations of covariant thermodynamics rigorously (and to avoid any circularity issue), one only needs to postulate that pure accelerations are reversible.

The goal of the present paper is to explore the validity of van Kampen’s postulate in greater detail. In fact, from an operational point of view, the postulate can be rephrased as follows: adiabatic accelerations (i.e. slow variations of velocity generated by weak mechanical forces) do not alter the rest-frame properties of a body; in particular, they do not affect its rest mass. Given that this is a simple statement about the behaviour of many-particle systems subject to external forces, it should be possible to test it using relativistic dynamics and quantum field theory.

We remark that the purpose of this paper is not to convince the reader that the entropy is Lorentz invariant; this is already a well-established fact [16]. Instead, the aim is to explain why this is the only possibility and to prove that any alternative construction of relativistic thermodynamics would lead to serious inconsistencies.

Throughout the paper we adopt the signature \((-,+,+,+)\) and work with natural units \(c=k_B=\hbar =1\). The study is performed within special relativity, hence the metric is assumed flat. Greek space-time indices \(\mu ,\nu ,\rho\) run from 0 to 3, while Latin space indices \(j,k\) run from 1 to 3.

2 The Rationale of the Argument

If we want to make our argument solid and unquestionable, we need, first, to understand which assumptions about relativistic thermodynamics we are reasonably allowed to uphold, and which might lead us to circular reasoning.

2.1 Must the Laws of Thermodynamics be the Same in Every Reference Frame?

It is possible to formulate many arguments for the Lorentz invariance of the entropy, based on the assumption that the laws of thermodynamics should be the same in every reference frame. A well-known example is Planck’s original argument (which is more refined than the version reported in the introduction), of which we present a slightly more formal version in appendix A. The rationale of Planck’s argument [14], and of most of the other thermodynamic arguments present in the literature, is that the entropy is ultimately a rule, which dictates which processes are possible (for thermally isolated systems) and which are not. For example, if a macroscopic state \(\psi\) has a lower entropy than a macroscopic state \(\psi '\), this means that, if we keep the system thermally isolated, the process \(\psi \rightarrow \psi '\) is possible, while the inverse process is not. Clearly, statements about the possibility for a process to occur cannot depend on the reference frame, hence the entropy must be Lorentz invariant.

The problem with these arguments is that they all treat thermodynamics as a fundamental theory, which should be subject to the principle of relativity in the same way as dynamics is, and whose laws should, therefore, be equally valid in every reference frame. In other words, it is assumed in these arguments that thermodynamics should share the same symmetries as dynamics. However, we already know that there is at least one symmetry for which this is not true: CPT. While CPT is a fundamental symmetry in quantum field theory [17], it is manifestly violated by the second law of thermodynamics. This shows us that we are in general not allowed to treat thermodynamics on the same footing as dynamics.

The fundamental distinction between dynamics and thermodynamics is that dynamics studies the evolution of systems with arbitrary initial conditions, which implies that the solutions of the equations which govern dynamics form a set \(\varLambda\) that is necessarily invariant under the action of the symmetry group \({\mathcal {G}}\) of the spacetime (\({\mathcal {G}}\varLambda = \varLambda\)). On the other hand, thermodynamics deals only with a subset \(\lambda \subset \varLambda\) of solutions, whose initial conditions have precisely those statistical properties (e.g. molecular chaos, see [18]) which give rise to the second law as an emergent quality. It might be the case (and for CPT it is the case!) that these constraints on the initial conditions lead to a symmetry breaking, namely to a situation in which \({\mathcal {G}} \lambda \ne \lambda\). Considering that specifying the laws of thermodynamics is essentially equivalent to specifying \(\lambda\), it follows that thermodynamics might in turn not be symmetric under \({\mathcal {G}}\).

Let us remark that we are not claiming that the laws of thermodynamics are not Lorentz covariant. They are. But (as we will show in subsection 2.3) their covariance follows from the invariance of the entropy, and not vice-versa. Thus, in a paper whose goal is to prove the invariance of the entropy, we are not allowed to include the assumption that thermodynamics is the same in every reference frame among the hypotheses.

As a last comment on this issue, we point out that, if one adopts Jaynes’ statistical justification for the second law [3], then the initial conditions that give rise to \(\lambda\) are actually the overwhelming majority of initial conditions which are compatible with the initial macroscopic data (in the thermodynamic limit). Hence, it is to be expected that, if the group \({\mathcal {G}}\) conserves the causal ordering of the events (namely if it does not convert initial states into final states), then \(\lambda\) should be approximately invariant under \({\mathcal {G}}\). This would explain why thermodynamics is not invariant under CPT (namely \(\text {CPT} \lambda \ne \lambda\)) while it is expected to be invariant under the proper orthochronous Lorentz group (\(SO^+(3,1) \lambda = \lambda\)). In fact, CPT converts initial data into final data, whereas \(SO^+(3,1)\) conserves the causal structure of the field equations by construction [19]. This is the actual statistical justification for the covariance of thermodynamics, because it is not grounded on the interpretation that one chooses to give to the entropy, but on the statistical origin of irreversibility, which constitutes the very foundation of thermodynamics. However, as this argument is qualitative, and thermodynamics does not entirely reduce to Jaynes’ view [20, 21], it is important to have also a more formal proof, which is the purpose of the present paper.

2.2 The Assumptions of the Argument

Motivated by the complication outlined in the previous subsection, we need to make an argument for the Lorentz invariance of the entropy which does not build on the assumption that the second law of thermodynamics is valid for every observer. Instead, we will base our argument on only two uncontroversial assumptions, namely

  (i) There is a global inertial reference frame A in which it is possible to unambiguously define a notion of entropy S that obeys the second law: \({\dot{S}} \ge 0\). In this reference frame, bodies may interact with each other, accelerate, decelerate and be destroyed, but the total entropy of isolated systems can never decrease.

  (ii) The microscopic dynamics can be modelled using a field theory, governed by a Lorentz invariant Lagrangian density.

Assumption (i) is simply the requirement that there is at least one observer for which the laws of thermodynamics, in their standard “textbook” formulation, are valid. Assumption (ii) is the statement that, although thermodynamics might in principle not admit a covariant formulation, dynamics does. We are enforcing the principle of relativity on the underlying microscopic theory, rather than imposing it directly on thermodynamics.

Throughout the rest of this paper, we will always work in the reference frame A introduced in assumption (i), so that thermodynamics works as usual. In this way we will avoid any possible source of confusion.

2.3 Van Kampen’s Argument

Let us now briefly revisit van Kampen’s argument for the Lorentz invariance of the entropy [15].

We consider an isolated (freely moving) body in thermodynamic equilibrium with total four-momentum \(p^\nu\) and rest mass \(M=\sqrt{-p^\nu p_\nu }\). The entropy in equilibrium must be a function of the constants of motion of the body. To capture the essence of the problem, we assume for simplicity that the only relevant constants of motion are the components of the four-momentum (see footnote 1), so that

$$\begin{aligned} S=S(p^\nu ). \end{aligned}$$
(1)

At this stage, the function \(S(p^\nu )\) may be completely arbitrary, because (as we anticipated) we are not excluding a priori the possibility that thermodynamics may break Lorentz covariance. Similarly to what we did in the introduction, let us postulate that it is possible to make infinitesimal reversible pure accelerations, namely transformations \(\delta p^\nu\) such that

$$\begin{aligned} \delta S = \dfrac{\partial S}{\partial p^\nu } \delta p^\nu =0 \quad \quad \quad (\text {reversible}) \end{aligned}$$
(2)

and

$$\begin{aligned} \delta M = -\dfrac{p_\nu }{M} \delta p^\nu =0 \quad \quad \quad (\text {pure acceleration}). \end{aligned}$$
(3)

If these accelerations can have arbitrary direction (i.e. if those \(\delta p^\nu\) that satisfy (2) and (3) form a 3D plane), then it follows that there is a function T such that

$$\begin{aligned} dS = \dfrac{dM}{T}, \end{aligned}$$
(4)

which in turn implies

$$\begin{aligned} S=S(M). \end{aligned}$$
(5)
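The step from (2) and (3) to (5) can be illustrated numerically. The following is a minimal sketch, not part of the original argument: the two entropy functions `S_scalar` and `S_frame`, their functional forms and the sample four-momentum are all hypothetical choices made only for illustration. It checks that an entropy depending on \(p^\nu\) only through M is stationary along a mass-preserving variation \(\delta p^\nu\), whereas an entropy depending on the energy \(p^0\) measured in frame A is not.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # metric signature (-,+,+,+)


def lower(p):
    """Lower the index of a four-vector with the flat metric."""
    return [e * x for e, x in zip(ETA, p)]


def dot(p, q):
    """Minkowski product p_nu q^nu."""
    return sum(a * b for a, b in zip(lower(p), q))


def mass(p):
    """Rest mass M = sqrt(-p^nu p_nu)."""
    return math.sqrt(-dot(p, p))


def pure_acceleration(p, k):
    """Remove from k its component along p, so that p_nu dp^nu = 0 (eq. 3)."""
    coeff = dot(p, k) / dot(p, p)
    return [ki - coeff * pi for ki, pi in zip(k, p)]


def directional_derivative(f, p, dp, h=1e-6):
    """Central-difference estimate of (dS/dp^nu) dp^nu, as in eq. (2)."""
    plus = [x + h * d for x, d in zip(p, dp)]
    minus = [x - h * d for x, d in zip(p, dp)]
    return (f(plus) - f(minus)) / (2.0 * h)


def S_scalar(p):
    """Hypothetical entropy that depends on p only through M."""
    return 3.0 * math.log(mass(p))


def S_frame(p):
    """Hypothetical entropy that depends on the energy p^0 in frame A."""
    return math.log(p[0])


p = [2.0, 0.6, 0.0, 0.8]  # sample four-momentum with M^2 = 3
dp = pure_acceleration(p, [0.0, 1.0, 0.0, 0.0])
dS_scalar = directional_derivative(S_scalar, p, dp)
dS_frame = directional_derivative(S_frame, p, dp)
print("p_nu dp^nu =", dot(p, dp))
print("dS for S(M):  ", dS_scalar)
print("dS for S(p^0):", dS_frame)
```

In this toy setting only the scalar choice is compatible with reversible pure accelerations in arbitrary directions, which is the content of the implication above.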

The fact that the entropy can be written as a function of a Lorentz scalar implies that, when we perturb the system, the second law of thermodynamics (\({\dot{S}} \ge 0\)) takes the form of a Lorentz-invariant statement:

$$\begin{aligned} \dfrac{{\dot{M}}}{T(M)} \ge 0. \end{aligned}$$
(6)

But this implies that the set \(\lambda\) of all the initial conditions which realise the second law is invariant under the action of the proper orthochronous Lorentz group (formally, \(SO^+(3,1) \, \lambda = \lambda\)), proving that thermodynamics admits a covariant formulation, in which S is a Lorentz scalar. This sets solid foundations for relativistic thermodynamics.

Our goal, now, is to prove that, if assumptions (i) and (ii), as stated in the previous subsection, are valid, a set of infinitesimal transformations that satisfy both (2) and (3) always exist (at least in principle), converting van Kampen’s postulate into a theorem.

3 Reversible Accelerations

Our first task is to understand how we may induce an ideal reversible acceleration on a body. Following [23], the most perfect form of reversible process is an adiabatic process, namely an infinitely slow transformation in which the system is kept thermally isolated. Such processes can be modelled, at the microscopic level, as transformations induced by a weak and slow time-dependence of the microscopic Hamiltonian. Our aim is to design an adiabatic transformation which can alter the state of motion of a relativistic body.

3.1 Small Kicks

Let \(\varphi _i\) be the microscopic fields of the body and \({\mathcal {L}}_{\text {Body}}(\varphi _i,\partial _\mu \varphi _i)\) the Lagrangian density governing the microscopic dynamics. Assume that we are able to generate and control an external potential \(\phi\) (a real scalar field, for simplicity), which interacts with the body through a small dimensionless coupling constant \(\epsilon\), so that the action takes the simple form

$$\begin{aligned} {\mathcal {I}}[\varphi _i] = \int \bigg [ {\mathcal {L}}_{\text {Body}}(\varphi _i,\partial _\mu \varphi _i) + \epsilon \, \phi \, G(\varphi _i) \bigg ] \, d^4 x \, , \end{aligned}$$
(7)

where \(G(\varphi _i)\) is an observable. The potential \(\phi\) is an assigned real function of the coordinates \(\phi (x^\nu )\). It is not a dynamical degree of freedom of the total system (“\(\, \text {body}+\phi \,\)”), but it plays the role of a source in the action \({\mathcal {I}}[\varphi _i]\), which breaks the Poincaré invariance of the theory. In a quantum description, the field \(\phi\) plays the role of a classical source [19]; it is not a quantum field. We model \(\phi\) in this way because we want to treat it as a purely mechanical and non-statistical entity (like any other source of thermodynamic work, see e.g. [22]), so its evolution must be completely known and cannot be affected by the statistical fluctuations of the dynamical fields \(\varphi _i\). In this sense, the potential \(\phi\) may be seen as an analogue of the perfectly reflecting walls of an adiabatic box: it carries no entropy. This implies that the body remains thermally isolated [23] and the second law of thermodynamics holds for the entropy of the body alone [3], also during its interaction with \(\phi\).

Assume that \(\phi =0\) for \(t\le 0\) (recall that we always work, for clarity, in the reference frame A in which we have a notion of entropy). The configuration of the system for \(t\le 0\) is the initial state of the body, which is assumed to be an equilibrium state, with four-momentum \(p^\nu\). At \(t= 0\) we switch on the external potential and we keep it active for a finite time \(\tau\), namely

$$\begin{aligned} \phi \ne 0 \quad \text { for} \quad t \in (0,\tau ). \end{aligned}$$
(8)

No assumption about the duration \(\tau\) of the process, nor about the exact space-time dependence of \(\phi (x^\nu )\), is made. We only require that there is at least a small region of space-time (between the times 0 and \(\tau\)) in which

$$\begin{aligned} (\partial _1 \phi )^2 + (\partial _2 \phi )^2 + (\partial _3 \phi )^2 >0 \, , \end{aligned}$$
(9)

so that we know that the action (7) is not invariant under space translations, breaking the Noether conservation of linear momentum of the body. At the end of the process (\(t=\tau\)), the four-momentum of the body has changed by a finite amount \(\delta p^\nu\). After some more time passes, the system can reach a new state of equilibrium, whose entropy is \(S(p^\nu + \delta p^\nu )\). The total variation of entropy experienced by the system during all this process (including the final relaxation to a new equilibrium) is the finite difference

$$\begin{aligned} \delta S = S(p^\nu +\delta p^\nu ) - S(p^\nu ). \end{aligned}$$
(10)

The aforementioned process may be interpreted as a small kick generated by an ideal mechanical device:

  • For \(t\le 0\) the body is completely isolated and in thermodynamic equilibrium. It moves freely across space-time, with initial mass \(M=\sqrt{-p^\nu p_\nu }\) and center-of-mass four-velocity \(u^\nu =p^\nu /M\). It is in the maximum entropy state possible (as measured in the frame A) compatible with this value of four-momentum.

  • For \(0< t < \tau\) the body interacts with a mechanical device with no microscopic degrees of freedom (zero entropy). The interaction is mediated by a potential \(\phi\), which is generated solely by the device (and therefore carries no entropy). Through this interaction, the body feels a force, which impresses on it a small kick, changing its total four-momentum by an amount \(\delta p^\nu\). This amount of energy and momentum is transferred through \(\phi\) to the device, which is however not explicitly modelled here.

  • For \(t \ge \tau\) the body is again completely isolated and has time to dissipate all the fluctuations and vibrations induced by the kick, to reach a new equilibrium.

Comparing this description with subsection 6.2 of our previous paper [22], one can see that the variation of four-momentum \(\delta p^\nu\) produced in a kick has the nature of pure work (using the terminology we introduced there: \(\delta p^\nu = \delta {\mathcal {W}}^\nu\)), because the external agent can be modelled as a purely mechanical entity. Hence, kicks are the simplest form of work-type energy-momentum transfers in relativistic thermodynamics.

3.2 Infinite Infinitesimal Kicks

The key insight which leads us to a notion of adiabatic acceleration is how the changes \(\delta p^\nu\) and \(\delta S\) scale with the strength of the coupling constant \(\epsilon\), in the limit in which \(\epsilon \rightarrow 0\). We take this limit at fixed initial state of the body (for \(t \le 0\)) and keep the function \(\phi (x^\nu )\) fixed.

Since \(\epsilon\) quantifies how strongly the system reacts to the presence of the external potential \(\phi\) (\(\epsilon\) is analogous to the coupling constant q in the electrostatic force \(\mathbf{F }=q\mathbf{E }\)), it is easy to see that, to the leading order in \(\epsilon\), we have the scaling

$$\begin{aligned} \delta p^\nu \sim \epsilon \, . \end{aligned}$$
(11)

However, the variation of the entropy scales differently. In fact, the second law implies \(\delta S(\epsilon ) \ge 0\) \(\forall \, \epsilon\). On the other hand, \(\epsilon\) may have arbitrary sign (see footnote 2), which implies that if we assume \(\delta S \sim \epsilon\) we get a contradiction with the second law. Thus, the leading order must be

$$\begin{aligned} \delta S \sim \epsilon ^2 \, , \end{aligned}$$
(12)

or higher (but even).

Now, consider a sequence of N kicks (\(N \rightarrow +\infty\)) with a coupling constant \(\epsilon = 1/N \rightarrow 0\). The total variation of the four-momentum (due to the whole sequence of kicks) is

$$\begin{aligned} (\delta p^\nu )_{\text {N kicks}} \sim N \times (\delta p^\nu )_{\text {1 kick}} \sim N \times \dfrac{1}{N} =1 \, , \end{aligned}$$
(13)

while the total variation of entropy is

$$\begin{aligned} (\delta S)_{\text {N kicks}} \sim N \times (\delta S)_{\text {1 kick}} \sim N \times \dfrac{1}{N^2} = \dfrac{1}{N} \, . \end{aligned}$$
(14)

This implies that, as the number of kicks goes to infinity and their intensity goes to zero, the resulting transformation is non-trivial (\(\delta p^\nu\) is finite) and reversible (\(\delta S = 0\)). Hence, we have just built a microscopic model for a reversible acceleration. As expected, it is infinitely slow (duration \(\ge \tau \times N \rightarrow +\infty\)), so we have rediscovered the well-established fact that adiabatic transfers of energy-momentum (i.e. infinitely slow processes in which \(\delta p^\nu = \delta {\mathcal {W}}^\nu\)) are reversible (see [24], section 6, for another example). Note also that the reversibility of this transformation has been justified using only condition (i), namely the second law of thermodynamics; no other property of the entropy has been invoked.
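The bookkeeping behind (13) and (14) can be made concrete with a short numerical sketch. It assumes only the leading-order scalings (11) and (12); the order-one coefficients `a` and `b` are hypothetical placeholders, not quantities derived in the text.

```python
def accumulate_kicks(n_kicks, a=1.0, b=1.0):
    """Sum the leading-order changes over n_kicks kicks with eps = 1/n_kicks.

    a and b are hypothetical order-one coefficients: each kick is assumed to
    change the momentum by a*eps (eq. 11) and the entropy by b*eps**2 (eq. 12).
    """
    eps = 1.0 / n_kicks
    total_dp = 0.0
    total_dS = 0.0
    for _ in range(n_kicks):
        total_dp += a * eps
        total_dS += b * eps ** 2
    return total_dp, total_dS


for n in (10, 100, 1000, 10000):
    dp, dS = accumulate_kicks(n)
    print(f"N={n:6d}  total dp={dp:.6f}  total dS={dS:.6f}")
```

As N grows, the accumulated momentum change stays of order one while the accumulated entropy change falls off like 1/N, which is the sense in which the limiting process is both non-trivial and reversible.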

In order to show that this reversible process is a pure acceleration, which would prove van Kampen’s postulate, see equation (3), we only need to show from microphysics that necessarily

$$\begin{aligned} \delta M \sim \epsilon ^2 \, , \end{aligned}$$
(15)

as this would immediately imply that \((\delta M)_{\text {N kicks}} \sim 1/N \rightarrow 0\). The next two sections of the paper contain two alternative proofs of (15).

4 Variation of the Mass Induced by a Kick: Field Theory Approach

We derive equation (15) from a field theory point of view.

4.1 Classical Case

Let us define the tensor field

$$\begin{aligned} T{^\mu _\nu } = {\mathcal {L}}_{\text {Body}} \, \delta {^\mu _\nu } -\dfrac{\partial {\mathcal {L}}_{\text {Body}}}{\partial (\partial _\mu \varphi _i)} \, \partial _\nu \varphi _i \, , \end{aligned}$$
(16)

where we are applying Einstein’s summation convention also to the label i. Given that the Euler-Lagrange equations, computed from the action (7), are

$$\begin{aligned} \dfrac{\partial {\mathcal {L}}_{\text {Body}}}{\partial \varphi _i} - \partial _\mu \bigg ( \dfrac{\partial {\mathcal {L}}_{\text {Body}}}{\partial (\partial _\mu \varphi _i)} \bigg ) = -\epsilon \, \phi \, \dfrac{\partial G}{\partial \varphi _i} \, , \end{aligned}$$
(17)

one can easily show that \(T{^\mu _\nu }\) obeys the equation

$$\begin{aligned} \partial _\mu T{^\mu _\nu } = -\epsilon \, \phi \, \partial _\nu G. \end{aligned}$$
(18)
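For completeness, the intermediate steps leading to (18) can be sketched as follows, assuming that \({\mathcal {L}}_{\text {Body}}\) depends on the coordinates only through the fields and their first derivatives, and that partial derivatives commute:

$$\begin{aligned} \partial _\mu T{^\mu _\nu }&= \partial _\nu {\mathcal {L}}_{\text {Body}} - \partial _\mu \bigg ( \dfrac{\partial {\mathcal {L}}_{\text {Body}}}{\partial (\partial _\mu \varphi _i)} \bigg ) \partial _\nu \varphi _i - \dfrac{\partial {\mathcal {L}}_{\text {Body}}}{\partial (\partial _\mu \varphi _i)} \, \partial _\mu \partial _\nu \varphi _i \\&= \bigg [ \dfrac{\partial {\mathcal {L}}_{\text {Body}}}{\partial \varphi _i} - \partial _\mu \bigg ( \dfrac{\partial {\mathcal {L}}_{\text {Body}}}{\partial (\partial _\mu \varphi _i)} \bigg ) \bigg ] \partial _\nu \varphi _i \\&= -\epsilon \, \phi \, \dfrac{\partial G}{\partial \varphi _i} \, \partial _\nu \varphi _i = -\epsilon \, \phi \, \partial _\nu G \, . \end{aligned}$$

The second line uses the chain rule \(\partial _\nu {\mathcal {L}}_{\text {Body}} = (\partial {\mathcal {L}}_{\text {Body}}/\partial \varphi _i) \, \partial _\nu \varphi _i + (\partial {\mathcal {L}}_{\text {Body}}/\partial (\partial _\mu \varphi _i)) \, \partial _\nu \partial _\mu \varphi _i\), the third line uses (17), and the last equality uses the fact that G depends on the coordinates only through \(\varphi _i\).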

This implies that for \(t \le 0\) and \(t \ge \tau\), i.e. in those space-time regions in which \(\phi =0\), the tensor field \(T{^\mu _\nu }\) is conserved, namely \(\partial _\mu T{^\mu _\nu } =0\). Indeed, \(T{^\mu _\nu }\) is the Noether stress energy tensor associated with \({\mathcal {L}}_{\text {Body}}\) [17], therefore it can be used to define the four-momentum of the body before and after the kick, by means of the formulas

$$\begin{aligned} \begin{aligned}&p_\nu = \int T{^0 _\nu } \, d^3 x \quad \quad \quad \quad \quad \, \, \, \, \text {for }t \le 0 \, \text { (before the kick)} \\&p_\nu + \delta p_\nu = \int T{^0 _\nu } \, d^3 x \quad \quad \text {for }t \ge \tau \, \text { (after the kick)} \, . \\ \end{aligned} \end{aligned}$$
(19)

Recalling that the four-velocity of the center of mass is \(u^\nu =p^\nu /M\) and applying Gauss’ theorem to the spacetime region \({\mathcal {R}}=(0,\tau )\times {\mathbb {R}}^3\) (assuming that the body is finite, so that the fields are zero at infinity), one can use (18) to prove that

$$\begin{aligned} \delta M = -u^\nu \delta p_\nu = \epsilon \int _{{\mathcal {R}}} \phi \, u^\nu \partial _\nu G \, d^4 x \, . \end{aligned}$$
(20)
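The two equalities in (20) can be unpacked as follows. This is a sketch under the assumption already stated above, namely that the fields vanish at spatial infinity, so that only the time boundaries of \({\mathcal {R}}\) contribute to the surface integral:

$$\begin{aligned} \delta p_\nu = \int T{^0 _\nu } \, d^3 x \, \bigg |_{t=\tau } - \int T{^0 _\nu } \, d^3 x \, \bigg |_{t=0} = \int _{{\mathcal {R}}} \partial _\mu T{^\mu _\nu } \, d^4 x = -\epsilon \int _{{\mathcal {R}}} \phi \, \partial _\nu G \, d^4 x \, . \end{aligned}$$

Contracting with \(-u^\nu\) gives the second equality in (20). The first equality follows from varying \(M^2=-p^\nu p_\nu\):

$$\begin{aligned} 2M \delta M + (\delta M)^2 = -2 p^\nu \delta p_\nu - \delta p^\nu \delta p_\nu \quad \Longrightarrow \quad \delta M = -u^\nu \delta p_\nu + {\mathcal {O}}(\epsilon ^2) \, . \end{aligned}$$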

The second equality in equation (20) is exact, whereas the first is valid up to the first order in \(\epsilon\). In the limit of small \(\epsilon\), we may use linear response theory and model G as the sum

$$\begin{aligned} G=G_0 + \epsilon \, G_1 \, , \end{aligned}$$
(21)

where \(G_0\) is the value that the observable \(G(\varphi _i)\) would have (on the spacetime point under consideration) if no kick were impressed on the body, while \(\epsilon \, G_1\) describes the perturbation to G due to the kick. Let us focus on the function \(G_0(x^\nu )\). If no kick were impressed on the body, the body would remain in a state of thermodynamic equilibrium, and would be drifting rigidly with constant four-velocity \(u^\nu\) without experiencing any macroscopic deformation, because it would keep the equilibrium shape. This implies that statistically (i.e. once we average over the microscopic fluctuations) we must have

$$\begin{aligned} u^\nu \partial _\nu G_0=0. \end{aligned}$$
(22)

This formula can be justified with the qualitative argument above, but it can also be proved rigorously from condition (ii), see Appendix B. If we plug (21) into (20), we obtain

$$\begin{aligned} \delta M = -u^\nu \delta p_\nu = \epsilon ^2 \int _{{\mathcal {R}}} \phi \, u^\nu \partial _\nu G_1 \, d^4 x \sim \epsilon ^2 \, , \end{aligned}$$
(23)

which is what we wanted to prove (see equation (15) and recall that \(\epsilon ^2 = 1/N^2\)). In conclusion, van Kampen’s postulate is valid, the entropy is Lorentz invariant and thermodynamics admits a covariant generalization.

4.2 Quantum Case

The above calculations are essentially the same if we move to a quantum context. Equation (18) becomes an operator identity (in the Heisenberg picture), while (20) becomes a Kubo formula for the quantum statistical average \(-u_\nu \langle {\hat{p}}^\nu \rangle\). Equation (22) remains valid, if we interpret \(G_0\) as the quantum statistical average \(\langle {\hat{G}}\rangle _{\text {eq}}\), see Appendix B. No further assumption about the equilibrium density matrix needs to be invoked in the proof. For example, we do not need to assume it to be of Gibbs-like form [21], because this might point towards a von Neumann interpretation of the entropy, leading us back to circularity issues.

As a final comment, we remark that the Unruh effect [25] disappears in the limit in which the accelerations are adiabatic. In fact, with a simple order of magnitude estimate (see Appendix C.1), one can verify that

$$\begin{aligned} (\text {Unruh corrections}) \sim \dfrac{e^{-1/\epsilon }}{\epsilon } \, . \end{aligned}$$
(24)

This shows that the Unruh effect is non-perturbative in \(\epsilon\): it decays to zero faster than any finite power of \(\epsilon\).
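A quick numerical check of this statement is sketched below. It takes only the order-of-magnitude form of (24) at face value (with an implicit prefactor of one, which is an assumption) and compares it with a few powers \(\epsilon ^k\); the function names are hypothetical.

```python
import math


def unruh_like(eps):
    """Order-of-magnitude form exp(-1/eps)/eps taken from eq. (24)."""
    return math.exp(-1.0 / eps) / eps


def relative_to_power(eps, k):
    """Ratio of the Unruh-like term to the power eps**k."""
    return unruh_like(eps) / eps ** k


for k in (1, 2, 5, 10):
    ratios = [relative_to_power(eps, k) for eps in (0.1, 0.05, 0.02, 0.01)]
    print(f"k={k:2d}", ["%.3e" % r for r in ratios])
```

For every fixed k the ratio eventually decreases towards zero once \(\epsilon\) is small enough, which is what it means for the correction to be invisible at any finite order of perturbation theory.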

5 Variation of the Mass Induced by a Kick: Quantum Mechanics Approach

The proof of (15) given above, using a field theory approach, makes the role of condition (ii) manifest. However, it somewhat obscures the physical meaning of our result. Why does a small kick conserve (to the first order) the mass of a system of particles in equilibrium, while accelerating it? Why must it be that

$$\begin{aligned} \delta (p^\nu p_\nu ) \sim \epsilon ^2 \quad \quad \text {whereas} \quad \quad \delta p^\nu \sim \epsilon \, ? \end{aligned}$$
(25)

In this section, we will show, with a simple quantum mechanical argument, that (25) is a consequence of the mathematical structure of the Poincaré group. The argument is rigorously formulated within relativistic quantum mechanics [26], while the connection with quantum field theory is somewhat heuristic. This makes the argument that follows probably less conclusive than the one outlined in the previous section, but it gives a deeper insight into the dynamical origin of (25).

5.1 The Mass Spectrum of a Finite Body

For the total four-momentum \(p^\nu\) to be finite, the body must be of finite size. But a completely isolated finite body in thermodynamic equilibrium must be self-bounded [22], otherwise it would eventually break up into smaller pieces in relative motion. It is well-known from ordinary quantum mechanics that bound states of many particles have a discrete mass spectrum (as we see, for example, in nuclear and atomic physics). The intuition behind this fact is that the degrees of freedom of a many-body system decouple into center of mass degrees of freedom plus internal degrees of freedom. Since, in a bound state, the particles cannot escape the conglomerate (see footnote 3), the internal degrees of freedom (which describe essentially the relative positions between the particles) are bounded and, hence, have discrete energy eigenvalues. Recalling that the rest mass is the energy measured in the rest frame (i.e. it is the Hamiltonian of the internal degrees of freedom, see [26]), the discreteness of the mass eigenvalues follows.

Let us see the mathematical implications of the argument above. Given that the space-time translation operators \({\hat{p}}^\nu\), computed from the Lagrangian density \({\mathcal {L}}_{\text {Body}}\), commute with each other, we can take, as basis of the Hilbert space of the body, some states

$$\begin{aligned} |p^\nu ,a\rangle \, , \end{aligned}$$
(26)

satisfying the eigenvalue equations

$$\begin{aligned} {\hat{p}}^\nu |p^\nu ,a\rangle = p^\nu |p^\nu ,a\rangle \, . \end{aligned}$$
(27)

The additional quantum number a is arbitrary (it is used to break possible degeneracies) and can be taken discrete. The eigenvalues \(p^\nu\) must be continuous (they organise themselves into three-dimensional hyperboloids), due to the mathematical structure of the Poincaré group [17]. The square mass operator

$$\begin{aligned} {\hat{M}}^2 := -{\hat{p}}^\nu {\hat{p}}_\nu \end{aligned}$$
(28)

commutes with all the generators of the Poincaré group (computed from \({\mathcal {L}}_{\text {Body}}\)) and is diagonal on the basis (26), with eigenvalue equation

$$\begin{aligned} {\hat{M}}^2 |p^\nu ,a\rangle = m^2 |p^\nu ,a\rangle \quad \quad \quad m^2 =-p^\nu p_\nu \, . \end{aligned}$$
(29)

The scalar \(m>0\) can be interpreted as the mass of the state \(|p^\nu ,a\rangle\). Combining the fact that \(p^\nu\) is “3D-continuous”, with the fact that m and a are discrete, we can conclude that

$$\begin{aligned} \langle {\tilde{p}}^\nu , {\tilde{a}}|p^\nu ,a\rangle = 2 p^0 (2\pi )^3 \delta ^{(3)} ({\tilde{p}}^j - p^j) \, \delta _{{\tilde{m}} \, m} \, \delta _{{\tilde{a}} \, a} \, . \end{aligned}$$
(30)

The standard normalization factor \(2 p^0 (2\pi )^3\) guarantees that (30) is Lorentz-invariant [19].

Equation (30) is crucial for us, because it shows that we can build a normalisable (i.e. physical) state \(|\varPsi \rangle\) which is an eigenvector of the mass operator, namely

$$\begin{aligned} {\hat{M}}|\varPsi \rangle = m |\varPsi \rangle . \end{aligned}$$
(31)

However, the same is not true for the individual components \({\hat{p}}^\nu\): the physical state \(|\varPsi \rangle\) must be a wavepacket, namely a continuous superposition of eigenstates of \({\hat{p}}^\nu\). As we are going to show, this is the central difference between \({\hat{p}}^\nu {\hat{p}}_\nu\) and \({\hat{p}}^\nu\), which is responsible for the different scalings of the corresponding perturbations.

5.2 Kicking Mass Eigenstates

Due to the presence of the term \(\epsilon \, \phi \, G\) in the action (7), the operators \({\hat{p}}^\nu\) (which are computed from \({\mathcal {L}}_{\text {Body}}\)) are not conserved during the kick. As a first step, let us compute the variation of the mass of the body, induced by a kick, when the initial state \(|\varPsi \rangle\) is an eigenvector of \({\hat{M}}\), satisfying the eigenvalue equation (31).

As \(\phi (x^\nu )\) is an assigned function of the coordinates, the evolution of the body is unitary (the final state is still a pure state); this is the definition of thermal isolation [23] or, equivalently, of no heat transfer [3]. Working in the Schrödinger picture, we may call \(|\varPsi _\epsilon (\tau )\rangle\) the state of the body at the time \(\tau\) (just after the perturbation has been switched off) as a function of the coupling constant \(\epsilon\), parameterizing the intensity of the kick. Clearly, for \(\epsilon =0\), the mass is conserved (no kick has occurred), so that we may write

$$\begin{aligned} \delta M(\epsilon ) = \dfrac{\langle \varPsi _\epsilon (\tau )|{\hat{M}}|\varPsi _\epsilon (\tau )\rangle }{\langle \varPsi _\epsilon (\tau )|\varPsi _\epsilon (\tau )\rangle } -\dfrac{\langle \varPsi _0(\tau )|{\hat{M}}|\varPsi _0(\tau )\rangle }{\langle \varPsi _0(\tau )|\varPsi _0(\tau )\rangle } \, . \end{aligned}$$
(32)

Expanding this function to the first order in \(\epsilon\) we obtain

$$\begin{aligned} \delta M(\epsilon ) = \epsilon \dfrac{d}{d\epsilon } \bigg ( \dfrac{\langle \varPsi _\epsilon (\tau )|{\hat{M}}|\varPsi _\epsilon (\tau )\rangle }{\langle \varPsi _\epsilon (\tau )|\varPsi _\epsilon (\tau )\rangle } \bigg ) \bigg |_{\epsilon =0} + {\mathcal {O}}(\epsilon ^2) \, . \end{aligned}$$
(33)

If we compute the derivative in \(\epsilon\) explicitly, we get

$$\begin{aligned} \delta M(\epsilon ) = \epsilon \dfrac{\langle \varPsi '|\varDelta \rangle +\langle \varDelta |\varPsi '\rangle }{\langle \varPsi _0(\tau )|\varPsi _0(\tau )\rangle } + {\mathcal {O}}(\epsilon ^2) \, , \end{aligned}$$
(34)

with

$$\begin{aligned} \begin{aligned}&|\varPsi '\rangle = \dfrac{d |\varPsi _\epsilon (\tau )\rangle }{d \epsilon } \bigg |_{\epsilon =0} \\&|\varDelta \rangle = {\hat{M}} |\varPsi _0(\tau )\rangle - \dfrac{\langle \varPsi _0(\tau )|{\hat{M}}|\varPsi _0(\tau )\rangle }{\langle \varPsi _0(\tau )|\varPsi _0(\tau )\rangle } |\varPsi _0(\tau )\rangle \, . \\ \end{aligned} \end{aligned}$$
(35)
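The step from (33) to (34) can be spelled out as follows. As a shorthand that is not used elsewhere in the text, write \(A_\epsilon = \langle \varPsi _\epsilon (\tau )|{\hat{M}}|\varPsi _\epsilon (\tau )\rangle\) and \(N_\epsilon = \langle \varPsi _\epsilon (\tau )|\varPsi _\epsilon (\tau )\rangle\), with a prime denoting the derivative in \(\epsilon\). The quotient rule then gives

$$\begin{aligned} \begin{aligned}&\dfrac{d}{d\epsilon } \bigg ( \dfrac{A_\epsilon }{N_\epsilon } \bigg ) \bigg |_{\epsilon =0} = \dfrac{A'_0}{N_0} - \dfrac{A_0}{N_0} \, \dfrac{N'_0}{N_0} \, , \\&A'_0 = \langle \varPsi '|{\hat{M}}|\varPsi _0(\tau )\rangle + \langle \varPsi _0(\tau )|{\hat{M}}|\varPsi '\rangle \, , \\&N'_0 = \langle \varPsi '|\varPsi _0(\tau )\rangle + \langle \varPsi _0(\tau )|\varPsi '\rangle \, . \\ \end{aligned} \end{aligned}$$

Because \({\hat{M}}\) is Hermitian and \(A_0/N_0\) is real, the terms containing \(\langle \varPsi '|\) combine into \(\langle \varPsi '| \big ( {\hat{M}} - A_0/N_0 \big ) |\varPsi _0(\tau )\rangle = \langle \varPsi '|\varDelta \rangle\), and the remaining terms give its complex conjugate \(\langle \varDelta |\varPsi '\rangle\). Dividing by \(N_0\) reproduces the coefficient of \(\epsilon\) in (34).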

The final step consists of realising that, if the initial state \(|\varPsi \rangle\) obeys equation (31), then

$$\begin{aligned} {\hat{M}}|\varPsi _0 (\tau )\rangle = m |\varPsi _0 (\tau )\rangle \, , \end{aligned}$$
(36)

because, when \(\epsilon =0\), the Hamiltonian is \({\hat{p}}^0\), which commutes with \({\hat{M}}\). Inserting (36) into the second equation of (35) we find \(|\varDelta \rangle =0\), which immediately implies

$$\begin{aligned} \delta M (\epsilon ) \sim \epsilon ^2 \, . \end{aligned}$$
(37)

It is interesting to note that this result does not depend on the details of the full Hamiltonian of the system, because the explicit formula for \(|\varPsi '\rangle\) is completely irrelevant. However, the assumption that \(|\varPsi \rangle\) is a mass eigenstate is crucial. If we repeat the calculations above, taking as initial state a superposition

$$\begin{aligned} \dfrac{ |m_1\rangle + |m_2\rangle }{\sqrt{2}} \, , \end{aligned}$$
(38)

\(|m_1\rangle\) and \(|m_2\rangle\) being two normalised mass eigenstates, belonging to two different eigenvalues \(m_1\) and \(m_2\), we now obtain (truncating to the first order in \(\epsilon\))

$$\begin{aligned} |\varDelta \rangle = \dfrac{m_1 -m_2}{2\sqrt{2}} \bigg (|m_{1}(\tau )\rangle -|m_{2}(\tau )\rangle \bigg ) \end{aligned}$$
(39)

which does not vanish. By analogy, it becomes immediately clear why, in a kick, one is always able to induce an acceleration: any physical state must be a superposition of eigenstates of \({\hat{p}}^\nu\), hence the variation of \(\langle \varPsi |{\hat{p}}^\nu |\varPsi \rangle\) is of order \(\epsilon\) for the same reason that the variation of \((\langle m_1|+ \langle m_2|) {\hat{M}} (|m_1\rangle +|m_2\rangle )\) is of order \(\epsilon\).
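The two scalings derived in this subsection can be checked numerically on a toy model. The sketch below is only an illustration under assumptions that are not part of the paper: the body is replaced by a two-level system with hypothetical mass eigenvalues `M1` and `M2`, the unperturbed Hamiltonian is taken to be \({\hat{M}}\) itself (so that it commutes with \({\hat{M}}\)), and the kick is modelled by a constant coupling \(\epsilon \, \sigma _x\) acting for a time `TAU`. The function names are likewise hypothetical.

```python
import cmath
import math

M1, M2 = 2.0, 1.0   # hypothetical mass eigenvalues (arbitrary units)
TAU = 1.0           # hypothetical duration of the kick


def evolve(psi, eps, tau=TAU):
    """Apply exp(-i H tau) to psi, with H = diag(M1, M2) + eps * sigma_x.

    Uses the closed form exp(-i (a I + b n.sigma) tau)
    = exp(-i a tau) * (cos(b tau) I - i sin(b tau) n.sigma).
    """
    a = 0.5 * (M1 + M2)
    d = 0.5 * (M1 - M2)
    b = math.hypot(eps, d)
    c, s = math.cos(b * tau), math.sin(b * tau)
    phase = cmath.exp(-1j * a * tau)
    nx, nz = eps / b, d / b          # unit vector n = (nx, 0, nz)
    u11 = phase * (c - 1j * s * nz)
    u12 = phase * (-1j * s * nx)     # equal to u21 for this H
    u22 = phase * (c + 1j * s * nz)
    return (u11 * psi[0] + u12 * psi[1], u12 * psi[0] + u22 * psi[1])


def mean_mass(psi):
    """Expectation value of M, normalised as in equation (32)."""
    p1, p2 = abs(psi[0]) ** 2, abs(psi[1]) ** 2
    return (M1 * p1 + M2 * p2) / (p1 + p2)


def delta_mass(psi, eps):
    """Mass variation: kicked evolution minus unkicked evolution."""
    return mean_mass(evolve(psi, eps)) - mean_mass(evolve(psi, 0.0))


def scaling_exponent(psi, eps=1e-3):
    """Estimate k in delta M ~ eps**k by doubling eps."""
    ratio = delta_mass(psi, 2 * eps) / delta_mass(psi, eps)
    return math.log2(abs(ratio))


eigenstate = (1.0, 0.0)                                  # obeys (31)
superposition = (1 / math.sqrt(2), 1 / math.sqrt(2))     # as in (38)

print("exponent for mass eigenstate:", scaling_exponent(eigenstate))
print("exponent for superposition:  ", scaling_exponent(superposition))
```

In this toy model the estimated exponent should come out close to 2 for the mass eigenstate and close to 1 for the superposition, mirroring (37) and the remark following (39). The closed-form exponential is used only to keep the example free of external dependencies.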

5.3 Kicking Thermal States

As we explained qualitatively in subsection 4.1 (and proved rigorously in Appendix B), a system that is in thermodynamic equilibrium has constant shape. Its internal structure is conserved over time and the only change that the system can experience is a rigid macroscopic motion. Given that \({\hat{M}}\) is the Hamiltonian of the internal degrees of freedom, it immediately follows that the density matrix of a macroscopic body in equilibrium satisfies the equation

$$\begin{aligned} \Big [ \, {\hat{\rho }}_{\text {eq}} \, , \, {\hat{M}} \, \Big ] =0 \, . \end{aligned}$$
(40)

It is not hard to show that this condition is essentially equivalent to equation (61) of Appendix B (see footnote 4).

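Because \({\hat{\rho }}_{\text {eq}}\) and \({\hat{M}}\) are commuting Hermitian operators, they can be diagonalised simultaneously. Equation (40) therefore implies that there is an orthonormal set of mass eigenstates \(|\varPsi ^{(n)}\rangle\), with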

$$\begin{aligned} {\hat{M}}|\varPsi ^{(n)}\rangle = m_n |\varPsi ^{(n)}\rangle \quad \quad \quad \langle \varPsi ^{({\tilde{n}})} | \varPsi ^{(n)}\rangle = \delta _{{\tilde{n}} \, n} \, , \end{aligned}$$
(41)

such that

$$\begin{aligned} {\hat{\rho }}_{\text {eq}} = \sum _n {\mathcal {P}}_n |\varPsi ^{(n)}\rangle \langle \varPsi ^{(n)}| \, \end{aligned}$$
(42)

with

$$\begin{aligned} {\mathcal {P}}_n >0 \quad \quad \quad \sum _n {\mathcal {P}}_n =1 \, . \end{aligned}$$
(43)

Taking this as initial state and recalling that the evolution is unitary, it follows that the average value of \({\hat{M}}\) at a time \(\tau\) (at the end of the kick) is

$$\begin{aligned} M_\tau = \sum _n {\mathcal {P}}_n \dfrac{\langle \varPsi ^{(n)}_\epsilon (\tau )| {\hat{M}}|\varPsi ^{(n)}_\epsilon (\tau )\rangle }{\langle \varPsi ^{(n)}_\epsilon (\tau ) |\varPsi ^{(n)}_\epsilon (\tau )\rangle } \, . \end{aligned}$$
(44)

Given that equation (37) applies to each contribution in the sum over n (because each state \(|\varPsi ^{(n)}_\epsilon (\tau )\rangle\) is the time evolution of a mass eigenstate), it also applies to a body with density matrix \({\hat{\rho }}_{\text {eq}}\), which completes our proof.
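Written out, the last step reads as follows. The symbol \(\delta M^{(n)}(\epsilon )\) is a shorthand introduced here for the quantity (32) evaluated on the n-th mass eigenstate, and the sketch assumes that the expansion in \(\epsilon\) can be carried out term by term inside the sum:

$$\begin{aligned} M_\tau (\epsilon ) - M_\tau (0) = \sum _n {\mathcal {P}}_n \, \delta M^{(n)}(\epsilon ) \, , \quad \quad \quad \delta M^{(n)}(\epsilon ) \sim \epsilon ^2 \quad \Longrightarrow \quad M_\tau (\epsilon ) - M_\tau (0) \sim \epsilon ^2 \, . \end{aligned}$$

The weights \({\mathcal {P}}_n\) do not depend on \(\epsilon\), because they are fixed by the initial state (42) and the evolution is unitary, so no term of first order in \(\epsilon\) can appear.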

There is a final remark that we need to make. Our entire analysis assumed that the system does not radiate particles as a result of the kick (particles can be created and destroyed inside the body, but no particle can leave the body). This assumption is important: if the system emits particles along the way, the calculations above remain valid, but the quantity M can no longer be interpreted as the mass of the body alone. It becomes the rest-frame energy of the total system (“\(\, \text {body}+\text {emitted particles} \,\)”), which invalidates the assumptions that lead to (12). Luckily, one can easily prove (see Appendix C.2) that the probability of stimulated emission induced by a kick is also of order \(\epsilon ^2\) (and, therefore, vanishes for adiabatic accelerations).

6 Conclusions

We have proved that the equation of state of isolated moving bodies (including only the four-momentum among the relevant variables) is always \(S=S(M)\). Rather than showing this by arbitrarily postulating the Lorentz covariance of the laws of thermodynamics, we have focused on the dynamical consequence of assuming \(S=S(M)\). In fact, declaring that two macroscopic states \(\psi\) and \(\psi '\) have the same entropy is equivalent to stating that there must be an adiabatic transformation that leads from \(\psi\) to \(\psi '\) and vice-versa. Using tools from both classical and quantum field theory we have shown that, indeed, infinitely slow accelerations, generated by a time-dependence of the Hamiltonian, must conserve the rest mass of bodies initially in thermodynamic equilibrium, making \(S=S(M)\) the only equation of state possible.

This sets solid foundations for relativistic thermodynamics and, again, shows that the axiomatization proposed by [15] and [16] is the only one possible. Furthermore, this paper complements our previous study on the nature of the temperature [22], in that it clarifies further the meaning of the work four-vector \(\delta {\mathcal {W}}^\nu\). In the same way in which one may intuitively decompose the heat four-vector \(\delta {\mathcal {Q}}^\nu\) into time and space components as

$$\begin{aligned} \delta {\mathcal {Q}}^\nu = \left( {\begin{array}{c}\text { Heat }\\ \text { Friction }\end{array}}\right) \, , \end{aligned}$$
(45)

one may consider the analogous (non-rigorous but useful) pictorial decomposition of the work four-vector as

$$\begin{aligned} \delta {\mathcal {W}}^\nu = \left( {\begin{array}{c}\text { Work }\\ \text { Kick }\end{array}}\right) \, . \end{aligned}$$
(46)

In the same way in which a wall can exert work on a gas (changing its energy), a potential can induce a kick on a freely moving body (changing its momentum). Both these processes, if executed slowly enough, become reversible. The first becomes the standard pressure-volume (“PdV”) adiabatic work, while the second becomes a pure acceleration.