On the foundations of statistical mechanics

Abstract
We briefly review the foundations and applications of statistical mechanics based on the nonadditive entropies S_q. We then address four frequently focused points, namely (i) the form of the constraints within a variational entropy principle; (ii) whether the q-indices are first-principle-computable quantities or fitting parameters; (iii) if one admits violation of entropic additivity, why not also admit violation of entropic extensivity?; and (iv) critical-like behavior.



Introduction
The basic goal of statistical mechanics is to start from appropriate microscopic laws (classical, relativistic, quantum mechanics, chromodynamics) and, by adequately using probability theory, to ultimately arrive at the thermodynamical relations and laws. Along these connections between the macro- and micro-worlds, a most relevant link is made through the fundamental concept of entropy. This discovery, accomplished against a stream of criticism, surely is one of the most powerful and fruitful breakthroughs in the history of the physical sciences. It was achieved by Boltzmann in the last three decades of the nineteenth century. His main result, currently known by every pure and applied scientist, and carved on his tombstone in Vienna, is

S_{BG} = k \ln W ,    (1)

the mathematical link between the microscopically fine description (represented by W, the total number of accessible microscopic states of the system) and the macroscopic measurable quantities (directly related to the entropy S_{BG}, the very same quantity generically introduced by Clausius in order to complete thermodynamics!). Apparently, equation (1) was explicitly stated in this form for the first time by Planck, but it was definitively known to Boltzmann. The index G stands for Gibbs, who put Boltzmann's ideas forward and spread the (classical) statistical mechanical concepts through his seminal book [1]. Equation (1) is a particular instance of a more general one, namely (for systems with discrete configurations)

S_{BG} = -k \sum_i p_i \ln p_i .    (2)

When every microstate is equally probable, i.e., when p_i = 1/W, ∀i, we recover equation (1). Evidently quantum mechanics was unknown to Boltzmann and was only just being born when Gibbs' book was published. It was left to von Neumann to extend equation (2) in order to encompass quantum systems.
He showed that the entropy for a quantum system should be expressed by using the density matrix operator ρ, namely

S_{BG} = -k \, \mathrm{Tr}\, \rho \ln \rho ,    (3)

sometimes referred to as the Boltzmann-Gibbs-von Neumann entropy (or just the von Neumann entropy). Notice indeed that the above equation recovers equation (2) when ρ is diagonal. The optimization of the entropy with appropriate constraints provides the thermal equilibrium distribution, namely, for the canonical ensemble, the celebrated BG exponential distribution, whose consequences are consistent with classical thermodynamics. In what follows we shall, however, see that entropic functionals different from the BG one must be used in order to satisfy thermodynamics for complex systems which strongly violate the probabilistic independence (or quasi-independence) working hypothesis on which the BG entropy is generically based. The failure of this simple hypothesis typically occurs whenever there is breakdown of ergodicity (or a marginal dynamical behavior emerging for a vanishing maximal Liapunov exponent). Several dozens of non-BG entropic functionals have been studied over quite a few decades. We focus here on the following one (introduced in [3] with the aim of generalizing BG statistical mechanics):

S_q = k \frac{1 - \sum_i p_i^q}{q - 1} = k \sum_i p_i \ln_q \frac{1}{p_i} ,    (4)

where q ∈ ℝ, and \ln_q z ≡ (z^{1-q} - 1)/(1 - q) (\ln_1 z = \ln z). We straightforwardly verify that \lim_{q \to 1} S_q = S_{BG}. The inverse of the q-logarithmic function \ln_q z is the q-exponential function e_q^z ≡ [1 + (1 - q)z]^{1/(1-q)} if 1 + (1 - q)z > 0, and zero otherwise (e_1^z = e^z). The entropy S_q shares with S_{BG} various important properties such as concavity, Lesche-stability, trace form, and composability. They differ, however, in what concerns additivity, as we comment in what follows.
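To fix the notation, the q-logarithm, the q-exponential, and S_q (with k = 1) can be sketched in a few lines; this is an illustrative implementation written for this review, not code from the cited references:

```python
import math

def ln_q(z, q):
    """q-logarithm: ln_q(z) = (z**(1-q) - 1)/(1-q); recovers ln z as q -> 1."""
    if abs(q - 1.0) < 1e-12:
        return math.log(z)
    return (z ** (1.0 - q) - 1.0) / (1.0 - q)

def exp_q(z, q):
    """q-exponential, inverse of ln_q: [1 + (1-q) z]**(1/(1-q)) when the
    bracket is positive, and zero otherwise; recovers exp(z) as q -> 1."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(z)
    b = 1.0 + (1.0 - q) * z
    return b ** (1.0 / (1.0 - q)) if b > 0.0 else 0.0

def S_q(p, q):
    """Nonadditive entropy (k = 1): (1 - sum_i p_i**q)/(q - 1),
    reducing to the BG (Shannon) entropy in the q -> 1 limit."""
    if abs(q - 1.0) < 1e-12:
        return -sum(pi * math.log(pi) for pi in p if pi > 0.0)
    return (1.0 - sum(pi ** q for pi in p)) / (q - 1.0)
```

For equal probabilities p_i = 1/W one recovers S_q = ln_q W, the q-generalization of equation (1).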
An entropic functional S is said to be additive if, for any two probabilistically independent systems A and B, it satisfies [4] S(A + B) = S(A) + S(B); otherwise it is said to be nonadditive. We easily verify that

\frac{S_q(A+B)}{k} = \frac{S_q(A)}{k} + \frac{S_q(B)}{k} + (1-q)\,\frac{S_q(A)}{k}\,\frac{S_q(B)}{k} .    (5)

Therefore S_{BG} is additive, and S_q (with q ≠ 1) is nonadditive. The generalization of the BG thermostatistical theory is currently referred to as nonextensive statistical mechanics (see [3,5-8], which include recent mini-reviews on which the present one is based; see also [9] for a regularly updated bibliography). The entropy S_q satisfies several interesting properties; among them, the uniqueness theorems proved by Santos [10] and by Abe [11], as well as the connection [12] with the Einstein likelihood factorization principle, deserve special mention.
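For independent A and B, S_q satisfies the pseudo-additivity relation S_q(A+B)/k = S_q(A)/k + S_q(B)/k + (1−q)[S_q(A)/k][S_q(B)/k], which can be checked numerically; a minimal sketch (k = 1, with illustrative probabilities):

```python
def S_q(p, q):
    """Nonadditive entropy (k = 1) for a discrete distribution, q != 1."""
    return (1.0 - sum(pi ** q for pi in p)) / (q - 1.0)

pA = [0.5, 0.3, 0.2]          # illustrative distribution of system A
pB = [0.6, 0.4]               # illustrative distribution of system B
q = 0.7
joint = [a * b for a in pA for b in pB]   # independence: p_ij = p_i * p_j

lhs = S_q(joint, q)
rhs = S_q(pA, q) + S_q(pB, q) + (1.0 - q) * S_q(pA, q) * S_q(pB, q)
# lhs equals rhs up to rounding: S_q is pseudo-additive; additivity holds only at q = 1
```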
To conclude, some frequently asked questions related to conceptually delicate points are briefly addressed in what follows.
Some remarks on frequently focused points

On the form of the constraints within a variational entropy principle
Within information theory, an entropic variational principle which is expected to determine the most plausible distribution of probabilities, denoted p(x), must use robust and experimentally accessible constraints such as the position of the center of the distribution and its width. When the distribution decays quickly enough (e.g., exponential or Gaussian decays in random variables such as energies, velocities, positions), these quantities are conveniently identified with the mean value and the variance. If, however, the distribution decays slowly (e.g., a power-law decay), these quantities diverge and cannot be used within a mathematically well-posed problem. For example, if the distribution is a one-dimensional q-exponential ∝ e_q^{-x/ξ} (ξ > 0), its mean value diverges for q ≥ 3/2, whereas its norm is well defined up to q = 2. If it is a one-dimensional centered q-Gaussian ∝ e_q^{-x²/σ²}, its variance diverges for q ≥ 5/3, whereas its norm is well defined up to q = 3. The problem is satisfactorily solved if, instead of using the original distribution p(x) to calculate mean values, variances, and similar moments, we use the escort distributions ∝ [p(x)]^{κ(q)} (with κ(1) = 1). The functions κ(q) to be used in each case are analytically established in [143], and are numerically illustrated in [144].
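The contrast between ordinary and escort moments can be sketched numerically. The example below takes a q-exponential with q = 1.7 (so that the ordinary mean diverges while the norm is finite) and uses the simplest escort exponent κ(q) = q; this particular choice, the value of q, and the integration routine are illustrative assumptions, not the general prescription of [143]:

```python
import math

def integral(f, cutoff, n=20000):
    """Trapezoidal estimate of the integral of f over [0, cutoff], after the
    substitution x = e^t - 1, which places grid points densely near the origin."""
    T = math.log(1.0 + cutoff)
    h = T / n
    g = lambda t: f(math.exp(t) - 1.0) * math.exp(t)
    return h * (0.5 * (g(0.0) + g(T)) + sum(g(i * h) for i in range(1, n)))

q, xi = 1.7, 1.0   # q >= 3/2: the ordinary mean of e_q^(-x/xi) diverges
p = lambda x: (1.0 + (q - 1.0) * x / xi) ** (-1.0 / (q - 1.0))  # unnormalized q-exponential

std_means, escort_means = [], []
for cutoff in (1e2, 1e4, 1e6):
    Z = integral(p, cutoff)
    std_means.append(integral(lambda x: x * p(x), cutoff) / Z)
    Zq = integral(lambda x: p(x) ** q, cutoff)   # escort weight [p(x)]^q
    escort_means.append(integral(lambda x: x * p(x) ** q, cutoff) / Zq)
# std_means keeps growing as the cutoff increases; escort_means settles to a finite value
```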
It is occasionally counter-argued that experimental mean values and variances never diverge, which is undoubtedly true. However, when the distribution decays quickly enough, those quantities remain practically the same as experiments are done with more data and consequently better statistics. This is by no means so when the distribution decays slowly. Indeed, in such cases the standard moments do not stop increasing as more and more data become available, which exhibits their essential mathematical inadequacy within a variational principle. In contrast, the q-mean values and q-variances do remain practically the same as more and more data are accumulated.
Let us finally mention that many calculations in the literature do use standard moments for q ≠ 1. This procedure can in fact be correct if the value of q is such that the specific moments used in the theory are finite. This and related equivalences have been discussed in [145].

Are the q-indices first-principle-computable quantities or fitting parameters?
The q-indices definitively are quantities to be obtained, whenever mathematical tractability is achieved, from first principles, more precisely from the microscopic (nonlinear) dynamics of the system, e.g., from the Hamiltonian of the system, or typically from the universality class of the Hamiltonian, characterized by the central charge c (the usual index emerging in conformal quantum field theories; see, for instance, [121]). This is illustrated in Figure 1, and also in [122]. To be more precise, what is being focused on in this figure is the following. We consider a crystalline N-particle one-dimensional quantum system at its vanishing-temperature critical point (with N → ∞). This state (which is nondegenerate if we assume that the symmetry has been broken) is the ground state, i.e., it is a pure state. Consequently, the entropy (the BG-von Neumann entropy or any other admissible one) of the entire system vanishes. We focus now on a subsystem of L elements. Because of the strong quantum entanglement, the subsystem of L elements constitutes a statistical mixture, hence its entropy S_q(L) differs from zero for any value of the index q. It has been shown [119] that a value of q exists such that S_q(L) ∝ L, which is to say that S_q is extensive. This special value of q is depicted in Figure 1.
It is clear that full mathematical tractability, as in the example just above, is very rarely the case. Then, by fitting empirical data with adequate functions (q-exponentials and q-Gaussians in many examples) we obtain, within acceptable error bars, the values of q. The whole procedure is fully analogous to determining the orbit of, say, Mars from first principles within Newtonian mechanics. This would in principle be possible if we had, at some initial time, the locations and velocities of all the masses of the planetary system, and an ideally huge computer to numerically solve the corresponding set of Newton's equations. In practice, what astronomers do is fit their astronomical data using the generic elliptic Keplerian form that straightforwardly comes out of Newtonian mechanics.

If one admits violation of the entropic additivity, why not admitting also violation of the entropic extensivity?
An analogous question can be posed in mechanics: if we admit, within relativistic mechanics, violation of the classical expression for the kinetic energy, why not admit as well violation of the conservation of energy? The answer is well known: the particular forms of energy that intervene in this or that phenomenon have no reason to be universal, but the conservation of energy is a high-level principle, basically the first principle of thermodynamics, which is to be respected unless extremely serious reasons against it are ever undoubtedly verified.
Similarly, the form of the entropic functional, together with its possible additivity, has no reason to be universal. In contrast, the thermodynamic extensivity of the entropy constitutes a basic requirement of the Legendre-transform structure of thermodynamics. Indeed, this structure mandates quantities such as N itself, the total entropy S, the total volume V, the total magnetization M, and similar ones to be extensive in all circumstances: see in Figure 2 the typical scalings with size of all the thermodynamical variables entering the Legendre transformations. Other reasons for the extensivity of the thermodynamic entropy are available in the literature (see, for instance, [138]), but their discussion is outside the present aim.

Fig. 1. The index q as a function of the central charge c [146,147]. In the c → ∞ limit we recover the Boltzmann-Gibbs (BG) value, i.e., q = 1. For an arbitrary value of c, the subsystem nonadditive entropy S_q is thermodynamically extensive for, and only for, q = (√(9 + c²) − 3)/c (hence c = 6q/(1 − q²); some special values: for c = 4 we have q = 1/2, and for c = 6 we have q = 2/(√5 + 1) = 1/Φ, where Φ is the golden mean). Let us emphasize that this anomalous value of q occurs only precisely at the zero-temperature second-order quantum critical point of (1+1)-dimensional systems; anywhere else than at this critical point, the usual short-range-interaction BG behavior (i.e., q = 1) is valid. From [12].

Fig. 2. For attractive long-range interactions (i.e., 0 ≤ α/d ≤ 1, α characterizing the interaction range in a potential of the form 1/r^α) we may distinguish three classes of thermodynamic variables, namely those scaling with L^θ, named pseudo-intensive (L is a characteristic linear length, θ is a system-dependent parameter), those scaling with L^{d+θ}, the pseudo-extensive ones (the energies), and those scaling with L^d (which are always extensive). For short-range interactions (i.e., α > d) we have θ = 0, the energies recover their standard L^d extensive scaling, falling into the same class as S, N, V, etc., and the previously pseudo-intensive variables become truly intensive (independent of L); this is the region, with only two classes of variables, covered by the traditional textbooks of thermodynamics. From [7,148].

Fig. 3. The (q, β_q) pairs represented here have been taken from the q-exponential cumulative probability function of the interarrival times during six cement-mortar loading cycles [153].
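The closed-form relation between q and the central charge c quoted in the Figure 1 caption, and its inverse, are easy to check numerically; a minimal sketch:

```python
import math

def q_of_c(c):
    """Value of q making the block entropy S_q(L) extensive at a (1+1)-dimensional
    quantum critical point with central charge c: q = (sqrt(9 + c^2) - 3)/c."""
    return (math.sqrt(9.0 + c * c) - 3.0) / c

def c_of_q(q):
    """Inverse relation: c = 6q/(1 - q^2)."""
    return 6.0 * q / (1.0 - q * q)
```

The special values quoted in the caption (c = 4 gives q = 1/2; c = 6 gives the inverse golden mean; c → ∞ gives the BG value q → 1) follow directly.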

Critical-like behavior
Since nonextensive statistical mechanics typically focuses on complex phenomena of the power-law type, some sort of criticality could naturally be expected. Today this is indeed more and more evident. On the one hand, several analytical connections between q and critical exponents have long been available in the literature (e.g., in [36,38,119]). On the other hand, more and more examples are emerging which exhibit a smooth monotonic dependence of the effective temperature T on some index q (i.e., a unique value of T for a given value of q), typically close to a linear relation of the type T = a − bq with a > 0 and b > 0. One such example was exhibited in [149], where it was established that 1/λ_q = 1 − q, λ_q being the q-generalized Liapunov coefficient which appears in the sensitivity to initial conditions ξ(t) ≡ lim_{Δx(0)→0} Δx(t)/Δx(0) of a one-dimensional map, x being the dynamical variable (e.g., at the Feigenbaum point of the z-logistic dissipative map). Many other examples have been exhibited in the literature, e.g., in the quark-gluon plasma [150], in the standard map [151], in (asymptotically) scale-free networks [152], in the acoustic-emission analysis of cement mortar [153] (see Fig. 3), and in high-energy collisions [154].

On the q-central limit theorem
The classical Central Limit Theorem (CLT) plays a most important role in BG statistical mechanics. It basically states that the sum of a large number of (nearly) independent random variables with finite variance approaches, after centering and appropriate rescaling, a Gaussian distribution. Consequently, Gaussian distributions are expected to be very frequently observed in nature and elsewhere. A notable example is the Maxwellian distribution of velocities in any classical thermostatistical system.
For a variety of reasons that we do not detail here, a generalized form of this theorem had long been expected, such that, due to strong correlations between the random variables being summed, the attracting distribution would be a q-Gaussian instead of a Gaussian. Consistently, illustrations of such a generalized theorem were sought as well.
Two models were advanced [110,111] whose limiting distributions appeared numerically (within high precision) to be q-Gaussians with q < 1. However, they were analytically shown by Hilhorst and Schehr [155,156] not to be q-Gaussians. These analytical results turned out to be numerically, as Hilhorst and Schehr themselves showed, amazingly close to q-Gaussians, but definitively not q-Gaussians. These rather unexpected facts strongly stimulated the search for models whose limiting distributions could be proved to be q-Gaussians. This search achieved its analytical goal, not only for q < 1 (compact support) [157], but also for q > 1 (infinite support) [117].
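The compact-support (q < 1) versus infinite-support (q > 1) dichotomy follows directly from the definition of the q-Gaussian e_q^{−βx²}; a minimal unnormalized sketch (β is an inverse-width parameter, introduced here for illustration):

```python
import math

def q_gaussian(x, q, beta=1.0):
    """Unnormalized q-Gaussian e_q^(-beta x^2): compact support for q < 1,
    power-law (infinite-support) tails for 1 < q < 3, ordinary Gaussian at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(-beta * x * x)
    b = 1.0 - (1.0 - q) * beta * x * x
    return b ** (1.0 / (1.0 - q)) if b > 0.0 else 0.0
```

For q < 1 the bracket vanishes at x² = 1/[(1 − q)β], beyond which the distribution is identically zero; for q > 1 the bracket is always positive and the tails decay as a power law.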
On different grounds, some time later, Hilhorst raised another type of criticism [158], which we briefly address now. Among the various attempts to generalize the CLT for strongly correlated random variables, a particular one was made in 2008 by introducing a q-generalized Fourier transform (q-FT) [113,114]. Within this theorem (named q-CLT hereafter), it was implicitly assumed that the inverse q-FT is unique. Hilhorst exhibited in [158] a family of counterexamples in which an infinite set of functions have the same q-FT; therefore the inverse q-FT operation is not unique. This fact opened a dangerous gap in the q-CLT. Efforts were then dedicated to finding what supplementary information would make that inverse unique. This was successfully achieved along three different paths. The first path is described in [159], the second in [160], and the third in [161,162]. The next natural step would of course have been to incorporate one of these paths into the q-CLT in order to fill the already mentioned gap. It happened, however, that the troublesome gap was recently filled in a quite different manner. Indeed, it was proved [118] that, for q ≥ 1, the limit distribution is unique and cannot have a compact support; it must consequently be a q-Gaussian. At the present stage, we may summarize this set of results by saying that it is now rigorously established that, for a possibly wide class of strongly correlated systems, q-Gaussian attractors are indeed expected to frequently emerge in natural, artificial, and social systems.
It is a warm honor to dedicate this manuscript to the memory of my great friend and distinguished scientist Roger Maynard. He made at least two crucial contributions to the subject of the present review. The first of them is that, during a four-hour tête-à-tête peripatetic discussion at the International Workshop on Nonlinear Phenomena held in Florianopolis, Brazil, in December 1992, we realized for the first time (having in hand a preprint of the, at that time intriguing, paper by Plastino and Plastino [163], which was accepted for publication in Physics Letters A a couple of weeks later) that q-statistics ought to be generically relevant for long-range-interacting systems, a fact that has since then been profusely verified. His second important contribution is described in Refs. [62,63], which over time became among the most cited of Roger's papers. I acknowledge a fruitful discussion with D. Bagchi, and a critical reading by an anonymous Referee. I also acknowledge partial financial support from the organizers of the present event in Grenoble, as well as from CNPq and Faperj (Brazilian agencies) and the John Templeton Foundation (USA).