Becoming Large, Becoming Infinite: The Anatomy of Thermal Physics and Phase Transitions in Finite Systems

This paper presents an in-depth analysis of the anatomy of both thermodynamics and statistical mechanics, together with the relationships between their constituent parts. Based on this analysis, using the renormalization group and finite-size scaling, we give a definition of a large but finite system and argue that phase transitions are correctly represented as incipient singularities in such systems. We describe the role of the thermodynamic limit and explore the implications of this picture of critical phenomena for the questions of reduction and emergence.


Introduction
Thermodynamics and statistical mechanics coexist in a collaborative relationship within the envelope of thermal physics. In many presentations of the subject, particularly in undergraduate texts, it is heuristically advantageous to intermingle the macroscopic concepts of thermodynamics with the micro-picture provided by statistical mechanics. And it is, of course, self-evident that statistical mechanics¹ needs the basic structure of thermodynamics with inter-theory connecting relationships defining the thermodynamic quantities like internal energy, temperature and entropy. On the other hand, there are some advantages, both aesthetic and mathematical, in producing an account of thermodynamics which makes no reference to the underlying microstructure of the system, as would seem to be one of the aims of (among others) the books of Giles (1964) and Buchdahl (1966) and the papers of Lieb and Yngvason.² For Buchdahl we have the first law implying the existence of the internal energy function U and Carathéodory's (1909) version of the second law yielding the entropy S and temperature T; for Lieb and Yngvason three sets of axioms accomplish the same task. This, together with an account of the nature of adiabatic processes (as described, for example, in Buchdahl 1966, Chaps. 5 and 6; Lieb and Yngvason 1999, Sect. 2.1; Lavis 2018, Sect. 2.1.1), provides the basic framework into which the models of statistical mechanics are embedded.
This raises the question of how statistical mechanics and thermodynamics relate to each other. Attempts to answer this question run up against a problem. The neat labels 'statistical mechanics' and 'thermodynamics' mask the fact that neither theory is a monolithic bloc. Indeed, each has a complicated internal structure with several layers of different theoretical postulates and assumptions. So the question of how statistical mechanics and thermodynamics relate ought to be interpreted as the more complex question of (a) what the internal structure of each theory is and of (b) how the various parts of each theory relate to the various parts of the other theory. The complexity of the internal structures of both theories, as well as the intricacy of their interrelations, seems to have been somewhat underappreciated in the philosophical literature on the subject, and so the first aim of this paper is to present an in-depth analysis of the anatomy of both theories and the connections between their parts.³
Fig. 1 provides a schematic advance summary of the analysis that we develop in this paper. It sees statistical mechanics and thermodynamics as parallel developments, each decomposed into separate levels representing the stages of theory-based development in which features are added to the system. The cross-interactions between the levels in the two columns contain interventions integral to this development. On the left are the levels for thermodynamics, as described in detail in Sect. 2. These levels are related to each other by adopting special assumptions, beginning at the bottom with basic thermodynamic theory (labelled TD1). Adding the extensivity assumption to this theory takes us to the next level, the density representation of thermodynamics (labelled TD2). Augmenting TD2 with the notion of phase transitions and critical phenomena (PTCP) gives thermodynamics with PTCP (labelled TD3). Finally, supplementing TD3 with a version of the Kadanoff scaling hypothesis leads us to thermodynamics with scaling theory (labelled TD4).
The parallel development for statistical mechanics is represented on the right of Fig. 1, as described in detail in Sect. 3. The picture here is a little more complicated, involving, as we explain in our discussion, three different paths. At the bottom is the fundamental theory, which we here take to be Gibbsian statistical mechanics (labelled SM1).⁴ Assuming that the systems to which the theory is applied are large leads us to the next layer, large statistical mechanical systems (labelled SM2). This marks a branching point in the structure of the theory: three different additions can be made to SM2, resulting in three different branches. Adding the thermodynamic limit to SM2 leads to the statistical mechanics of infinitely large systems (labelled SM3). Adding renormalization group techniques to SM2 leads to the renormalization group approach to statistical mechanics (labelled SM4). Finally, adding the analysis of phase transitions for finite systems⁵ to SM2 leads to the statistical mechanics of finite-system phase transitions (labelled SM5).
It is our aim in this work to keep the developments of thermodynamics and statistical mechanics as separate as possible, in order to make visible the internal structure of each separate theory. However, as indicated above, on close examination it becomes evident that there are in fact some 'messages', both implicit and explicit, sent from statistical mechanics (FSM), that is to say from the microstructure, to thermodynamics, which provides the macrostructure. These are spelled out in FSM-1, FSM-2, FSM-3 and FSM-4. In the other direction the connecting relationships from thermodynamics (FTD), labelled FTD-1, FTD-2 and FTD-3, identify quantities in statistical mechanics with thermodynamic variables. As we shall see, FSM-1 also plays a role in the connecting process and can be seen as in dialogue with FTD-3. The remaining interventions FSM-2, FSM-3 and FSM-4 can be viewed as an aid to the clarification of a number of important issues. We discuss these links between elements of both theories in the appropriate places in Sects. 2 and 3.
Much of the recent interest in the relationship between thermodynamics and statistical mechanics has concentrated on PTCP. It is the second aim of this paper to revisit the issue of PTCP in the light of our analysis of the internal structure of the two theories and their interrelations. Doing so will lead us to some unexpected, and we think important, conclusions.
In the modern theory of critical phenomena, dating from the middle of the 1960s,⁶ critical exponents, which classify the type of singular behaviour in the approach to a critical region, play an important role. In our development of thermodynamics in Sect. 2 scaling theory is the final destination with scaling laws relating these critical exponents. However, as already indicated and as described below, thermodynamics is a structured shell into which particular models are embedded, either by the assumption of a phenomenological form for the entropy function or from statistical mechanics. In the absence of such an embedding it is not possible to calculate values for critical exponents, nor to discuss universality. This is the idea (Kadanoff 1976) that all critical situations⁷ can be divided into universality classes, characterized by the values of their critical exponents and differentiated by a small number of properties of which the most important are the (physical) dimension d of the system and the symmetry group of the order parameter. The first, but not the second, of these plays an important role in our discussions,⁸ in particular in the case of the Ising model, which we shall use as an illustrative example throughout this work. This, the most well-known and thoroughly investigated model in the statistical mechanics of lattice systems, is briefly described in Appen. B. With the list of critical exponents given there for d := 2, d := 3 and d ≥ 4, it provides an example of the dependence of these exponents and hence the universality class on the dimension of the system. The dimension d is also of importance in our discussion of scaling theory in Sect. 2.4, of finite-size scaling in Sect. 3.4.2 and of phenomenological renormalization in Sect. 3.4.3 (c).
4 We set aside Boltzmannian statistical mechanics. For discussion of the relation between Gibbsian and Boltzmannian statistical mechanics see Lavis (2005) and Frigg and Werndl (2019).
5 Where, as described in Sect. 4, phase transitions are defined in a way which avoids the involvement of singularities.
6 For an historic account see Domb (1985).
These observations concerning universality classes, together with the inter-theory connecting relationships FSM-2, FSM-3 and FSM-4, provide the impetus to investigate and clarify a number of important issues relating to PTCP. These are (not necessarily in the order in which they arise in the discussion):
(i) Are infinite systems really necessary in thermodynamics or statistical mechanics and:
(a) If so what for?
(b) If they are, is this solely because extensivity is not exactly true in most cases in statistical mechanics?
(c) Is the thermodynamic limit irrelevant to thermodynamics or has it already been implicitly applied?⁹
(d) Is the thermodynamic limit in statistical mechanics necessary for the implementation of the procedures of the renormalization group?
(e) Is there a meaningful way to represent PTCP in finite systems?
(ii) Given that, in thermodynamics, critical behaviour involves discontinuities in densities and singularities in response functions, is this necessarily still the case in statistical mechanics?
(iii) Are the ideas of enrichment and substantiation helpful in describing the relationship between thermodynamics and statistical mechanics?
(iv) Where do reduction and emergence feature in the accounts of the relationship between thermodynamics and statistical mechanics?
As indicated in the title of this work and by the progression between levels in the statistical mechanical column in Fig. 1, we will discuss these issues with a special focus on large systems and infinite systems. In particular we shall address the question as to where realism is to be found: in the study of large systems, because real systems are finite but large (in the sense that they typically have ∼ 10²³ constituents), or in the thermodynamic limit of an infinite system, because singular behaviour (in susceptibilities and compressibilities) is believed to be experimentally observed, and in theories this arises only in the thermodynamic limit. This broad categorization of large systems is refined in Sect. 4. The process of taking the thermodynamic limit is the determination of the asymptotic properties of a system as it becomes infinitely large. In general this will involve taking d limits, one in each of the linear dimensions of the system, and such a d-dimensionally infinite system, which where appropriate we call a fully-infinite system, is implicitly the object of investigation by scaling theory in Sect. 2.4.¹⁰ However, relevant to our discussions is the case of a partially-infinite system, where the limit is taken in only 𝔡 < d dimensions. Here it is 𝔡 rather than d which should count for the critical behaviour as the dimension of the system. The idea underlying our approach to PTCP is that reality lies with fully-finite systems (𝔡 = 0) and that the judgment as to whether the large system will show behaviour which in practical terms is indistinguishable from singular behaviour is based on comparing the behaviour of systems of ever increasing size to see whether their properties indicate convergence towards those of the infinite system. In principle, as described in Sect. 4, this limiting process is in all d dimensions. In practice, as we see in our discussion of d = 2 transfer matrix calculations in Sect. 3.3, it also has relevance to the case where one limit has already been taken and increasing size is in the remaining dimension.
Thus, as we have indicated, Sects. 2 and 3 trace the steps in our developments of thermodynamics and statistical mechanics with the inter-theory connections between them; with Sect. 3.5 addressing different proposed resolutions to the contradiction between the finiteness of real systems and the perceived necessity of phase transitions being portrayed as singularities in infinite systems. Sect. 3.6 discusses the proposal of Mainwood (2006) for representing the occurrence of phase transitions in finite systems. Using the account of finite-size scaling in Sect. 3.4.2 we propose in Sect. 4 our alternative quantitative account for phase transitions in finite large systems. Sect. 5 contains some after-thoughts on enrichment, substantiation, reduction and emergence and our conclusions are in Sect. 6.

From Classical Thermodynamics to Scaling Theory
Accounts of thermodynamics range from those designed for the practical needs of engineers to those which aim for a degree of mathematical rigour. However, all share some common features and assumptions, some of which are at variance with the insights gained in statistical mechanics. As indicated above, we flag these differences in the form of messages from statistical mechanics (FSM-1 to FSM-4).

The Structure of Thermodynamics
All accounts of thermodynamics contain (in some form or another) the first law, which establishes the existence of the internal energy function U, and the second law, which establishes the existence of the entropy S and temperature T. Details are not necessary for the present discussion. The only thing we need to carry forward is the fundamental thermodynamic differential form. Given a thermodynamic system with:
(i) One mechanical extensive/intensive¹¹ conjugate variable pair (X, ξ), where X could stand for the volume V or magnetic moment M, with conjugate intensive variable ξ, which in the case of V is the (negative) pressure −P and in the case of M is the magnetic field H;
(ii) A (dimensionless) extensive variable N which counts the number of units of mass in the system, with a conjugate (intensive) energy µ, called the chemical potential, carried by each unit of mass;¹²
for a differential change in the space Ξ₀ of the variables (U, X, N) the differential change in the entropy S satisfies¹³
$$ \mathrm{d}S = \zeta_1\,\mathrm{d}U - \zeta_2\,\mathrm{d}X - \zeta_3\,\mathrm{d}N, \tag{2.1} $$
where
$$ \zeta_1 := \varepsilon/T, \qquad \zeta_2 := \xi/T, \qquad \zeta_3 := \mu/T \tag{2.2} $$
are couplings. It is clear that the couplings are intensive and dimensionless. That the variables (U, X, N) ∈ Ξ₀ appear as differentials on the right of (2.1) should be understood as signifying that they are independent variables. This means that the system is thermally, mechanically and chemically isolated with U, X and N fixed by an experimenter. Legendre transformations can be used to replace U and X successively as independent variables by ζ₁ and ζ₂. Firstly, with Helmholtz free energy
$$ \Phi_1 := \zeta_1 U - S, \tag{2.3} $$
we have
$$ \mathrm{d}\Phi_1 = U\,\mathrm{d}\zeta_1 + \zeta_2\,\mathrm{d}X + \zeta_3\,\mathrm{d}N, \tag{2.4} $$
so that the independent variables are (ζ₁, X, N) ∈ Ξ₁. The system is in contact with a source of thermal energy at temperature T = ε/ζ₁. Secondly, with Gibbs free energy
$$ \Phi_2 := \Phi_1 - \zeta_2 X, \tag{2.5} $$
we have
$$ \mathrm{d}\Phi_2 = U\,\mathrm{d}\zeta_1 - X\,\mathrm{d}\zeta_2 + \zeta_3\,\mathrm{d}N, \tag{2.6} $$
so that the independent variables are (ζ₁, ζ₂, N) ∈ Ξ₂. The system is now, through ζ₂, also in mechanical contact with its environment, be it a fluid system subject to a pressure P or a magnetic system subject to a field H. The couplings ζ₁ and ζ₂ are referred to as the thermal and field (or mechanical) couplings respectively. It is tempting to suppose that this process could be taken one step further, interchanging the roles of N and ζ₃. However, it is not difficult to see that the Legendre transformation implementing this would involve a free energy Φ₃ which is constant and can thus without loss of generality be taken to be identically zero. A viable form of thermodynamics must retain (at least) one extensive variable (here we choose that to be N, although we could have used X) which registers the size of the system.
11 The extensive variables U, X and N scale with the size of the system; the intensive variables T, ξ and µ are invariant with respect to such scaling.
12 In most presentations of thermodynamics N is simply taken to be the number of particles in the system. Our usage is designed to avoid reference to the microstructure of the system and to allow N to have non-integer values.
13 At this point it is convenient: (i) To clarify the dimensionality of the thermodynamic variables. It is straightforward to show that, by scaling with respect to suitable constants, T and ξ can be made of the dimensions of energy (J := m² kg s⁻²) and U, S and X made dimensionless. In the case of U this is achieved by factoring out an energy constant ε > 0. This is the field-extensive variable representation of Lavis (2015, Sect. 1.1), where scaling for S and T is effected using Boltzmann's constant k_B. The further change to the coupling-extensive variable representation is achieved by taking ratios of ε, ξ and µ with respect to T as shown in (2.2). (ii) To observe that the generalization to more than one mechanical variable pair is straightforward. (iii) To emphasise that this differential form should not be understood as some sort of equilibrium process in the space Ξ₀ (Lavis 2018).
Observing that in thermodynamics the uncontrolled variables remain constant when the corresponding controlled variables are held constant, this is now the point for the first message from statistical mechanics:
FSM-1: Unlike in thermodynamics, extensive variables in statistical mechanics that are uncontrolled quantities fluctuate even when the corresponding controlled variables are kept constant. (In Ξ₁ the energy corresponding to the internal energy U fluctuates, and in Ξ₂ the variable corresponding to X, be it the volume or the magnetic moment, fluctuates.) This is borne out by experiment (MacDonald 2013).
For fixed N let $(U, X, N) \xrightarrow{A} (U', X', N)$ denote an adiabatic process. It can be shown (Lavis 2019), from Carathéodory's first version of the second law (Carathéodory 1909),¹⁴ that thermodynamic systems are of four types according to whether the adiabatic process gives U ≤ U′ or U ≥ U′ and S ≤ S′ or S ≥ S′ corresponding, respectively, to the possibilities of the temperature and heat capacity being positive or negative.¹⁵ Standard accounts of thermodynamics concentrate solely on the case where both internal energy and entropy increase, which is the situation where both temperature and heat capacity are positive. We shall restrict our attention to that case.¹⁶

Extensivity and the Thermodynamic Limit
Departing from the formulation TD1 of the structure of thermodynamics we ascend the left-hand column in Fig. 1, where it is now useful to consider the embedding of particular models. In this context they are of two types, ones which posit a phenomenological equation of state and ones derived from some microstructure according to the procedures of statistical mechanics. Most examples in the first category (the perfect gas equation, the Weiss-field equation for ferromagnetism and the van der Waals equation¹⁷) introduce the models in terms of an equation relating the mechanical variable pair (X, ξ) and N to the temperature. However, it is more consonant with our approach to begin with a defining relationship for the entropy surface S(U, X, N), from which T, ξ and µ, or equivalently the couplings ζ₁, ζ₂ and ζ₃, can be calculated using (2.1). Thus:
• For the perfect gas, (2.7) for some constant c,¹⁸ giving¹⁹ (2.8).
• For the van der Waals fluid, (2.9) giving (2.10).
The entropy (2.7) is a concave function of (U, V), but for (2.9) it is necessary to take the concave envelope. This is, of course, equivalent in the case of the van der Waals (1873) fluid and other phenomenological equations of state to the application of Maxwell's equal areas rule (Maxwell 1875), which avoids the inclusion of unstable states and leads to a first-order gas-liquid phase transition (see Sect. 2.3). It will be noted that, for both the perfect gas and van der Waals fluid with densities u := U/N and v := V/N, there exists an entropy density s satisfying
$$ S(U, V, N) = N s(u, v) \tag{2.11} $$
for all N > 0, which avoids any reference to the size N of the system. But, of course, these are rather special models and the question arises as to whether entropy, in general, when X replaces V and x := X/N replaces v, satisfies
$$ S(U, X, N) = N s(u, x). \tag{2.12} $$
For this question the following result is important: an entropy density s satisfying (2.12) exists if and only if, for all real λ > 0,
$$ S(\lambda U, \lambda X, \lambda N) = \lambda S(U, X, N) \tag{2.13} $$
is true. □
It is a matter of dispute (see, for example, Lavis 2019, and references therein) whether statistical mechanical models support the existence of negative temperatures, and the experimental evidence is also questioned. The same is the case for negative heat capacities.
17 And a number of lesser known relationships like the Redlich-Kwong and Dieterici equations of state.
18 Which can be evaluated using statistical mechanics but whose value is unimportant here.
19 Remember that ζ₂ := −P/T.
Equation (2.13) is the condition that S is an extensive function and it is easily shown from (2.3) and (2.5) that the free energies Φ₁ and Φ₂ are extensive functions if and only if the entropy is an extensive function. But, as pointed out by Menon and Callender (2013, Sect. 2) and as we show in Sect. 3.3,
FSM-2: The extensivity of entropy and of free energies assumed in thermodynamics is not exactly true for all systems in statistical mechanics, but is approximately true for large systems.
For entropy the thermodynamic limit in statistical mechanics, assuming it exists,²⁰ is given by
$$ s(u, x) := \lim_{N \to \infty} \frac{S(Nu, Nx, N)}{N}. \tag{2.14} $$
But for thermodynamics the corresponding formula is (2.12), without the need for the limiting process. Given exact extensivity, the thermodynamic limit can be regarded in thermodynamics as unnecessary or trivially true.
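As a hedged illustration of FSM-2 and of the role of the limiting process, the following sketch uses the zero-field one-dimensional Ising ring, whose partition function has the closed form Z_N = (2 cosh K)^N + (2 sinh K)^N. The function names and the coupling symbol K are our assumptions, not notation from the text. The point is only that ln Z_N / N depends on N for a finite ring and approaches ln(2 cosh K) as N grows.

```python
import math

def log_partition_per_spin(n_spins: int, coupling: float) -> float:
    """ln(Z_N)/N for a zero-field 1D Ising ring with n_spins sites.

    Uses the transfer-matrix eigenvalues 2cosh(K) and 2sinh(K), so that
    Z_N = (2cosh K)^N + (2sinh K)^N. K is an assumed dimensionless coupling.
    """
    lam_plus = 2.0 * math.cosh(coupling)
    lam_minus = 2.0 * math.sinh(coupling)
    ratio = lam_minus / lam_plus  # equals tanh(K), magnitude below 1
    # Factor out lam_plus^N to avoid overflow for large N.
    return math.log(lam_plus) + math.log1p(ratio ** n_spins) / n_spins

def limit_per_spin(coupling: float) -> float:
    """Value of ln(Z_N)/N in the limit N -> infinity."""
    return math.log(2.0 * math.cosh(coupling))

def finite_size_gaps(coupling: float, sizes):
    """Difference between the finite-ring value and the limit, per size."""
    target = limit_per_spin(coupling)
    return [log_partition_per_spin(n, coupling) - target for n in sizes]

if __name__ == "__main__":
    sizes = (4, 16, 64, 256)
    for n, gap in zip(sizes, finite_size_gaps(1.0, sizes)):
        print(n, gap)
```

The gap shrinks rapidly with N, which is the sense in which a size-free density is approximately, but not exactly, available for a large finite system.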
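Differentiating (2.13) with respect to λ, and substituting ∂S/∂U = ζ₁, ∂S/∂X = −ζ₂ and ∂S/∂N = −ζ₃ from (2.1), gives
$$ S = \zeta_1 U - \zeta_2 X - \zeta_3 N \tag{2.15} $$
when λ is put equal to 1. From (2.1) and (2.15),
$$ U\,\mathrm{d}\zeta_1 - X\,\mathrm{d}\zeta_2 - N\,\mathrm{d}\zeta_3 = 0, \tag{2.16} $$
which is a version of the Gibbs-Duhem relationship. In terms of densities (2.15) becomes
$$ s = \zeta_1 u - \zeta_2 x - \zeta_3, \tag{2.17} $$
and substituting into (2.1)-(2.6)
$$ \mathrm{d}s = \zeta_1\,\mathrm{d}u - \zeta_2\,\mathrm{d}x. \tag{2.18} $$
Then, for free-energy densities φ₁ := Φ₁/N and φ₂ := Φ₂/N,
$$ \varphi_1 = \zeta_1 u - s, \qquad \mathrm{d}\varphi_1 = u\,\mathrm{d}\zeta_1 + \zeta_2\,\mathrm{d}x, \tag{2.19} $$
$$ \varphi_2 = \varphi_1 - \zeta_2 x = \zeta_3, \qquad \mathrm{d}\varphi_2 = u\,\mathrm{d}\zeta_1 - x\,\mathrm{d}\zeta_2. \tag{2.20} $$
These are the fundamental size-free thermodynamic relationships in terms of density variables and density functions. They are exact in thermodynamics but approximately true only for large systems in statistical mechanics. The question of large systems and the thermodynamic limit in statistical mechanics is treated in Sects. 3.3, 3.5 and 4.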
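The link between the extensivity condition (2.13) and the Euler-type relation (2.15) can be checked numerically. The sketch below assumes a standard textbook form for a dimensionless perfect-gas entropy, S = N[c + (3/2) ln(U/N) + ln(V/N)], which is our choice and not necessarily the paper's (2.7). It also assumes the sign convention dS = ζ₁ dU − ζ₂ dX − ζ₃ dN. The ln N correction term is hypothetical and is included only to show how a sub-extensive contribution breaks both relations.

```python
import math

def entropy_perfect_gas(u_tot, vol, n, c=1.0):
    """Assumed textbook form S = N[c + 1.5 ln(U/N) + ln(V/N)], dimensionless."""
    return n * (c + 1.5 * math.log(u_tot / n) + math.log(vol / n))

def entropy_with_correction(u_tot, vol, n, c=1.0):
    """Same entropy plus a hypothetical sub-extensive -0.5 ln(N) term."""
    return entropy_perfect_gas(u_tot, vol, n, c) - 0.5 * math.log(n)

def couplings(entropy, u_tot, vol, n, h=1e-5):
    """(zeta1, zeta2, zeta3) by central differences, assuming the sign
    convention dS = zeta1 dU - zeta2 dX - zeta3 dN."""
    d_u = (entropy(u_tot + h, vol, n) - entropy(u_tot - h, vol, n)) / (2 * h)
    d_x = (entropy(u_tot, vol + h, n) - entropy(u_tot, vol - h, n)) / (2 * h)
    d_n = (entropy(u_tot, vol, n + h) - entropy(u_tot, vol, n - h)) / (2 * h)
    return d_u, -d_x, -d_n

def euler_residual(entropy, u_tot, vol, n):
    """S - (zeta1 U - zeta2 X - zeta3 N); vanishes for an extensive entropy."""
    z1, z2, z3 = couplings(entropy, u_tot, vol, n)
    return entropy(u_tot, vol, n) - (z1 * u_tot - z2 * vol - z3 * n)

def scaling_residual(entropy, u_tot, vol, n, lam):
    """S(lam U, lam X, lam N) - lam S(U, X, N); vanishes under (2.13)."""
    return entropy(lam * u_tot, lam * vol, lam * n) - lam * entropy(u_tot, vol, n)

if __name__ == "__main__":
    for s_fn in (entropy_perfect_gas, entropy_with_correction):
        print(s_fn.__name__,
              euler_residual(s_fn, 30.0, 50.0, 10.0),
              scaling_residual(s_fn, 30.0, 50.0, 10.0, 2.0))
```

For the assumed perfect-gas form the computed ζ₂ equals −N/V, in line with ζ₂ := −P/T and the perfect gas law.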

Thermodynamics with PTCP
Having arrived at a formulation of thermodynamics in terms of densities and couplings, we note that the modern theory of PTCP is largely concerned with an investigation and classification of the singular properties of systems (see e.g. Buckingham 1972), specifically the singularities which could occur on the hypersurface of the entropy density, or the appropriate free-energy density, which defines the state of the system. However, we should be forewarned that the account of statistical mechanics in Sect. 3 concludes that:
FSM-3: The association of PTCP with singularities in the entropy and free-energy densities which is made in thermodynamics can be made in statistical mechanics only for infinite systems.
The association of PTCP with singularities in both TD3 and SM3 leads to a tendency for them to be mistakenly conflated. (We shall discuss this in more detail in relation to limit reduction in Sect. 5.1.) We now consider three thermodynamic spaces, the density counterparts of the spaces Ξ₀, Ξ₁ and Ξ₂ defined in Sect. 2.1, in which densities replace extensive variables. In reverse order, since this is more heuristically transparent:
(i) In the space Ξ₂ of the vector 𝜻 := (ζ₁, ζ₂) the free-energy density φ₂(ζ₁, ζ₂) is a surface with normal in the direction (1, −u, x) and phases are separated by lines of transitions. The simplest example is a line L⋆ across which there is a discontinuity of the gradient ∇φ₂ = (u, −x); an isothermal section (ζ₁ constant) of this surface is shown in Fig. 2. The point 𝜻⋆ := (ζ₁⋆, ζ₂⋆) on L⋆ can be regarded as representing the coexistence of two phases with different densities. As 𝜻 is varied across L⋆ through 𝜻⋆ there is a first-order phase transition where the densities change discontinuously. In the case of both fluid and magnetic systems a first-order transition will involve a discontinuity of the internal energy density u. In a fluid system there will be a discontinuity of (physical) density as the system changes between a liquid and a gas. In a magnetic system there will be a discontinuity in the magnetization (or equivalently the magnetization density) as shown for the Ising model in Fig. 9.
(ii) In the space Ξ₁ of the vector (ζ₁, x) the free-energy density φ₁(ζ₁, x) is a surface convex with respect to x with normal in the direction (1, −u, −ζ₂), as shown by an isothermal (ζ₁ = ζ₁⋆) section in Fig. 3. A first-order transition corresponds to the part of the isotherm, labelled C⋆, which is linear with respect to x. At the ends (ζ₁⋆, x^(⋆+)) and (ζ₁⋆, x^(⋆−)) of C⋆ all three couplings ζ₁, ζ₂ and ζ₃ have the same values as is otherwise shown in Fig. 2. Typically, as ζ₁⋆ varies along L⋆ the ends of C⋆ converge to a critical point where the system exhibits a second-order transition. There the densities are continuous but one or more of the response functions (that is to say the curvature components of the free-energy surface) is singular.²¹ A projection of the linear coexistence region in Fig. 3 is shown in Fig. 4, and the situation where the corresponding transition line L⋆ terminates is shown in Fig. 5.
(iii) The space Ξ₀ of the vector (u, x), in which the entropy density s(u, x) is a concave surface, is similar to that for φ₁(ζ₁, x),²² except that now the linear generator C⋆ of the coexistence region has endpoints (u^(⋆+), x^(⋆+)) and (u^(⋆−), x^(⋆−)). As 𝜻⋆ varies along L⋆, C⋆ traces out the boundary of a ruled²³ region on the entropy surface with C⋆ converging in one direction to the critical point described in (ii).
Fig. 3 A first-order transition showing as the linear section C⋆ in an isothermal section of the φ₁ surface.
Fig. 4 A first-order transition showing as a horizontal part C⋆ of an isotherm of ζ₂ plotted against x together with the isotherm through the critical point C. As ζ₁ varies the ends of C⋆ trace the boundary of the coexistence region (shaded).
Fig. 5 The first-order transition (coexistence curve) ζ₂ = ζ₂⋆(ζ₁) is represented by a broken line and the critical isochore, along which the density x takes its critical value x = x_c, by a dotted line. The directions of the axes of the two relevant scaling fields θ₁ and θ₂ at the critical point 𝜻c := (ζ₁c, ζ₂c), as described in Sect. 2.4, are shown.
Critical exponents at the critical point are associated with the curvature of the coexistence curve in Ξ₁ and the coexistence line in Ξ₂, and the asymptotic singular behaviour of the (per particle) heat capacities c_x and c_ξ at constant density and field respectively and a response function ϕ_T, which in a fluid corresponds to the compressibility and in a magnet to the susceptibility. It will also be useful to include the coefficient of thermal expansion α_ξ. These are defined together with their critical exponents in Appen. A. The heat capacities c_x and c_ξ are normally positive and from (A.4) it follows that, if ϕ_T > 0, then c_ξ dominates both c_x and α_ξ²/ϕ_T as T → T_c. For the critical exponents σ and σ′ characterizing the singularity of c_x on approach to the critical point from above and below T_c, and the analogously defined critical exponents α and α′ characterizing the singularity of c_ξ, and γ and γ′ characterizing the singularity of ϕ_T, as well as β characterizing the curvature of the coexistence curve, this means that the inequalities (2.21) hold. The condition ϕ_T > 0 is true for a magnetic system and in this case the third inequality in (2.21) was first established by Rushbrooke (1963). Inequality (2.22) was obtained by Griffiths (1965) for both magnetic and fluid systems using the convexity properties of the free energy. In fact it is a consequence of scaling theory (Sect. 2.4) that, for systems with a special symmetry which is present in magnetic systems where, as for the Ising model in Appen. B, the coexistence curve coincides with the zero field axis, σ′ = α′ and inequalities (2.21) and (2.22) become identical. Otherwise σ′ = γ′. Griffiths (1965) also derived a number of other inequalities, in particular (2.23), where δ, given by (A.8), is the exponent characterizing the (critical) equation of state.
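For orientation, the exponent inequalities of this type can be tested against known exponent values. The sketch below uses the forms in which the Rushbrooke and Griffiths inequalities are commonly quoted, α′ + 2β + γ′ ≥ 2 and α′ + β(1 + δ) ≥ 2; these are assumed to correspond to the inequalities discussed above, and the paper's own labelling in (2.21)-(2.23) may differ. The exponent values are the commonly quoted ones for the d = 2 Ising model and for mean-field theory, with primed and unprimed exponents taken equal; they are not copied from Appen. B.

```python
from fractions import Fraction as F

# Commonly quoted exponent sets; assumed here for illustration only.
EXPONENTS = {
    "ising_d2": dict(alpha=F(0), beta=F(1, 8), gamma=F(7, 4), delta=F(15)),
    "mean_field": dict(alpha=F(0), beta=F(1, 2), gamma=F(1), delta=F(3)),
}

def rushbrooke(alpha, beta, gamma, **_unused):
    """Left-hand side of alpha + 2*beta + gamma >= 2."""
    return alpha + 2 * beta + gamma

def griffiths(alpha, beta, delta, **_unused):
    """Left-hand side of alpha + beta*(1 + delta) >= 2."""
    return alpha + beta * (1 + delta)

if __name__ == "__main__":
    for name, exps in EXPONENTS.items():
        print(name, rushbrooke(**exps), griffiths(**exps))
```

Exact fractions are used so that the equality case can be tested without rounding error; for both exponent sets the two expressions take the value 2, so the inequalities hold as equalities.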

Thermodynamics with Scaling Theory
In view of our aim to keep as distinct as possible the developments of thermodynamics and statistical mechanics, we choose here to present scaling theory as a mathematical axiomatization of the properties of PTCP in thermodynamics, although, as we see below, it has deep roots in, and is substantiated by, statistical mechanics, in particular renormalization group theory,²⁴ where, in almost all cases,²⁵ the realization of this picture of scaling involves approximations and yields scaling forms of only local validity.
Originating in the work of (among others) Widom (1964, 1965) and Kadanoff (1966), our approach is essentially that of Hankey and Stanley (1972). Given here in brief outline,²⁶ it is sufficient for an analysis of power-law singularities in the critical region.²⁷ Suppose we have the free-energy density of a system in terms of its maximum number of independent couplings. In the discussion above that maximum number was two, but for the moment we generalize to n couplings so the free-energy density is φₙ(𝜻), where 𝜻 := (ζ₁, ζ₂, …, ζₙ), which is represented as a hypersurface of dimension n in the (n + 1)-dimensional space (φₙ, 𝜻). Now suppose that there is a critical region C of dimension n − s. Although φₙ(𝜻) itself is continuous and finite across and within C it may have discontinuous first-order derivatives, meaning that C is a region of phase coexistence with a first-order transition when, as is shown in Fig. 2, the phase point crosses through C, or it may have singular second-order derivatives in C, as is the case in the situation described above where a line of first-order transition terminates at a critical point.²⁸ With respect to some origin 𝜻• ∈ C a system of orthogonal curvilinear coordinates θ₁, θ₂, …, θₙ called scaling fields is constructed. These are smooth functions of the couplings which parameterize C so that θ₁ = ··· = θₛ = 0 within C. The scaling fields in this subset are called relevant with those in the remaining subset θₛ₊₁, θₛ₊₂, …, θₙ, called irrelevant, acting as a local set of coordinates within C.²⁹ The free-energy density φₙ(𝜻) is separated into two parts
$$ \varphi_n(\boldsymbol{\zeta}) = \varphi_{\mathrm{smth}}(\boldsymbol{\zeta}) + \varphi_{\mathrm{sing}}(\triangle\boldsymbol{\zeta}), \tag{2.24} $$
where φ_smth(𝜻) is a regular function and, with △𝜻 := 𝜻 − 𝜻•, φ_sing(△𝜻), for which φ_sing(𝟎) = 0, contains all the non-smooth parts of φₙ(𝜻) in C. It is now assumed that φ_sing(△𝜻) can be re-coordinated in terms of the scaling fields so that it is a generalized homogeneous function satisfying the Kadanoff scaling hypothesis³⁰
$$ \varphi_{\mathrm{sing}}(\lambda^{y_1}\theta_1, \ldots, \lambda^{y_n}\theta_n) = \lambda^{d}\,\varphi_{\mathrm{sing}}(\theta_1, \ldots, \theta_n), \tag{2.25} $$
for all real λ > 0, where d is the physical dimension of the system, and y_j, j = 1, 2, …, n, are scaling exponents satisfying
$$ y_j > 0, \quad j = 1, \ldots, s, \qquad y_j < 0, \quad j = s+1, \ldots, n. \tag{2.26} $$
The exponents in the first subset are, like the corresponding scaling fields, called relevant and those in the latter subset are called irrelevant.³¹ Of the assumptions made here, that scaling fields can be derived is not particularly demanding; at the very least it is usually straightforward to obtain their linear parts near to the origin. And the division of the free-energy density (2.24) into smooth and singular parts has very little content until we explore in more detail the consequences of the scaling hypothesis (2.25), which we now do for the case of a critical point terminating a coexistence curve.
24 The assumption (2.24) that, near to a critical region, the free-energy density can be divided into smooth and singular parts is justified in terms of the form of the Hamiltonian (Hankey and Stanley 1972, p. 3519).
25 An exception being the one-dimensional Ising model (see e.g. Lavis 2015, Sect. 15.5.1).
26 For a more detailed account see, for example, Lavis (2015, Chap. 4).
27 For statistical mechanical systems like the Ising model with d = 2 which exhibit logarithmic singularities, it has been shown by Nightingale and 't Hooft (1974) that a slight generalization needs to be used.
28 Or there may be discontinuities or singularities in higher-order derivatives; but we shall for simplicity concentrate solely on cases involving first- and second-order derivatives.
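To make the scaling hypothesis (2.25) concrete, the following sketch builds a toy singular part with two relevant scaling fields and checks the generalized homogeneity numerically. The functional form |θ₁|^(d/y₁) g(θ₂/|θ₁|^(y₂/y₁)) with g(z) = 1 + z², the function names and the default exponent values are all illustrative assumptions; any choice of g yields a function satisfying (2.25) for θ₁ ≠ 0.

```python
def phi_sing(theta1, theta2, y1=1.0, y2=1.875, d=2.0):
    """Toy singular free-energy density with two relevant scaling fields.

    Constructed as |theta1|^(d/y1) * g(theta2 / |theta1|^(y2/y1)) with the
    hypothetical choice g(z) = 1 + z^2. Requires theta1 != 0.
    """
    scale = abs(theta1) ** (d / y1)
    z = theta2 / abs(theta1) ** (y2 / y1)
    return scale * (1.0 + z * z)

def scaling_defect(theta1, theta2, lam, y1=1.0, y2=1.875, d=2.0):
    """Left side minus right side of the scaling hypothesis (2.25)."""
    lhs = phi_sing(lam ** y1 * theta1, lam ** y2 * theta2, y1, y2, d)
    rhs = lam ** d * phi_sing(theta1, theta2, y1, y2, d)
    return lhs - rhs

if __name__ == "__main__":
    for lam in (0.5, 2.5, 10.0):
        print(lam, scaling_defect(0.3, 0.05, lam))
```

The defect is zero up to floating-point rounding for every λ > 0, which is what it means for the function to be generalized homogeneous with exponents y₁, y₂ and degree d.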
There are many general accounts of scaling theory, treating a variety of critical phenomena. Here we restrict attention to the case of a critical point terminating a line of first-order transitions, as shown in Fig. 5. So we have two critical regions. The first is the critical point, with two relevant scaling fields and scaling exponents, with axes chosen perpendicular to and along the coexistence curve. For this we shall show that the critical exponents defined in Appen. A can be expressed in terms of the two scaling exponents. The second is the coexistence curve, which has one relevant and one irrelevant scaling field constructed with respect to some chosen origin (not shown in Fig. 5) on the coexistence curve. To simplify our presentation further we restrict attention to a simple ferromagnetic system with $\xi := H$, the magnetic field, $X := M$, the magnetization, and $x := m = M/N$, the magnetization density. The coupling $\zeta_1$ is the thermal coupling, so we relabel it as $\zeta_T = \varepsilon/T$, and $\zeta_2$ is the field coupling, which we relabel as $\zeta_H = H/T$. This model, of which an example in statistical mechanics is the Ising model described in Appen. B, has the advantage of having the special symmetry that the coexistence curve lies along the zero-field axis in an interval $T \in [0, T_c]$ with $H_c = m_c = 0$. This axis with $T > T_c$ is the critical isochore. Thus (referring to Fig. 5) the coexistence curve lies along the $\zeta_H = 0$ axis in an interval $[\zeta_{Tc}, \infty)$. This same phase diagram for the Ising model, now plotted with respect to the temperature $T$ and the magnetic field $H$, is shown in Fig. 8.
29 'Relevance' here refers to their role in an understanding of the nature of the criticality in $C$.
30 The physical dimension $d$ of the system is not something which plays a significant role in most of thermodynamics. It is included here to bring compatibility with the discussion of statistical mechanics. It could be removed by redefining $\lambda$.
31 We have for the sake of simplicity excluded the possibility of a zero exponent; such an exponent is called marginal. Marginal exponents are associated in renormalization group theory with an 'underlying' parameter of the system, often resulting in lines of fixed points as we see in our treatment of the one-dimensional Ising model in Sect. 3.4.3. It will also be assumed that no exponent is complex. In practice this is not always the case (see e.g. Lavis 2015, Sect. 15.5.2), but situations arising from complex exponents are not difficult to interpret in particular examples.
We consider separately the critical point and the coexistence curve, beginning with the critical point, where we can take the scaling fields to be those of (2.27). The scaling hypothesis (2.25) becomes (2.28). However, the analysis can be shortened by a closer examination of the way that the expression (2.32) for $\beta$ was obtained. From this we see that the scaling exponent $y_H$ in the numerator indicates that differentiation was once with respect to $\zeta_H$, and that the approach was in the direction of varying $\zeta_T$ is indicated by the scaling exponent $y_T$ in the denominator. So with the same reasoning it follows from (A.8) that $\delta$ is given by (2.33) and, bearing in mind that the analysis yields singularities for response functions so that $\phi_{\rm smth}$ can play no role, from (A.7) that $\gamma$ is given by (2.34). When we come to consider $c_\xi := c_H$, given by (A.3), the situation becomes a little more complicated, since there are three terms and we need to know which dominates as the critical point is approached. This will depend on the relative magnitudes of $y_T$ and $y_H$, and it can be shown (Lavis, ibid, Sect. 4.5.1) that, in general for a critical point terminating a line of first-order transitions, the exponent associated with approaches tangential to the coexistence curve is smaller (less relevant) than that associated with an approach at a non-zero angle to this curve. These are called respectively weak and strong approaches, and in the present context we have $y_H > y_T$, with $y_T$ the weak and $y_H$ the strong exponent. Returning to the formula for $c_H$ in (A.3), we see that the third term on the right-hand side would be the one that dominates, meaning that, from (A.9), $\sigma = \sigma' = \gamma$. However, because of the symmetry of the magnetic model, $\zeta_{2c} := \zeta_{Hc} = 0$ and the only remaining term is the first, meaning that the exponent is given by (2.35). Finally we need to determine the asymptotic form for $c_x := c_m$ using (A.4). Here the situation needs a more detailed analysis, from which it can be shown (Lavis, ibid, Sect. 4.5.4) that, whether or not the magnetic symmetry applies, cancellation of coefficients leads to an asymptotic form equivalent to that of a second-order derivative with respect to $\zeta_T$; that is, to (2.36). This means that it is the asymptotic form of the heat capacity with constant intensive variable (pressure or magnetic field) which is dependent on symmetry. In the magnetic system the exponent is the same as that of the heat capacity with constant extensive variable (the magnetization), and in a fluid, where there is no symmetry, it is equal to that of $\varphi_T$, which is the compressibility. Equations (2.32)-(2.36) are formulae for the exponents $\alpha$, $\beta$, $\gamma$ and $\delta$ in terms of $y_T$ and $y_H$. They are, therefore, not independent and two relationships exist between them. These can be expressed in the form $\alpha + 2\beta + \gamma = 2$, called the Essam-Fisher scaling law, which is a strengthening of the inequality (2.22), and $\gamma' = \beta(\delta - 1)$, called the Widom scaling law, which is a strengthening of the inequality (2.23).
For the coexistence curve, scaling fields, chosen with respect to some arbitrary with $y'_T$ and $y'_H$ irrelevant and relevant exponents respectively. In general it can be shown that relevant exponents are less than or equal to $d$, meaning in this case that $0 < y'_H \le d$. With primes attached to the exponents and fields, (2.29) and (2.30) continue to apply to the magnetization density. If $y'_H < d$,
$$\frac{\partial \phi_{\rm sing}}{\partial \theta_H}(0, 0) = 0 \tag{2.38}$$
and $m$ is continuous at the origin; there is no first-order phase transition. If $y'_H = d$ then (2.38) does not necessarily hold. There may be a contribution to (2.29) from the derivative of $\phi_{\rm sing}$. This will be the only way in which the magnetization can be discontinuous across the coexistence curve. So a scaling exponent equal to $d$ is a necessary, but not sufficient, condition for a first-order transition. An example of such a first-order transition with an exponent of $d$ is at zero temperature in the one-dimensional Ising model (Sect. 3.4.3(a)).
Discontinuities in higher-order derivatives can be treated in a similar way.
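As a check on the statement that the exponents are not independent, the following minimal sketch evaluates the standard scaling-theory expressions for $\alpha$, $\beta$, $\gamma$ and $\delta$ in terms of $d$, $y_T$ and $y_H$ and confirms the two scaling laws in exact rational arithmetic. The closed forms used are the usual consequences of (2.25) for two relevant fields and are assumed here to coincide with (2.32)-(2.36); the function name and the choice of the two-dimensional Ising values $y_T = 1$, $y_H = 15/8$ are ours, for illustration only.

```python
from fractions import Fraction


def critical_exponents(d, y_T, y_H):
    """Standard scaling expressions (assumed to match (2.32)-(2.36) of the text)."""
    d, y_T, y_H = Fraction(d), Fraction(y_T), Fraction(y_H)
    return {
        "alpha": 2 - d / y_T,          # two derivatives with respect to zeta_T
        "beta": (d - y_H) / y_T,       # one derivative with respect to zeta_H
        "gamma": (2 * y_H - d) / y_T,  # two derivatives with respect to zeta_H
        "delta": y_H / (d - y_H),      # critical isotherm
    }


# Illustrative values: the two-dimensional Ising model.
exps = critical_exponents(2, 1, Fraction(15, 8))

# Essam-Fisher: alpha + 2*beta + gamma = 2; Widom: gamma = beta*(delta - 1).
essam_fisher = exps["alpha"] + 2 * exps["beta"] + exps["gamma"]
widom_gap = exps["gamma"] - exps["beta"] * (exps["delta"] - 1)

for name, value in exps.items():
    print(name, value)
print("Essam-Fisher sum:", essam_fisher, " Widom gap:", widom_gap)
```

Both laws hold identically in $y_T$ and $y_H$, so the particular numerical values chosen play no role beyond making the output concrete.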

Dimensionality and Phase Transitions
Although, as we have seen, thermodynamics, and particularly its treatment of PTCP, assumes that the system is infinite, the dimension $d$ of the system entered into the discussion in Sect. 2.4. And once dimensionality has entered then finiteness has also appeared. Thus, for example, a two-dimensional system can be viewed as a three-dimensional system of 'thickness' one in the third dimension, and it is only a small step from there to increase the thickness to two. In Sect. 1 we referred to the classification of singularities in terms of universality classes. This, as we asserted, can be discussed only in the context of statistical mechanics, with $d$ one of the factors determining the universality class of an occurrence of singular behaviour. If the number of directions in which the system is infinite is increased, then its critical behaviour will change from one universality class to another. This is an example of what in scaling and renormalization group theory is called 'crossover'. 32 The dimension of the system affects not just the universality class of singular behaviour but whether it occurs at all. However, that dimension is not $d$ but $\mathfrak{d} \le d$, the number of directions in which the system is infinite. 33 And the final message sent from statistical mechanics to thermodynamics is that:
FSM-4: There exists a lower-critical dimension $d_{\rm LC}$ such that, if $\mathfrak{d} \le d_{\rm LC} < d$, singular behaviour can occur in the fully-infinite system but not in the partially-infinite system. If $d > \mathfrak{d} > d_{\rm LC}$ then singular behaviour can occur in both, but in different universality classes.

From Gibbsian Statistical Mechanics to the Renormalization Group
The move from thermodynamics to statistical mechanics is, we shall argue, an enrichment and substantiation of the picture we have of any system under investigation. This operates at two levels. The first is structural, where renormalization group theory embedded in statistical mechanics provides a fuller picture in terms of renormalization group transformations and fixed points than scaling theory embedded in thermodynamics. The second is in the provision of specific models which arise from assumptions about the microstructure of the system. We now consider the development represented by the right-hand column in Fig. 1, beginning with the basic structure of statistical mechanics.
32 Of course, such a change of universality class is counter-factual, in the sense that one cannot change the dimension or extensivity properties of a real system.
33 The connection between the thermodynamic limit and extensivity is retained in a partially-infinite system with $N_k$ sites in the $k$-direction and $N_1 N_2 \cdots N_d = N$, when, in the case, for example, of entropy, (2.13) is replaced by

Inter-Theory Connecting Relationships
Let the microstate of the system be given by a value of the vector variable $\boldsymbol{\sigma}$ in the phase space $\Gamma$. In the case of a fluid system $\boldsymbol{\sigma}$ will be a set of values for the positions and momenta of all the particles; for a spin system on a lattice, like the Ising model in Appen. B, $\boldsymbol{\sigma}$ will be the set of values of all the spin variables. The microscopic and macroscopic structure of the system is then determined by the Hamiltonian. This is an explicit function of the independent couplings, with the independent extensive variables imposing constraints on $\boldsymbol{\sigma}$. Thus we have three cases:
(i) When $(U, X, N) \in \Xi_0$ are the independent variables the Hamiltonian is $H_0(\boldsymbol{\sigma}; X, N)$, with values constrained by (3.1) and $\boldsymbol{\sigma}$ constrained, according to the nature of the particular model, by $X$ and $N$. 34
(ii) When $(\zeta_1, X, N) \in \Xi_1$ are the independent variables the Hamiltonian $H_1(\boldsymbol{\sigma}; \zeta_1, X, N)$ is a linear function of $\zeta_1$. The constraint (3.1) is removed but $\boldsymbol{\sigma}$ remains constrained by $X$ and $N$.
(iii) When $(\zeta_1, \zeta_2, N) \in \Xi_2$ are the independent variables the Hamiltonian $H_2(\boldsymbol{\sigma}; \zeta_1, \zeta_2, N)$ is a linear function of $\zeta_1$ and $\zeta_2$. The only remaining constraint is from $N$.
Connecting relationships are now invoked in three stages: FTD-1 : The independent variables in Ξ 0 , Ξ 1 and Ξ 2 are endowed with their thermodynamic meanings.
To proceed to the next stage of the inter-theory connecting process we need to give a form in cases (i), (ii) and (iii), respectively, for the entropy, and the free energies Φ 1 and Φ 2 . Case (i) gives the microcanonical distribution 35 and cases (ii) and (iii) give, respectively, the canonical distribution and the constant pressure or magnetic field distribution. For the sake of simplicity we concentrate exclusively on case (iii), where the Gibbs free energy is defined by

34 $N$ fixes the number of particles in $\boldsymbol{\sigma}$. If the system is a fluid, with $X := V$ the volume of its container, then this constrains the range of the configuration component of $\boldsymbol{\sigma}$. Rather less physically achievable, if the system is a magnet, with $X := M$ the magnetization, this will constrain the spin configuration of the microsystems.
35 Where there is still some dispute about the appropriate form for the entropy (see, for example, Lavis 2019, and references therein).
This completes a sufficient set of the connecting relationships. However, we can make some further links. Suppose that Then, from (3.2)-(3.5),
FTD-3: $U$ and $X$ are identified respectively as the expectation values of $U(\boldsymbol{\sigma})$ and $X(\boldsymbol{\sigma})$ with respect to the probability distribution with density (3.8).
And it further follows from (3.2)-(3.8) that where $\varphi_T$ is the response function given by (A.3). This is an example of a fluctuation-response function relationship. Similar relationships apply to $U(\boldsymbol{\sigma})$ and all uncontrolled extensive variables.
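A fluctuation-response relationship of this kind can be checked directly on a very small system. The sketch below is an illustration under stated assumptions, not the text's own formulation: it takes a ring of $n$ Ising spins with Boltzmann weight $\exp(K\sum_i s_i s_{i+1} + h\sum_i s_i)$, where the names `K` and `h` and the sign convention are ours, and compares the numerical derivative of $\langle M \rangle$ with respect to the field coupling with the variance of $M$.

```python
import itertools
import math


def ring_stats(n, K, h):
    """Exact mean and variance of M = sum(s) on an n-site Ising ring.

    Assumed weight: exp(K * sum_i s_i s_{i+1} + h * sum_i s_i).
    """
    Z = m1 = m2 = 0.0
    for s in itertools.product((-1, 1), repeat=n):
        M = sum(s)
        bonds = sum(s[i] * s[(i + 1) % n] for i in range(n))
        w = math.exp(K * bonds + h * M)
        Z += w
        m1 += w * M
        m2 += w * M * M
    mean = m1 / Z
    return mean, m2 / Z - mean ** 2


def response(n, K, h, dh=1e-5):
    """d<M>/dh by central difference."""
    up, _ = ring_stats(n, K, h + dh)
    down, _ = ring_stats(n, K, h - dh)
    return (up - down) / (2 * dh)


n, K, h = 6, 0.4, 0.3
_, variance = ring_stats(n, K, h)
print("response:", response(n, K, h), " variance:", variance)
```

The two printed numbers agree to within the accuracy of the finite difference, which is the content of the relationship for this model: the response of an uncontrolled extensive variable to its conjugate coupling is its variance.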

Correlation Function and Correlation Length
As is already evident, thermodynamics is a 'black-box' theory with a set of macrovariables some of which are independent and controllable and others whose values change in response to the changes in the independent variables. The only concession made to internal structure was, in Sect. 2.1, to allow a counting of the number N of mass units of the system. Now with the 'enrichment' provided by statistical mechanics we are able to record the microstate σ σ σ of the system, which is simply the aggregate of the states of the individual microsystems.
Suppose that we take the $d$-dimensional hypercubic lattice 37 $\mathcal{N}_d$ with sites $\boldsymbol{r} := (n_1, n_2, \ldots, n_d)a$, for $n_k = 1, 2, \ldots, N_k$ with $N = N_1 N_2 \cdots N_d$, where $a$ is the lattice spacing. 38 Then, given that the states of the microsystems on sites $\boldsymbol{r}$ and $\boldsymbol{r}'$ of the lattice are $\sigma(\boldsymbol{r})$ and $\sigma(\boldsymbol{r}')$, respectively, how does the state of one affect the state of the other; that is to say, how are their states correlated? More specifically, how is the correlation between $\sigma(\boldsymbol{r})$ and $\sigma(\boldsymbol{r}')$ affected by: (i) the distance $|\boldsymbol{r} - \boldsymbol{r}'|$ between the sites? (ii) the closeness of the thermodynamic state of the system to a critical region?
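To begin to answer these questions suppose that, as for the Hamiltonian (3.6) in the Ising model in Appen. B, $X(\boldsymbol{\sigma})$ is a linear sum of the states on the sites of $\mathcal{N}$. And (temporarily) suppose that the coupling $\zeta_2$ takes different values $\zeta_2(\boldsymbol{r})$ at the sites. Then, denoting the set of couplings $\zeta_2(\boldsymbol{r})$ by the vector $\boldsymbol{\zeta}_2$, and from (3.8), the expectation value $\langle\sigma(\boldsymbol{r})\rangle$ of $\sigma(\boldsymbol{r})$ is given by (3.11). If the states $\sigma(\boldsymbol{r})$ and $\sigma(\boldsymbol{r}')$ are uncorrelated, $\langle\sigma(\boldsymbol{r})\sigma(\boldsymbol{r}')\rangle$ will factor into $\langle\sigma(\boldsymbol{r})\rangle\langle\sigma(\boldsymbol{r}')\rangle$. So
$$\Gamma(\boldsymbol{r}, \boldsymbol{r}'; \zeta_1, \boldsymbol{\zeta}_2) := \langle\sigma(\boldsymbol{r})\sigma(\boldsymbol{r}')\rangle - \langle\sigma(\boldsymbol{r})\rangle\langle\sigma(\boldsymbol{r}')\rangle, \tag{3.12}$$
called the pair correlation function, is a measure of the degree of correlation between $\sigma(\boldsymbol{r})$ and $\sigma(\boldsymbol{r}')$. If all the couplings $\zeta_2(\boldsymbol{r})$ are set equal to $\zeta_2$, it follows from (A.3) that
$$\sum_{\{\boldsymbol{r}, \boldsymbol{r}'\}} \Gamma(\boldsymbol{r}, \boldsymbol{r}'; \zeta_1, \zeta_2) = N \varphi_T, \tag{3.13}$$
which is a fluctuation-response function relationship. If translational invariance is assumed, then $\Gamma(\boldsymbol{r}, \boldsymbol{r}'; \zeta_1, \zeta_2) = \Gamma(\tilde{\boldsymbol{r}}; \zeta_1, \zeta_2)$, where $\tilde{\boldsymbol{r}} := \boldsymbol{r} - \boldsymbol{r}'$, and
$$\sum_{\{\tilde{\boldsymbol{r}}\}} \Gamma(\tilde{\boldsymbol{r}}; \zeta_1, \zeta_2) = \varphi_T. \tag{3.14}$$
37 The restriction of this presentation to a hypercubic lattice is in the interests of simplicity. It can easily be generalized to other lattices.
38 It is convenient for our discussions to suppose that the microsystems are confined to the sites of a lattice. It is, of course, the case that a whole area of statistical mechanics concerned with fluid systems treats the case of microsystems/molecules moving in a continuum.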
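The sum rule relating the pair correlation function to the response function can be verified by brute force on a small ring. The following sketch makes the same assumptions as any small exact enumeration: an Ising ring with weight $\exp(K\sum_i s_i s_{i+1} + h\sum_i s_i)$, with names and sign convention chosen by us. It computes the connected correlation $\langle s_i s_j\rangle - \langle s_i\rangle\langle s_j\rangle$ for every pair and compares the double sum with the variance of the total magnetization.

```python
import itertools
import math


def pair_correlations(n, K, h):
    """Connected pair correlations and Var(M) on an n-site Ising ring.

    Assumed weight: exp(K * sum_i s_i s_{i+1} + h * sum_i s_i).
    Returns (gamma, var_M) with gamma[i][j] = <s_i s_j> - <s_i><s_j>.
    """
    states = list(itertools.product((-1, 1), repeat=n))
    weights = [
        math.exp(K * sum(s[i] * s[(i + 1) % n] for i in range(n)) + h * sum(s))
        for s in states
    ]
    Z = sum(weights)
    mean = [sum(w * s[i] for w, s in zip(weights, states)) / Z for i in range(n)]
    gamma = [
        [
            sum(w * s[i] * s[j] for w, s in zip(weights, states)) / Z
            - mean[i] * mean[j]
            for j in range(n)
        ]
        for i in range(n)
    ]
    mean_M = sum(mean)
    var_M = sum(w * sum(s) ** 2 for w, s in zip(weights, states)) / Z - mean_M ** 2
    return gamma, var_M


gamma, var_M = pair_correlations(6, 0.5, 0.2)
print("sum over pairs:", sum(map(sum, gamma)), " Var(M):", var_M)
print("translation invariance:", gamma[0][2], gamma[1][3])
```

On a ring with periodic boundary conditions the correlation depends only on the separation of the two sites, so the double sum reduces to $N$ times a single sum over separations.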
We are now able to augment the scaling theory, described in Sect. 2.4, by applying it to the correlation function and correlation length. Again adopting the magnetic model used in Sect. 2.4, suppose that near a critical point these functions can be re-expressed in terms of the scaling fields $\theta_T$ and $\theta_H$; $\tilde{\boldsymbol{r}}$ and $\boldsymbol{k}$ can also be treated as scaling fields which, on dimensional grounds, will have exponents $-1$ and $+1$ respectively. Then the relationships (3.14) between the correlation function and the response function $\varphi_T$, together with the formula (A.11) derived from Ginzburg-Landau theory, suggest a scaling form 41 for the correlation function, and hence for its Fourier transform. Then, from (3.15), the scaling form for the correlation length is Then, from (2.34) and (2.36), $\nu(2 - \eta) = \gamma$, which is the Fisher scaling law (Fisher 1969), and $d\nu = 2 - \alpha$, which is the Josephson hyper-scaling law (Josephson 1967). 42
39 Given by
40 The prefactor $c(\mathfrak{d})$ is dependent on the number of dimensions $\mathfrak{d}$ in which the system is infinite. It can be shown from Ginzburg-Landau theory that $c(\mathfrak{d}) = 1/(2\mathfrak{d})$ (Lavis 2015, Sect. 5.6).
41 The exponent of minus one for $\tilde{\boldsymbol{r}}$ is chosen on dimensional grounds. It is also equivalent to the rescaling of length in the renormalization group (item (iii) in Sect. 3.4.1).
42 This is the only scaling law which involves the dimension $d$ of the system. For reasons which become evident if Ginzburg-Landau theory is used in the Gaussian approximation (Lavis 2015, Sect. 5.6) it becomes invalid when $d > d_{\rm UC}$, the upper-critical dimension. This is the dimension such that, when $d \ge d_{\rm UC}$, critical exponents become dimensionally independent with the classical values given by, for example, the van der Waals fluid. For the Ising and similar non-quantum systems (see Appen. B) $d_{\rm UC} = 4$.
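The route from the scaling forms to the Fisher and Josephson laws can be sketched as follows. The scaling relations written here for $\Gamma$ and $\xi$ are the standard ones and are assumed, not confirmed, to agree with the displayed forms of the text; the expressions for $\alpha$ and $\gamma$ are the usual consequences of (2.25).

```latex
% Assumed scaling forms, with \tilde{r} carrying exponent -1:
\Gamma(\lambda^{-1}\tilde{r};\,\lambda^{y_T}\theta_T,\,\lambda^{y_H}\theta_H)
   = \lambda^{2(d-y_H)}\,\Gamma(\tilde{r};\,\theta_T,\,\theta_H),
\qquad
\xi(\lambda^{y_T}\theta_T,\,\lambda^{y_H}\theta_H)
   = \lambda^{-1}\,\xi(\theta_T,\,\theta_H).
% At \theta_T=\theta_H=0 with \lambda=|\tilde{r}|, and at \theta_H=0 with
% \lambda=|\theta_T|^{-1/y_T}:
\Gamma \sim |\tilde{r}|^{-2(d-y_H)} \equiv |\tilde{r}|^{-(d-2+\eta)}
   \;\Rightarrow\; \eta = d + 2 - 2y_H,
\qquad
\xi \sim |\theta_T|^{-1/y_T} \;\Rightarrow\; \nu = 1/y_T .
% Combining with \gamma=(2y_H-d)/y_T and \alpha = 2 - d/y_T:
\nu(2-\eta) = \frac{2y_H - d}{y_T} = \gamma,
\qquad
d\,\nu = \frac{d}{y_T} = 2 - \alpha .
```

Only the second law retains an explicit $d$ after the scaling exponents are eliminated, which is the sense in which it alone is a hyper-scaling law.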

Transfer-Matrix Methods
As we have already shown, $S$, $\Phi_1$ and $\Phi_2$ are all extensive functions of their extensive variables or none of them is. The message FSM-2 sent from statistical mechanics to thermodynamics is that the latter is the case, and in particular that the extensivity condition (3.20) is true only as an approximation for large systems. 43 We shall now substantiate this claim by considering a particular way to develop statistical mechanical models, namely the method of transfer matrices. Although, of course, statistical mechanics can model systems of microsystems (molecules) moving, as in a fluid, through a continuum of points, transfer matrix methods are restricted to microsystems confined to the points of a lattice. In principle lattices of any dimension can be considered, but we shall, for ease of presentation, consider only the two-dimensional case. A virtue of this development is that it can be clearly seen how it unfolds as the two lattice directions in which the system gets larger and then infinite are applied separately.
Consider a square lattice, of lattice spacing $a$, with $N_H$ sites in the horizontal direction and $N_V$ in the vertical direction, so that $N = N_H N_V$. This situation is like the one considered for finite-size scaling in Sect. 3.4.2, when extensivity can be considered separately in the two directions. Periodic boundary conditions are applied so that the lattice forms a torus with horizontal rings of $N_H$ sites and rings in a vertical plane of $N_V$ sites. 44 We suppose that the sites of the lattice are occupied by identical microsystems having $\nu$ possible states. 45 The state of the whole system is $\boldsymbol{\sigma} := (\tilde{\boldsymbol{\sigma}}_1, \tilde{\boldsymbol{\sigma}}_2, \ldots, \tilde{\boldsymbol{\sigma}}_{N_H})$, where $\tilde{\boldsymbol{\sigma}}_i$, the state of the $i$-th vertical ring of sites, has one of $N_R := \nu^{N_V}$ values. Given that contributions to the Hamiltonian arise (at least in the horizontal direction) only between first-neighbour sites, the Hamiltonian can be decomposed into interactions between neighbouring rings of sites and within rings. The latter can be distributed between interacting pairs of rings, so that the Hamiltonian takes the form of the sum of contributions of interactions between rings, and it is straightforward to show that the partition function is expressible in the form
$$Z_2(\zeta_1, \zeta_2, N) = \mathrm{Trace}\{\boldsymbol{V}^{N_H}\}, \tag{3.21}$$
where $\boldsymbol{V}$ is the $N_R$-dimensional transfer matrix with elements consisting of the exponentials of the negatives of the inter-ring interactions. Assuming that $\boldsymbol{V}$ is diagonalizable, 46 it is an elementary algebraic result that its trace is equal to the sum of its eigenvalues, which in decreasing order of magnitude we denote as $\Lambda^{(\ell)}(\zeta_1, \zeta_2, N_V)$, $\ell = 1, 2, \ldots, N_R$. Then, from (3.2) and (3.21), As we can see, the factors $N_H$ and $N_V$ of $N$ are 'buried' at different places in this expression, and it is clear that the extensivity condition (3.20) is not satisfied and the negative aspect of the message FSM-2 from statistical mechanics to thermodynamics is justified. However, we can make some progress because, if all the elements of $\boldsymbol{V}$ are strictly positive, as will usually be the case, an important theorem of Perron (1907) (see also Gantmacher 1959, p. 64; Lavis 2015, p. 673) states that the largest eigenvalue of $\boldsymbol{V}$ is real, positive and non-degenerate. This means that, in the approximation when $N_H$ becomes large, with extensivity achieved in the horizontal direction. Two strategies emerge at this point: The first is to calculate an expression of the form valid in the limit $N_V \to \infty$ and giving in the limit $N \to \infty$. If this calculation can be carried out it is an effective proof of the existence of the thermodynamic limit, 47 which achieves complete extensivity, with free-energy density given by (3.25). It is, however, a strategy that has been successfully applied in only a few cases, of which Onsager's (1944) solution of the two-dimensional zero-field Ising model and Baxter's (1972) solution of the eight-vertex model are the most well-known instances.
43 In fact the Sackur-Tetrode formula for the entropy of a perfect gas given by (2.7) and treated there as an assumption is, when derived from statistical mechanics, also not completely extensive. This condition is achieved only when $N$ is large and the Stirling formula for $N!$ is applied.
44 The point we are establishing with respect to extensivity is even more evident in systems with open boundaries.
45 The Ising model of Appen. B is an example of such a model with $\nu = 2$.
46 The condition for this to be the case is that $\boldsymbol{V}$ is simple (Lancaster and Tismenetsky 1985, p. 146).
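The trace formula and the dominance of the largest eigenvalue can be seen in the simplest possible instance, a single ring of $n$ Ising spins, for which the transfer matrix is $2 \times 2$. The sketch below is an illustration under our own assumptions (weight $\exp(K\sum_i s_i s_{i+1} + h\sum_i s_i)$, symmetric matrix $V(s,s') = \exp[K s s' + h(s+s')/2]$, names `K`, `h` hypothetical); it is not the two-dimensional construction of the text, only its one-dimensional analogue.

```python
import itertools
import math


def z_brute(n, K, h):
    """Partition function of an n-site Ising ring by direct enumeration."""
    return sum(
        math.exp(K * sum(s[i] * s[(i + 1) % n] for i in range(n)) + h * sum(s))
        for s in itertools.product((-1, 1), repeat=n)
    )


def eigenvalues(K, h):
    """Eigenvalues of V(s, s') = exp(K*s*s' + h*(s + s')/2), largest first."""
    centre = math.exp(K) * math.cosh(h)
    root = math.sqrt(math.exp(2 * K) * math.sinh(h) ** 2 + math.exp(-2 * K))
    return centre + root, centre - root


def z_transfer(n, K, h):
    """Trace of V**n written as the sum of n-th powers of the eigenvalues."""
    lam1, lam2 = eigenvalues(K, h)
    return lam1 ** n + lam2 ** n


K, h = 0.5, 0.2
print(z_brute(8, K, h), z_transfer(8, K, h))

# ln(Z)/n approaches ln(lam1) as n grows; the correction is ln(1 + (lam2/lam1)**n)/n.
lam1, _ = eigenvalues(K, h)
for n in (4, 8, 16, 32):
    print(n, math.log(z_transfer(n, K, h)) / n - math.log(lam1))
```

For any finite $n$ the second eigenvalue contributes, so $\ln Z$ is not strictly proportional to $n$; the proportionality emerges only as the correction term dies away.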
In the absence of a complete solution as represented by (3.25), the strategy most often adopted is to treat $N_V$ as a parameter indexing a sequence of models, as expressed by (3.27). In the case of the Ising and similar semi-classical models it can be shown by a method due to Peierls (1936) that $\phi_2^{(n)}(\zeta_1, \zeta_2)$ is a smooth function for all $n > 0$ which exhibits maxima in response functions. A quantitative analysis using finite-size scaling theory (see Sect. 3.4.2) shows that such maxima become increasingly steep for increasing values of $n$, with convergence, as $n \to \infty$, to the singularity associated with the transition in the two-dimensionally infinite system; in particular, to the corresponding singularities in Onsager's solution of the two-dimensional zero-field Ising model. However, in view of the discussion later in this work it should be noted that the limiting process is singular. Although the maxima in the finite-$N_V$ models converge to the singularities in the $N_V = \infty$ model, they remain of a different (non-singular) character however large $N_V$ becomes.
The pair correlation function and correlation length were defined in Sect. 3.2. In terms of this transfer matrix formulation it can be shown (Lavis 2015, Sect. 11 where $a$, the lattice spacing, is now the distance between neighbouring rings of sites, and in the limit $|\boldsymbol{r} - \boldsymbol{r}'| \to \infty$, where $\boldsymbol{r}$ and $\boldsymbol{r}'$ lie on the same vertical ring of sites which establishes an asymptotic form for $f_d(|\tilde{\boldsymbol{r}}|/\xi)$ in (A.11).
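The link between the correlation length and the ratio of the two largest eigenvalues of the transfer matrix can be illustrated with the zero-field Ising ring. The sketch assumes the standard transfer-matrix result $\xi = a/\ln(\Lambda^{(1)}/|\Lambda^{(2)}|)$, which we take to be the content of the relationship referred to above without being able to confirm its displayed form; for the $2 \times 2$ matrix with $K > 0$ the eigenvalues are $2\cosh K$ and $2\sinh K$. Function names are ours.

```python
import itertools
import math


def ring_correlation(n, K, r):
    """<s_0 s_r> on an n-site zero-field Ising ring, by enumeration."""
    Z = c = 0.0
    for s in itertools.product((-1, 1), repeat=n):
        w = math.exp(K * sum(s[i] * s[(i + 1) % n] for i in range(n)))
        Z += w
        c += w * s[0] * s[r]
    return c / Z


def correlation_length(K, a=1.0):
    """xi = a / ln(lam1/lam2) with lam1 = 2cosh(K), lam2 = 2sinh(K), K > 0."""
    return a / math.log(math.cosh(K) / math.sinh(K))


K, n = 0.6, 12
xi = correlation_length(K)
for r in (1, 2, 3, 4):
    # On a long ring the correlation approaches exp(-r/xi) = tanh(K)**r.
    print(r, ring_correlation(n, K, r), math.exp(-r / xi))
```

The small discrepancy between the two columns comes from the path the long way round the ring, a finite-size effect which disappears as $n \to \infty$.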
The situation where $N_H \to \infty$ and $N_V$ is finite corresponds to that to be discussed in Sect. 3.4.2, below, for finite-size scaling, where here $\mathfrak{d} := 1$ and the thickness of the lattice $\aleph := N_V$, with a maximum in $\varphi_T$ and in other response functions signalling an incipient singularity. 48 The eigenvalue ratio $\Omega_2(\zeta_1, \zeta_2, N_V)$ can also be used as a means of detecting an incipient singularity, but in a slightly different way. Since, in Onsager's solution for the Ising model, the largest eigenvalue is degenerate along the first-order transition line below the critical temperature (Domb 1960, p. 194), we expect that $\Omega_2(\zeta_1, \zeta_2, N_V)$ will begin, as $N_V$ is increased, to form a 'plateau' with small (negative) slope at low temperatures. The end of this plateau, where the negative curvature is a maximum, can then be construed as the location of an incipient singularity. 49 The finite-size scaling argument of Sect. 3.4.2 can be applied to all these quantities, showing that the maxima converge towards the infinite-system critical value as $N_V$ increases. However, of course, for finite $N_V$ we cannot expect these locations to coincide exactly. These perceptions are given further weight by the phenomenological renormalization group procedure described in Sect. 3.4.3 (c). As we have already indicated, the use of transfer matrix methods to determine exact solutions for infinite systems leads into our discussion in Sect. 3.5.1 of the thermodynamic limit. In a similar way our account of incipient singularities resulting from an analysis of systems with $N_V$ finite leads into our discussion of phase transitions in finite systems in Sect. 3.6.

The Renormalization Group Method
Once it became evident, around the turn of the twentieth century, that the exponents associated with a critical point, both in experimental systems and in theoretical models, were not those derived from classical models like the van der Waals equation, an interest developed in determining their exact values, in experimental systems and also in theoretical models, where of course it was also necessary in many cases to derive the critical temperature. Before the advent of renormalization group methods the most successful way to do this was by using high- and low-temperature series. These were very successful in obtaining critical temperatures and exponents at second-order critical points. However, although they can be adapted to deal with first-order transitions, this is not their main strength, and they are also not designed to map out the whole picture of phase transition curves in thermodynamic space. This contrasts with the renormalization group methods developed in the late 1960s and early 1970s. They are able (when they work) to deal not only with critical points but also with curves of first-order and second-order transitions. However, any account of these methods should be preceded by some words of warning, like those of John Cardy. As he says (Cardy 1996, pp. 28-29): "Not only are the words 'renormalization' 50 and 'group' 51 examples of unfortunate terminology, the use of the definite article 'the' which usually precedes them is even more confusing. It creates the misleading impression that the renormalization group is a kind of universal machine through which any problem may be processed, producing neat tables of critical exponents at the other end. This is quite false. It cannot be stressed too strongly that the renormalization group is merely a framework, a set of ideas, which has to be adapted to the nature of the problem at hand.
In particular, whether or not a renormalization group approach is quantitatively successful depends to a large extent on the nature of the problem, but lack of success does not necessarily invalidate the qualitative picture it provides."
Here we shall concentrate solely on the approach to the renormalization group which is usually referred to as happening in 'real space', in contrast to the approach initiated by Wilson (1975), where renormalization is performed in wave-vector space, resulting in expansions in the parameter $\epsilon := 4 - d$. 52 The core of real-space renormalization group (RSRG) methods is the construction of a semi-group of transformations on the independent couplings, or functions thereof. There is a variety of procedures for doing this. Many are based on the block-spin method of Kadanoff (1966), and another popular technique is decimation, where the states of a proportion of the microsystems are summed out of the partition function. In fact decimation applied to the one-dimensional Ising model, or related models like the Potts model (see, e.g., Lavis 2015, Sect. 15.5.1, and Sect. 3.4.3 (a) below), is one of the few examples of an exact RSRG transformation. Most transformations involve approximations, which means that the critical exponents are approximations with, in many cases, no obvious way to make improvements, unlike series methods where, in principle and often with a great deal of labour, improvements are made by extending the series.
50 A carry-over on the part of Wilson (1975) from his work on the high-energy behaviour of renormalized quantum electrodynamics.
51 It is in fact a semi-group since it has no inverse.
52 A comprehensive collection of articles on both types of renormalization group methods is contained in Domb and Green (1976), and on real-space methods in the volume edited by Burkhardt and van Leeuwen (1982). For a description of renormalization methods in wave-vector space the reader is referred to Ma (1976) and Amit (1978). Accounts of both approaches are given by Goldenfeld (1992), Binney et al. (1993) and Cardy (1996).
In essence the RSRG transformation involves some fractional reduction in the number of degrees of freedom. It would, therefore, seem to follow that there must have been a prior application of the thermodynamic limit. Whether this is required for the renormalization group and, more generally, whether it is needed at all in the statistical mechanics of critical phenomena is a question that we return to in Sect. 3.5, following a brief account of the ideas involved in the RSRG.

General Theory
Underlying the semigroup of transformations on couplings, which is the real-space renormalization group, is a mapping from a lattice $\mathcal{N}$ to a lattice $\tilde{\mathcal{N}}$. For the sake of simplicity we suppose that both are hypercubic lattices with periodic boundary conditions. Then:
(i) The numbers of sites $N$ and $\tilde{N}$ of $\mathcal{N}$ and $\tilde{\mathcal{N}}$ are related by $\tilde{N} = N/\lambda^d$, where $\lambda > 1$.
(ii) The lattice spacings $a$ and $\tilde{a}$ of $\mathcal{N}$ and $\tilde{\mathcal{N}}$ are related by $\tilde{a} = \lambda a$.
(iii) The size of $\tilde{\mathcal{N}}$ is reduced by a length scaling $|\tilde{\boldsymbol{r}}| = |\boldsymbol{r}|/\lambda$.
The renormalization group is constructed by imposing onto the lattice transformation a statistical mechanical transformation. To do this we modify the Hamiltonian (3.6) to
$$H'_2(\boldsymbol{\sigma}; \zeta_0, \boldsymbol{\zeta}; N) := N\zeta_0 + H_2(\boldsymbol{\sigma}; \boldsymbol{\zeta}; N), \tag{3.31}$$
where, for reasons that will become evident below, we have added a term including a trivial coupling $\zeta_0$ and, as in the presentation of scaling theory at the beginning of Sect. 2.4, generalized the number of non-trivial couplings from two to $n$, with $\boldsymbol{\zeta} := (\zeta_1, \zeta_2, \ldots, \zeta_n)$. 53 The terminology 'trivial' signals the fact that, if in (3.8) $H_2$ is replaced by $H'_2$ and $Z_2$ by
Bearing in mind the remarks of Cardy, given above, a successful application of this method depends on being able to construct relationships between the couplings $\zeta_0, \boldsymbol{\zeta}$ in the system on $\mathcal{N}$ and the couplings $\tilde{\zeta}_0, \tilde{\boldsymbol{\zeta}}$ in the system on $\tilde{\mathcal{N}}$, done in such a way that the values for the couplings for $\mathcal{N}$ place it in a critical region if and only if the same is the case for the values of the couplings for $\tilde{\mathcal{N}}$. Since the critical properties of a system are contained within the partition function, the invariance of that function is a sufficient guarantee; and this is achieved by the relationship (3.35), where the weight function $w(\boldsymbol{\sigma}, \tilde{\boldsymbol{\sigma}})$ satisfies
$$\sum_{\{\tilde{\boldsymbol{\sigma}}\}} w(\boldsymbol{\sigma}, \tilde{\boldsymbol{\sigma}}) = 1. \tag{3.36}$$
Running over the set of states $\tilde{\boldsymbol{\sigma}}$ in (3.35) will, in principle, produce the recurrence relationships 54 (3.37) and, for $\tilde{\zeta}_0$, a recurrence relationship which we choose, for convenience, to express in the form (3.38). The 'in principle' caveat entered here is important. As we shall see, it is rarely possible to implement this programme and to choose a weight function without some kind of approximation being applied. And it is frequently the case that consistency can be achieved only by increasing the value of $n$ from its initial value. When this happens it is necessary, in order to apply repeated iterations, to backtrack and for the extra couplings to be included from the start.
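The invariance of the partition function under such a transformation can be exhibited in the one case where it is exact and elementary, namely decimation of the zero-field one-dimensional Ising model with $\lambda = 2$. In the sketch below, which rests on our own conventions (weight $\exp(K\sum_i s_i s_{i+1})$, with `K`, `K_new` and the factor `g` as illustrative names), summing out every second spin gives $\sum_{s=\pm1}\exp[K s (s_1 + s_2)] = g \exp(K_{\rm new} s_1 s_2)$, with $K_{\rm new} = \tfrac12 \ln\cosh 2K$ and $g = 2\sqrt{\cosh 2K}$. The factor $g$ plays the part that the trivial coupling is introduced to absorb.

```python
import itertools
import math


def z_ring(n, K):
    """Zero-field partition function of an n-site Ising ring by enumeration."""
    return sum(
        math.exp(K * sum(s[i] * s[(i + 1) % n] for i in range(n)))
        for s in itertools.product((-1, 1), repeat=n)
    )


def decimate(K):
    """One decimation step with lambda = 2: returns (K_new, g)."""
    c = math.cosh(2 * K)
    return 0.5 * math.log(c), 2 * math.sqrt(c)


K = 0.8
K_new, g = decimate(K)

# Eight sites at coupling K map to four sites at K_new, with one factor g
# for each of the four spins summed out.
lhs = z_ring(8, K)
rhs = g ** 4 * z_ring(4, K_new)
print(lhs, rhs)

# Equivalent form of the recurrence: tanh(K_new) = tanh(K)**2.
print(math.tanh(K_new), math.tanh(K) ** 2)
```

Because no approximation is involved, the recurrence closes on the single coupling $K$; in most other models further couplings are generated at each step.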
The importance of (3.38) is that it can be used, together with (3.33) and (3.34), to obtain the relationship (3.39) between the free-energy densities per lattice site at $\boldsymbol{\zeta}$ and $\tilde{\boldsymbol{\zeta}}$. Then, given that (3.37) can be iterated to produce a sequence of points $\boldsymbol{\zeta}$ is the free-energy density at an initial point $\boldsymbol{\zeta}^{(0)}$. Although this result seems to imply the need for an infinite number of iterations, this is clearly not possible in practical computations. It is, therefore, fortunate that it is usually found that this series converges after a very few iterations, allowing densities and response functions to be calculated (see the discussion in Sect. 3.5.1).
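The rapid convergence of such a series can be seen in the decimation of the zero-field one-dimensional Ising model, used here purely as an illustration with our own conventions. With $K' = \tfrac12\ln\cosh 2K$ and $g(K) = 2\sqrt{\cosh 2K}$, the quantity $\psi(K) := \lim_{N\to\infty} N^{-1}\ln Z$ satisfies $\psi(K) = \tfrac12 \ln g(K) + \tfrac12 \psi(K')$, so that iteration gives a series in powers of $\lambda^{-d} = \tfrac12$. The sketch closes the series after a finite number of steps by treating the remaining spins as free, $\psi \approx \ln 2$, which is an assumption appropriate once the iterated coupling is close to the fixed point $K = 0$. The exact result for comparison is $\ln(2\cosh K)$.

```python
import math


def log_z_per_site(K, iterations):
    """Approximate lim ln(Z)/N for the zero-field Ising chain by iterated decimation.

    Each step contributes weight * 0.5 * ln g(K) and halves the weight;
    the tail is closed with the free-spin value ln 2 (valid once K is near 0).
    """
    total, weight = 0.0, 1.0
    for _ in range(iterations):
        c = math.cosh(2 * K)
        total += 0.5 * weight * math.log(2 * math.sqrt(c))
        K = 0.5 * math.log(c)      # recurrence for the coupling
        weight *= 0.5              # factor lambda**(-d) with lambda = 2, d = 1
    return total + weight * math.log(2.0)


K = 1.0
exact = math.log(2 * math.cosh(K))
for steps in range(1, 7):
    approx = log_z_per_site(K, steps)
    print(steps, approx, abs(approx - exact))
```

Under the assumption that the free energy is $-\ln Z$, the free-energy density is $-\psi$; the sign and any temperature factor should be adjusted to whatever convention is in force.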
A fixed point $\boldsymbol{\zeta}^\star$ of (3.37) is associated with either a single-phase region or a critical region $C$ in $\Xi_2$. To analyze its nature we linearize, with $[\Delta\boldsymbol{\zeta}^{(s)}]^{\mathrm{T}} := \boldsymbol{\zeta}^{(s)} - \boldsymbol{\zeta}^\star$, to give $\Delta\boldsymbol{\zeta}^{(s+1)} \simeq \boldsymbol{L}^\star \Delta\boldsymbol{\zeta}^{(s)}$, where $\boldsymbol{L}^\star$ is the fixed-point value of the matrix $\boldsymbol{L}$ with elements $L_{ij} := \partial K_i/\partial\zeta_j$. In general $\boldsymbol{L}^\star$ is not symmetric, with different left and right eigenvectors $\boldsymbol{w}_j$ and $\boldsymbol{x}_j$ for the eigenvalue $\Lambda_j$. It can then be shown 55 that in a neighbourhood of the fixed point there exist scaling fields $\theta_j = \theta_j(\Delta\boldsymbol{\zeta})$, $j = 1, 2, \ldots, n$, which are smooth functions of the couplings with

$\Delta\boldsymbol{\zeta} \simeq \sum_{j=1}^{n} \boldsymbol{x}_j \theta_j$, (3.42)

which is a realization of the relationship between scaling fields and couplings described in Sect. 2.4.

54 To be precise, the recurrence relationships are derived from (3.35) as relationships between the Boltzmann factors $\exp(\zeta_j)$ and $\exp(\tilde{\zeta}_j)$, $j = 0, 1, \ldots, n$, of the couplings.
55 Assuming that $\boldsymbol{L}^\star$ is a simple matrix.
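The linearization step can be sketched numerically. The two-coupling map below is hypothetical, chosen only because it has a fixed point at (1, 0) with a non-symmetric linearization; the rescaling factor λ = 2 and the use of the standard relation Λ_j = λ^{y_j} to convert eigenvalues into scaling exponents are likewise assumptions of the example.

```python
import math

LAMBDA = 2.0  # length rescaling factor assumed for the toy map


def toy_map(z):
    """Hypothetical two-coupling recurrence with a fixed point at (1, 0)."""
    z1, z2 = z
    return (z1 * z1 + 0.1 * z2, 0.7 * z2)


def jacobian(f, z, h=1e-6):
    """Central-difference estimate of L_ij = dK_i/dzeta_j at the point z."""
    n = len(z)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(z)
        dn = list(z)
        up[j] += h
        dn[j] -= h
        f_up, f_dn = f(tuple(up)), f(tuple(dn))
        for i in range(n):
            L[i][j] = (f_up[i] - f_dn[i]) / (2.0 * h)
    return L


def _larger(u, v):
    """Pick the candidate eigenvector that is further from the zero vector."""
    return u if math.hypot(*u) >= math.hypot(*v) else v


def eigen_2x2(L):
    """Eigenvalues, exponents and left/right eigenvectors of a 2x2 matrix.

    Assumes real, distinct eigenvalues (a 'simple' matrix).
    """
    (a, b), (c, d) = L
    tr, det = a + d, a * d - b * c
    disc = math.sqrt(tr * tr - 4.0 * det)
    modes = []
    for lam in ((tr + disc) / 2.0, (tr - disc) / 2.0):
        modes.append({
            "eigenvalue": lam,
            "exponent": math.log(abs(lam)) / math.log(LAMBDA),
            "right": _larger((b, lam - a), (lam - d, c)),  # L x = lam x
            "left": _larger((c, lam - a), (lam - d, b)),   # w L = lam w
        })
    return modes


L_star = jacobian(toy_map, (1.0, 0.0))
for mode in eigen_2x2(L_star):
    kind = "relevant" if mode["exponent"] > 0 else "irrelevant"
    print(kind, mode)
```

The left and right eigenvectors differ because the matrix is not symmetric; to linear order a scaling field is the projection of the displacement from the fixed point onto the corresponding left eigenvector.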

Finite-Size Systems
This treatment of criticality, which plays an important role in our understanding of PTCP in real systems (see Sect. 4), was initiated by Fisher (1971) and Fisher and Barber (1972). 56 For simplicity we suppose, as in Sect. 3.3, that the system under consideration consists of $N$ identical microsystems on the sites of a $d$-dimensional hypercubic lattice $N_d$ with $N_k$ sites in the $k$-direction and … In a fully-finite system $d = 0$ and $N(d) = N$. We denote the critical region in the partially-infinite system, when $d > d_{\mathrm{LC}}$, by $C(d; \aleph)$, with $C(d; \infty) = C$. Finite-size scaling theory can be applied both to a partially-infinite system, where there is the possibility of a critical region consisting of some kind of singular behaviour, and to a fully-finite system, where there is not. In a fully-finite system, or a partially-infinite system with $d \le d_{\mathrm{LC}}$, the critical region is replaced by a region defined as follows:

Definition 1 For a fully-finite system, or a partially-infinite system with $d \le d_{\mathrm{LC}}$, a region $IS(d; \aleph)$ in the space of couplings is one of incipient singularity 57 if, in the limit $\aleph \to \infty$, it maps into a critical region $C$ of the infinite system.
Expressed in a slightly different way, a system has an incipient singularity at certain size-dependent values of its couplings if, as the system size ℵ is increased, those values converge to ones where thermodynamic functions exhibit properties that have no finite limits.
The basic assertion of finite-size scaling is that $\theta_\aleph := 1/\aleph$ can be treated as another scaling field with $y_\aleph = 1$, meaning that $\theta_\aleph$ is a relevant scaling field, and $\theta_\aleph = 0$ for the infinite system. The only condition required for this is that the system is sufficiently large for the renormalization group transformation in the space of all the other couplings to be unmodified by the finite size of the system. That is to say, that the renormalized couplings can be represented in the system. For simplicity we confine our attention to the simple magnetic system used in Sect. 2.4. The critical region for the infinite system is just a critical point $T = T_c$, $H = 0$ with scaling fields $\theta_T$ and $\theta_H$, given by (2.27), measuring departures from this point. When the system has finite thickness ($\theta_\aleph \neq 0$), the incipient singularity is at a different temperature but, because of the symmetry of the system, still with $H = 0$. Again, for simplicity, attention will be restricted to the zero-field axis, where two temperatures come into play:

(i) For a system of finite thickness $\aleph$, $T(\aleph)$ is the shift temperature such that, as $\aleph \to \infty$, $T(\aleph) \to T_c$, the temperature at which the infinite system has a singularity. If $d > d_{\mathrm{LC}}$ then $T(\aleph)$ is also a critical temperature, but for the system of finite thickness. If $d \le d_{\mathrm{LC}}$, and in particular when $d = 0$ and the system is fully-finite, the shift temperature is a quasicritical temperature (Fisher and Barber 1972), which is exhibited by a maximum in the susceptibility. 58 This temperature is an example of an incipient singularity. In keeping with the other assumptions of scaling theory it is assumed that this convergence is algebraic, so that for the scaling field evaluated at the shift temperature the form $|\theta_T(T(\aleph))| \simeq C_s\,\aleph^{-\chi}$ (3.46), where $\chi > 0$ is the shift exponent, is sufficient to ensure convergence.

(ii) $\tilde{T}(\aleph)$, called the rounding temperature, is an important, but rather more elusive, property of the system. It is the temperature at which the susceptibility first shows significant deviation from that of the fully-infinite system. It is supposed that (3.48) holds, with the scaling field at the rounding temperature decaying algebraically as $\aleph^{-\tau}$, where $\tau > 0$ is the rounding exponent.

56 For a review see Barber (1983) and, for a collection of papers on finite-size scaling, Cardy (1988).
57 It should be noted that this is a slightly different usage from that in Lavis (2015, Chap. 11), where such occurrences are called 'incipient phase transitions'.
Scaling around the infinite system critical point is shown in Fig. 6. Our interest in this work is in the occurrence of an incipient singularity; so henceforth the assumption is that $d \le d_{\mathrm{LC}}$. 59 Thus we have three relevant scaling fields with the critical region of the infinite system at the origin $(\theta_T, \theta_H, \theta_\aleph) = (0, 0, 0)$. However, this is not the complete picture; in general there will be a number of irrelevant scaling fields, which parametrize the critical region and affect its asymptotic properties. For the sake of simplicity we include just the most nearly relevant 60 of these, designated as $\theta_\star$, with exponent $y_\star < 0$. Then on the zero-field axis (3.44) is replaced by

$\varphi_{\mathrm{sing}}(\lambda^{y_T}\theta_T, \lambda^{y_\star}\theta_\star, \lambda\theta_\aleph) = \lambda^{d}\,\varphi_{\mathrm{sing}}(\theta_T, \theta_\star, \theta_\aleph)$. (3.49)

As we have already seen, singular parts of thermodynamic functions like densities and response functions are obtained by differentiations with respect to the scaling fields. In particular, for the susceptibility $\phi_T$, given by (A.7),

$\phi_T(\lambda^{y_T}\theta_T, \lambda^{y_\star}\theta_\star, \lambda\theta_\aleph) = \lambda^{-\omega}\,\phi_T(\theta_T, \theta_\star, \theta_\aleph)$, (3.50)

with $\omega := 2y_H - d = \gamma/\nu$, where $\gamma$ is given by (2.34) and $\nu := 1/y_T$, given in (3.19), is the critical exponent of the correlation length. Asymptotic behaviour in a neighbourhood of the critical point, that is when $|\theta_T| \ll 1$, is then as usual exposed by choosing the scale parameter $\lambda := |\theta_T|^{-1/y_T}$, giving

$\phi_T(\theta_T, \theta_\star, \theta_\aleph) = |\theta_T|^{-\gamma}\,\phi_T(\pm 1, X_\star, X_\aleph)$, (3.51)

where the $\pm 1$ branches of $\phi_T(\pm 1, X_\star, X_\aleph)$ apply to the cases $\theta_T > 0$ and $\theta_T < 0$, respectively, and $X_\star(T, \aleph) := |\theta_T(T)|^{-y_\star\nu}\theta_\star$, $X_\aleph(T, \aleph) := |\theta_T(T)|^{-\nu}\aleph^{-1}$ are scaling functions. In a similar way, with $\lambda := \aleph$,

$\phi_T(\theta_T, \theta_\star, \theta_\aleph) = \aleph^{\omega}\,\phi_T(\aleph^{y_T}\theta_T, \aleph^{y_\star}\theta_\star, 1)$. (3.52)

In the thermodynamic limit $\aleph \to \infty$, it follows from (3.51) that the susceptibility has the form $\phi_T(\theta_T, \theta_\star, 0) = A^{(\pm)}(X_\star)\,|\theta_T|^{-\gamma}$, where the amplitudes $A^{(\pm)}(X_\star) := \phi_T(\pm 1, X_\star, 0)$, which are, in general, different for $\theta_T > 0$ and $\theta_T < 0$, are dependent on $\theta_T$ by virtue of the presence of the irrelevant scaling field $\theta_\star$.

58 Another response function like the heat capacity can replace the susceptibility, with a slightly different quasicritical temperature.
This contribution will become small, as $|\theta_T|^{-y_\star\nu} \to 0$ for $|\theta_T| \to 0$, eventually becoming negligible for sufficiently small $|\theta_T|$. The susceptibility will then display an asymptotic algebraic singularity of the form $\phi_T \sim |\theta_T|^{-\gamma}$. (3.55) The singularity is a divergence if $\gamma > 0$, which is generally the case for response functions.
Given that both (3.51) and (3.52) are valid, and that a finite statistical mechanical system cannot exhibit non-analytic behaviour, whereas singular behaviour does occur at critical points in the limit of infinite system size, the scaling function $\phi_T(\pm 1, X_\star, X_\aleph)$ in (3.51) must exhibit asymptotic behaviour of the form (3.56). Since the susceptibility has maxima along the curve $T = T(\aleph)$ of shift temperatures in Fig. 6, these maxima will be in one of the branches of $B_T^{(\pm)}(X_\star)$, with the other branch being a monotonically decreasing function of $X_\star$ in the vicinity of $X_\star = 0$. Along the curve of shift temperatures, from (3.46), $X_\star(T(\aleph), \aleph) \simeq C_s^{-\nu y_\star}\,\aleph^{\chi\nu y_\star}\,\theta_\star$ and $X_\aleph(T(\aleph), \aleph) \simeq C_s^{-\nu}\,\aleph^{\chi\nu - 1}$. On this curve $\theta_\star \neq 0$, and if it is supposed that the two shift functions have the same asymptotic dependence on $\aleph$, the shift exponent will be related to $y_T = 1/\nu$ and $y_\star < 0$ by $\chi = y_T/(1 - y_\star)$, with the shift amplitude … As already indicated, finite-size corrections to the pure power-law behaviour of $\phi_T$, as described by (3.55), will begin to be observed whenever the system is finite (with $\theta_\aleph := \aleph^{-1} \neq 0$) at the rounding temperature $\tilde{T}(\aleph)$. It has been argued (Ferdinand and Fisher 1969) that this is the temperature at which the size $\aleph$ of the system is of the same order as the correlation length $\xi(T)$. 61 It follows from (A.10) that $|\theta_T(\tilde{T}(\aleph), \aleph)|^{-\nu}\,\aleph^{-1} \simeq C$, where $C$ is a constant, which establishes, from (3.48), that $C := C_r$ and the rounding exponent $\tau = 1/\nu = y_T$ with $\omega = \gamma\tau$. Thus on the basis of some plausible assumptions we have the condition $\chi < \tau$, which, for large systems, motivates the disposition of the curves in Fig. 6.
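The step from the two asymptotic forms to the shift exponent, and from there to the inequality between the exponents, can be spelled out. The following is a sketch which takes the expressions for $X_\star$ and $X_\aleph$ along the curve of shift temperatures exactly as quoted above and assumes only that their powers of $\aleph$ are to be equated.

```latex
\begin{align*}
X_\star &\simeq C_s^{-\nu y_\star}\,\aleph^{\chi\nu y_\star}\,\theta_\star ,
&
X_\aleph &\simeq C_s^{-\nu}\,\aleph^{\chi\nu-1} ,
\\[4pt]
\chi\nu y_\star &= \chi\nu - 1
&&\Longrightarrow\quad
\chi\nu\,(1-y_\star) = 1
\quad\Longrightarrow\quad
\chi = \frac{1}{\nu\,(1-y_\star)} = \frac{y_T}{1-y_\star} ,
\\[4pt]
y_\star &< 0
&&\Longrightarrow\quad
1-y_\star > 1
\quad\Longrightarrow\quad
\chi < y_T = \tau .
\end{align*}
```

Read in this way, the condition $\chi < \tau$ is a direct consequence of the irrelevance of $\theta_\star$: the closer $y_\star$ is to zero, the closer the shift exponent comes to the rounding exponent.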

Renormalization Schemes
The practical implementation of the renormalization group procedure in Sect. 3.4.1 involves the choice of a weight function and leads to recurrence relationships between systems related by a size parameter λ, together with a method for calculating the free-energy density which satisfies the scaling relationship. In (a) and (b) in this section we give examples of the implementation of two of the most commonly used weight functions and in (c) we briefly outline a different scheme which, using transfer matrix methods, relates the correlation lengths of systems of different sizes.
For $d$-dimensional lattices, most weight functions are based on a division of the lattice $N$ into equal blocks of $\lambda^d$ sites. The mapping from $N$ to $\tilde{N}$ is given by associating each lattice site $\tilde{\boldsymbol{r}} \in \tilde{N}$ with a block of sites in $N$ denoted by $B(\tilde{\boldsymbol{r}})$.

(a) The decimation weight function
For this weight function the sites of $\tilde{N}$ consist of a subset of the sites of $N$, chosen so that $\tilde{N}$ forms a lattice which is isomorphic to $N$. So we can take $\tilde{\boldsymbol{r}} \in B(\tilde{\boldsymbol{r}})$ with

$w(\tilde{\boldsymbol{\sigma}}, \boldsymbol{\sigma}) := \prod_{\{\tilde{\boldsymbol{r}}\}} \delta^{\mathrm{Kr}}\big(\tilde{\sigma}(\tilde{\boldsymbol{r}}) - \sigma(\tilde{\boldsymbol{r}})\big)$. (3.57)

The effect of this is that the summation on the right-hand side of (3.35) is a partial sum over all the sites of the lattice $N$ except those of $\tilde{N}$. For a range of one-dimensional models (including the Ising and Potts models), which can be solved exactly using transfer matrix methods, exact RSRG decimation transformations can also be obtained.
For the one-dimensional ferromagnetic case of the Ising model it can be shown (Nelson and Fisher 1975; Lavis 2015) that the most convenient variables are not those given in Appen. B but rather $\zeta_1 := \tanh([2J + H]/2T)$, $\zeta_2 := \exp(-2H/T)$, and for $\lambda := 2$, with the partial summation in (3.35) over alternate sites, the recurrence relationships take the form (3.58). It is then not difficult to show that there is a fixed point $\zeta_1 = \zeta_2 = 1$ ($T = H = 0$), with both scaling exponents equal to $d = 1$. As we saw in the discussion of scaling theory in Sect. 2.4, an exponent equal to the dimension of the system is indicative of the possibility of a first-order transition. In this case the critical point is at zero temperature on the zero-field line, meaning that the first-order coexistence curve has contracted to a point coinciding with the critical point at zero temperature. At this point there is a first-order transition across the zero-field axis with a change of sign of the magnetization. It can also be shown that the curve which corresponds to the interaction $J$ between microsystems being set to zero is invariant under (3.58). At every point it has exponents $0$ and $-\infty$; the first of these is marginal, which indicates that the line consists of fixed points, and the latter that it is 'infinitely attractive' to points not on the line. The end points of the line $\zeta_1 = 0$, $\zeta_2 = 1$ ($H = \infty$, $T = 0$) and $\zeta_1 = 1$, $\zeta_2 = 0$ ($T = \infty$, $H = 0$) are fixed points in their own right in the invariant subspaces $T = 0$ and $H = 0$ respectively. The phase diagram is shown in Fig. 7.
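A minimal numerical check of exact decimation can be given for the zero-field case. The sketch assumes the standard textbook form of the λ = 2 recurrence for K = J/T, namely K̃ = ½ ln cosh 2K with prefactor A = 2(cosh 2K)^{1/2} per removed site, rather than the recurrence relationships of (3.58), which are expressed in different variables; in zero field it is equivalent to ζ̃ = ζ² for ζ = tanh K. The function names and the ring geometry are choices made for the example.

```python
import itertools
import math


def partition_function(K, n_sites):
    """Brute-force zero-field partition function of an Ising ring."""
    Z = 0.0
    for spins in itertools.product((-1, 1), repeat=n_sites):
        bonds = sum(spins[i] * spins[(i + 1) % n_sites] for i in range(n_sites))
        Z += math.exp(K * bonds)
    return Z


def decimate(K):
    """Sum out alternate spins (lambda = 2); returns (K_tilde, A)."""
    cosh_2K = math.cosh(2.0 * K)
    return 0.5 * math.log(cosh_2K), 2.0 * math.sqrt(cosh_2K)


K, N = 0.8, 8
K_tilde, A = decimate(K)

# Invariance of the partition function between two *finite* rings:
# Z_N(K) = A^(N/2) * Z_(N/2)(K_tilde).
lhs = partition_function(K, N)
rhs = A ** (N // 2) * partition_function(K_tilde, N // 2)
print(lhs, rhs)

# In the variable zeta = tanh K the recurrence is zeta_tilde = zeta^2.
zeta, zeta_tilde = math.tanh(K), math.tanh(K_tilde)
print(zeta_tilde, zeta ** 2)

# Linearizing zeta^2 about the zero-temperature fixed point zeta = 1 gives
# the eigenvalue 2 = lambda^y, so the exponent is y = 1.
y_thermal = math.log(2.0 * 1.0) / math.log(2.0)
print(y_thermal)
```

The check relates an eight-site ring to a four-site ring, so nothing in it refers to an infinite system, and the exponent obtained at ζ = 1 agrees with the value d = 1 quoted above for the zero-temperature fixed point.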
Of course, for reasons just explained, the one-dimensional Ising model is less interesting than the two-dimensional model where the ferromagnetic critical point is not at zero temperature. So, suppose that we try to carry out the same procedure in that case. A possibility is to choose blocks of two sites  as shown in Fig. 8. The lattice N consists of the black sites and the partial summation in (3.35) is over the spin states on white sites. But this will create an interaction between the four sites surrounding each white site. So we would need to back-track and increase n from two to three, inserting this interaction from the beginning. But this would in turn generate an interaction between nine sites. And so on. This proliferation of interactions is typical of the problems encountered with decimation. The usual trick is to cut off the proliferation at a certain level. Such an approximation for this model was investigated by Wilson (1975) with a rather poor outcome compared to the known exact results.

(b) The majority-rule weight function
This weight function was introduced by Niemeijer and van Leeuwen (1973, 1974). The first step in assigning $\tilde{\sigma}(\tilde{\boldsymbol{r}})$ for the block $B(\tilde{\boldsymbol{r}})$ can be described in terms of the 'winner takes all' voting procedure used in some democracies. Given that each microsystem has $\nu$ states and that among the sites of $B(\tilde{\boldsymbol{r}})$ one of the $\nu$ states occurs more than any other, $\tilde{\sigma}(\tilde{\boldsymbol{r}})$ is assigned to have this value.
If $\nu := 2$ and the number of sites $\lambda^d$ in a block is odd this rule works; a case in point being the treatment of the Ising model on the triangular lattice with a block of nine sites ($\lambda := 3$) by Schick et al. (1976). But unless these conditions hold it is clear that the simple majority rule is not sufficient to determine $\tilde{\sigma}(\tilde{\boldsymbol{r}})$ for every configuration of the block. A 'tie' can occur in the voting procedure and a strategy must be adopted to deal with such cases. One possibility is to assign to $\tilde{\sigma}(\tilde{\boldsymbol{r}})$ one of these predominating values on the basis of equal probabilities. In some cases this may not, however, be the most appropriate choice. In their work on the Ising model using a square first-neighbour block ($\lambda := 2$) Nauenberg and Nienhuis (1974) divided the configurations with equal numbers of up and down spins between block spins up and down with equal probabilities. The rule (one of four) which they chose ensured that the reversal of all the spins in the block reversed the block spin.
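The voting procedure and its tie-breaking can be sketched for the two-state case. The code below is an illustration only: it implements the simple equal-probability treatment of ties, not the particular rule chosen by Nauenberg and Nienhuis, and the function names are hypothetical. It also checks the normalization condition (3.36) block by block.

```python
import itertools
import random


def majority_rule(block, rng=random):
    """Assign a block spin to a tuple of Ising spins (+1 or -1).

    Ties, which can only occur for blocks with an even number of sites,
    are broken by choosing +1 or -1 with equal probability.
    """
    total = sum(block)
    if total > 0:
        return +1
    if total < 0:
        return -1
    return rng.choice((+1, -1))


def weight(block_spin, block):
    """Single-block weight: the probability that the rule above assigns
    `block_spin` to the configuration `block`."""
    total = sum(block)
    if total == 0:
        return 0.5
    return 1.0 if block_spin * total > 0 else 0.0


def is_normalized(block_size):
    """Check that the weights sum to one over the block-spin states."""
    return all(
        abs(weight(+1, b) + weight(-1, b) - 1.0) < 1e-12
        for b in itertools.product((+1, -1), repeat=block_size)
    )


print(is_normalized(4), is_normalized(9))
print(majority_rule((1, 1, 1, -1)), weight(+1, (1, 1, -1, -1)))
```

With these weights the reversal of every spin in a block reverses the block spin in the statistical sense, since the weight of a block spin for a configuration equals the weight of the reversed block spin for the reversed configuration.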

The Thermodynamic Limit
In the development of statistical mechanics represented by the right-hand column in Fig. 1, system size appears twice. Firstly, in the passage from SM1 to SM2, where the system becomes large, yielding approximate extensivity. This is needed for the discussion of finite-size phase transitions represented by SM5. Secondly, in the other branch from SM2, via the thermodynamic limit, to an infinite system represented by SM3. This entails the identification of the infinite statistical mechanical system SM3 with thermodynamics, or at least the version, labelled TD3 in Fig. 1, of thermodynamics with some PTCP defined. But are SM3 and TD3 actually identical? The answer is clearly 'no'. TD3 is the result of a development in the left-hand column of Fig. 1, from the basic structure through the assumption of extensivity to a grafting on of a picture of PTCP, in the manner of Pippard (1957) or Buckingham (1972). On the other hand, as we have just indicated, SM3 is the result of a statistical mechanical development in the right-hand column in Fig. 1. It retains its microstructure with a probability distribution, and in most cases it is the result of the implementation of the thermodynamic limit for a particular model, the most well-known examples being the two-dimensional zero-field Ising model and the eight-vertex model. Thus it should be recognised for later reference (see Sect. 5.1) that this way of understanding the relation between thermodynamics and statistical mechanics involves the unwarranted conflation of two quite different pictures, although one can argue that SM3 is an enrichment of TD3, since the former has all the features of the latter together with the extra ones provided by microstructure and precise results concerning critical values and exponents. That having been said, one may still question whether the thermodynamic limit is: 63
(1) Necessary, in principle, because statistical mechanics is not complete without it.
(2) Useful, because calculations become much simpler in the thermodynamic limit and the relationship FSM--3 of SM3 to TD3 makes it easier to identify the order of phase transitions.
Although both of these possibilities deserve consideration, it is the first which has received the most attention, principally because of the role of the thermodynamic limit in the understanding of PTCP; this will be discussed in detail in Sect. 3.5.1. In this work we propose, in Sect. 4, a particular view of the usefulness of the thermodynamic limit in the context of phase transitions in finite systems. However, it is pertinent to note the range of possible circumstances calling for the use of the thermodynamic limit. In particular one might suppose an additional kind of necessity interposed between the two items in our list:
(1a) Necessary in practice, because calculations for particular models are not tractable without its use.
However, of course, tractability, and hence necessity in practice, is ephemeral, evolving (one might hope) with an increase in computing power and technical ingenuity into mere usefulness.

Phase Transitions in Infinite Systems
The argument for the necessity in principle of the thermodynamic limit for PTCP effectively involves asserting the truth of the contradictory set of propositions:
P-IA: PTCP occur in nature.
P-IB: PTCP occur in nature as discontinuities in densities (first-order transitions) and as singularities in response functions (higher-order transitions). 64
P-IIA: PTCP in thermodynamics are defined by singularities in derivatives of first or higher order in the free energies and are treated as such using scaling theory.
P-IIB: PTCP must necessarily be represented in thermodynamics by singularities.
P-IIIA: PTCP should be able to be modelled in statistical mechanics.
P-IIIB: PTCP should be modelled in statistical mechanics in the same way that they are in thermodynamics.

A number of items in our list are indisputable and are not included in Callender's list:
• That PTCP are defined in thermodynamics by singularities can be confirmed by a visit to the thermodynamics section of any academic library (P-IIA is true). Whether it is necessary for thermodynamics to be formulated in this way (that P-IIB should be accepted), given a possible denial that PTCP occur in nature as singularities (that P-IB is true), is a different question.
• The joint assertions that thermodynamic functions are regular for finite systems but can have singularities for infinite systems (included in our list as P-V and P-VI, respectively, but not contained in Callender's list) are facts about the mathematical structure of statistical mechanics which cause the total list to be contradictory.

And on Callender's list:
• It is difficult to argue that phase transitions do not occur in real systems (that P-IA (CP-II) is false), although it is plausible to deny that they arise as some kind of singularities (to argue that P-IB (not in Callender's list) is false), on the grounds that a first-order transition (say that between liquid water and water vapour) may look like a sudden change of density, but on closer observation would turn out to be a very steep continuous change. Likewise, apparent singularities in compressibility in fluids and susceptibility in magnets may just be very steep maxima.
• It is also difficult to argue that real systems are not finite (that P-IV (CP-I) is false), given that no system takes up the whole of the universe. 66 A sort of argument could be constructed on the basis that no system is completely isolated, but this would mean accepting the need for computation, not with an infinite system as envisaged here, but with a system joined to a complicated and largely undetermined environment.
• If the ability to model PTCP were not deemed to be a necessary part of statistical mechanics (P-IIIA (CP-IV) is rejected), then most of the work on statistical mechanics in the last half century and more would be pointless. It is, however, relevant here to mention the work of the late Ilya Prigogine (in particular, Prigogine 1996). Although, in a sense, he accepts P-IIIA, it is a radically different form of statistical mechanics that he has in mind. From the assertion that "[a]s long as we consider merely a few particles, we cannot say if they form a liquid or gas" (ibid, p. 45) he concludes that "[s]tates of matter as well as phase transitions are ultimately defined by the thermodynamic limit. . . . Phase transitions correspond to emerging properties. They are meaningful only at the level of populations and not of single particles" (op. cit.). This entails for him the reformulation of statistical mechanics so that the underlying dynamics is not that of trajectories but of measure. 67

There remain P-IB and P-IIIB, which together with P-IIB is equivalent to CP-III, and we now consider the consequences of denying one or both of them.
(i) If P-IB is accepted, that is that PTCP in nature do occur as singularities, then it is clearly necessary for thermodynamics to represent them in this way; P-IIB must be accepted. Then we seem to be driven toward the conclusion that statistical mechanics should model them in the same way (that is the acceptance of P-IIIB), which leads back to the contradiction. This is avoided by denying P-IIIB. Then PTCP can be modelled in statistical mechanics without singularities, by, for example, transfer matrix methods, while at the same time admitting that this is not the situation in reality.
(ii) If P-IB is denied then it can be argued either:
(a) That it is not necessary for thermodynamics to model PTCP as singularities (P-IIB is false). In this case P-IIIB can be accepted, with PTCP modelled without singularities in statistical mechanics, with thermodynamics reformulated to do the same; or
(b) That in statistical mechanics PTCP should be modelled without singularities, but because for large systems steep maxima in response functions and steep changes in densities look very much like singularities and discontinuities, it is still necessary (on the grounds of tractability and simplicity) to model PTCP in thermodynamics as singular behaviour; P-IIB is accepted and P-IIIB is rejected. 68

So given that all of P-I to P-VI are accepted is there any way out of the paradox? One radical approach, which has already been noted, is that due to Prigogine, where statistical mechanics is reformulated to 'build in' the thermodynamic limit.

66 Which, in any event, may be finite.
67 This being the approach that he and his Brussels group also used to resolve the problem of irreversibility (see, for example, Prigogine 1999).
69 Somewhat similar, but less radical, is the approach of Robert Batterman, a philosopher of physics who has written extensively on questions related to phase transitions, the renormalization group and the thermodynamic limit (Batterman 2002, 2005, 2010, 2011, 2017, 2019). Rather than formulating a novel form of the mechanics underlying statistical mechanics, his argument, following the lead of Kadanoff (2013a), is that the renormalization group is itself a novel approach, revolutionary in the sense of Kuhn (1963), which has the thermodynamic limit built in. His starting point is that thermodynamics 70 "is correct to represent [phase transitions] mathematically as singularities." (A: Batterman 2005, p. 234.) And: "Further, without the thermodynamic limit, statistical mechanics would completely fail to capture a genuine feature of the world. Without the thermodynamic limit, in fact, statistical mechanics is incapable even of establishing the existence of distinct phases of systems." (B: op. cit.) If there is any doubt about his view of real systems, this is dispelled by his forthright assertion that he wants "to champion the manifestly outlandish proposal that despite the fact that real systems are finite, our understanding of them and their behaviour requires, in a very strong sense, the idealization of infinite systems and the thermodynamic limit." (C: ibid, p. 231.) 'Outlandish' or not, his position is one which would appear, in our experience, to be that adopted implicitly or explicitly by many working physicists, including, albeit in a radical sense as indicated above, Prigogine, and Kadanoff (2000, p. 238), who asserts that the "existence of a phase transition requires an infinite system. No phase transitions occur in systems with a finite number of degrees of freedom." Kadanoff calls this the "extended singularity theorem" (Kadanoff 2013a, pp.
154-156) because "these singularities have effects that are spread out over large regions of space" (Kadanoff 2013b, p. 24). Having asserted that "the idea that we can find analytic partition functions that "approximate" singularities is mistaken, because the very notion of approximation required fails to make sense when the limit is singular, [which it is in this case because the] behaviour at the limit (the physical discontinuity, the phase transition) is qualitatively different from the limiting behaviour as that limit is approached." (D: ibid, p. 236) Batterman's proposal for resolving the puzzle is to resort to the renormalization group. In the next section this possibility is examined.
68 To preview Sect. 4, this is the position we shall defend. 69 This involves an extension of the Koopman (1931) formulation to a space beyond the Hilbert space in which it is set. 70 For reference in the summary (a)-(f) of his position on the renormalization group in Sect. 3.5.2 the quotations from Batterman's work are given labels A-F.

Infinite Systems and the Renormalization Group
'Infinity' as it arises in accounts of renormalization group methods consists not so much in the limiting process, evident in, say, Onsager's solution of the zero-field two-dimensional Ising model, whereby the dimensions of the system are taken to infinity, but rather in the perception that to make the method intelligible one must be working with a system which is already infinite (Palacios 2018, 2019). To spell this out, a renormalization group scheme consists of the following:
(i) In the space of couplings (or of functions thereof) a semigroup of transformations is derived which generates recurrence relationships under which any critical regions are invariant.
(ii) In this 'dynamic system' the critical regions are the basins of attraction of critical fixed points. And there are sinks associated with non-critical regions (phases) of the system.
(iii) A critical fixed point determines the universality class of the system at each point in its basin of attraction, with an associated set of critical exponents.
(iv) In general a system may be able to be in more than one universality class determined by the symmetry group of the Hamiltonian when there is a particular relationship between the couplings. 71

It is clear that this way of doing statistical mechanics is very different from the standard procedures (mean-field and other classical approximations, series expansions and exact solutions). So much so that, as we have already indicated, it is characterized by Kadanoff (2013a) as a Kuhn-type revolution, a view endorsed by Batterman (2017). The argument presented by Batterman concerning the whole question of singularities/real singular systems/the thermodynamic limit needs to be carefully rehearsed and for this his 2017 tribute to Leo Kadanoff provides the clearest account. He presents his view in contradistinction to that of Jeremy Butterfield, who contends (Butterfield 2011b, p. 1077) that: "The use of the infinite limit . . .
is justified, despite N being actually finite, by its being mathematically convenient and empirically correct (up to the required accuracy)." For an understanding of Batterman's view two quotes are particularly useful. In the first he asserts that: "the RG is not just a theory of the critical point, but rather it is a theory of the critical region. And this covers large but finite systems. So contrary to the line of reasoning presented [by Butterfield] the explanation of the behaviour of real finite systems requires the use of mathematical infinities, but does not require there to be infinite real systems." (E: Batterman 2017, p. 571.) At this point we have cause to be grateful to a referee of his paper, who objected that this quote was actually in line with "the claims of those supporting the idea that real phase transitions aren't sharp". In response to this Batterman added a footnote in which he clarified his position in the following way: "It seems to me that if one is going to hold that the use of the infinite limits is a convenience, then one should be able to say how (even if inconveniently) one might go about finding a fixed point of the RG transformation without infinite iterations. I have not seen any sketch of how that is to be done. The point is that the fixed point, as just noted, determines the behaviour of the flow in its neighbourhood. If we want to explain the universal behaviour of finite large systems using the RG, then we need to find a fixed point and, to my knowledge, this requires an infinite system."(F: op. cit.) 
So to summarize his view (using the labelled quotes A-F, given above): And to summarize the summary of Batterman's position: Although phase transitions in real systems are accompanied by singular behaviour, and in statistical mechanical models this singular behaviour is exhibited only by infinite systems, we don't need infinite systems, just the use of mathematical singularities, these being required to derive the fixed points in renormalization group calculations.
At this point we wish to challenge the last part of this statement by providing 72 the 'sketch' that Batterman (quote F) requires of the means of the determination of renormalization group fixed points.
The first thing to note is that the recurrence relationships (3.37) and (3.38) are derived (almost always with some approximations involved) between the couplings of two finite systems with sizes $N$ and $\tilde{N}$ with $N/\tilde{N} = \lambda^d > 1$. Once this is done no point in the space of couplings is intrinsically associated with a system of a particular size and, by the same token, fixed points, obtained from (3.37) with $\tilde{\zeta}_j = \zeta_j$, are not associated with infinite systems. However, if we were to choose to associate a particular system-size $N$ with the first point of a trajectory, it would be necessary to assume only that we are working with a system large enough to allow the required number of iterations. 73 (Hence the inclusion of SM2 in the path from SM1 to SM4 in Fig. 1.) As Norton (2012, p. 222) says, fixed points are the "limit points" of the sequences generated by the recurrence relationships; the "mathematical pegs on which to hang limit properties" which are never reached in a finite number of iterations. They do not arise from an investigation of the properties of infinite limit systems, and, although they are properties of the transformation, iteration is not always needed for their determination. In some simple cases, like the one-dimensional Ising model described in Sect. 3.4.3, the fixed points can be extracted by direct analytic solution of the fixed point equations. But in more complicated cases numerical computation comes into play. Although in principle iteration of the recurrence relationships starting from a point in the basin of attraction of a fixed point will generate a sequence of points approaching the fixed point, this is not usually a viable strategy for their determination, since the fixed points of greatest interest, those associated with critical regions, have both irrelevant directions of attraction within the critical region (the basin of attraction) and relevant directions along which the trajectory is driven away from the critical region. Except in special cases it is difficult to start a trajectory in a critical region, but starting at a nearby point is possible and useful. Then the trajectory will hover near the critical fixed point before it moves away to the sink associated with the phase containing the trajectory. These 'hover points' can be spotted by inspection of the computer output and used as initial guesses for a numerical solution of the fixed point equations. These kinds of numerical techniques, used also to map the critical regions themselves, provide a good picture of the whole phase diagram. And linearization of the recurrence relationships about the fixed points allows the critical exponents to be obtained.

72 Based on practical experience (Young and Lavis 1979; Lavis 1979, 1980; Lavis et al. 1982; Lavis and Southern 1984).
73 Here we are agreeing with Batterman that an infinite system is not required.
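The strategy of spotting a hover point and then solving the fixed point equations numerically can be sketched as follows. Everything specific in the example is an assumption made for illustration: the two-coupling map is hypothetical (it is built to have a critical fixed point with one relevant and one irrelevant direction), the hover point is identified as the first local minimum of the step length, which stands in for inspection of the computer output, and the refinement uses Newton's method with a finite-difference Jacobian.

```python
import math


def toy_map(z):
    """Hypothetical two-coupling recurrence, used only for illustration."""
    z1, z2 = z
    return (0.5 * math.log(math.cosh(4.0 * z1)) + 0.05 * z2, 0.5 * z2)


def step_length(z):
    """Distance moved by one application of the recurrence."""
    w = toy_map(z)
    return math.hypot(w[0] - z[0], w[1] - z[1])


def find_hover_point(z, max_iter=50):
    """Iterate until the step length stops shrinking and return the point
    at which it was smallest (the 'hover point')."""
    prev = step_length(z)
    for _ in range(max_iter):
        z_next = toy_map(z)
        cur = step_length(z_next)
        if cur > prev:
            return z
        z, prev = z_next, cur
    return z


def newton_fixed_point(z, tol=1e-12, max_iter=50, h=1e-6):
    """Solve toy_map(z) - z = 0 starting from the initial guess z."""
    for _ in range(max_iter):
        fz = toy_map(z)
        g = (fz[0] - z[0], fz[1] - z[1])
        if math.hypot(*g) < tol:
            break
        # Finite-difference Jacobian of g(z) = toy_map(z) - z.
        J = [[0.0, 0.0], [0.0, 0.0]]
        for j in range(2):
            up, dn = list(z), list(z)
            up[j] += h
            dn[j] -= h
            f_up, f_dn = toy_map(tuple(up)), toy_map(tuple(dn))
            for i in range(2):
                J[i][j] = (f_up[i] - f_dn[i]) / (2.0 * h) - (1.0 if i == j else 0.0)
        det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
        # Newton step: solve J dz = -g for the 2x2 system.
        dz0 = (-g[0] * J[1][1] + g[1] * J[0][1]) / det
        dz1 = (g[0] * J[1][0] - g[1] * J[0][0]) / det
        z = (z[0] + dz0, z[1] + dz1)
    return z


hover = find_hover_point((0.30, 0.2))
fixed_point = newton_fixed_point(hover)
print(hover, fixed_point)
```

No infinite sequence of iterations is involved: a few steps of the recurrence locate the hover point and a few Newton steps then converge on the fixed point, about which the recurrence can be linearized to obtain exponents.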

Phase Transitions in Finite Systems: Mainwood's Proposal
Given, as we have concluded in the previous section, that the thermodynamic limit is not necessary to enable renormalization group calculations to provide the PTCP structure, is it still useful in other statistical mechanical treatments of PTCP? An assessment of usefulness, as distinct from necessity, is obviously heavily influenced by the position adopted with respect to whether PTCP occur in nature as singularities (P-IB). If it is false and real systems, by virtue of their size ($\sim 10^{23}$ microsystems), exhibit behaviour approximating to singular behaviour, in the sense, say, that the maximum in the compressibility of a fluid is experimentally indistinguishable from a singularity, then we have the means to remove the contradiction in the set of statements at the beginning of Sect. 3.5.1. One way would be to deem it unnecessary for PTCP to be treated as singularities in thermodynamics (a denial of P-IIB). Although this would allow thermodynamics and statistical mechanics to model PTCP in the same way (for P-IIIB to be accepted) we would argue, for the reasons given in Sect. 4, that it is not a tenable possibility. The alternative, which is the one discussed in this section, and which is favoured by ourselves, is to accept that thermodynamics must represent PTCP in terms of singularities (P-IIB) on the basis that this is an appropriate approximation to real systems, thus rejecting the assertion that thermodynamics and statistical mechanics must model PTCP in the same way (P-IIIB), since statistical mechanics models phase transitions in finite systems. Given that real systems are very large (in terms of the number of microsystems) and finite, with phase transitions giving the appearance, but not the exact reality, of singularities, can calculations avoid using the thermodynamic limit? Or, more generally, can recourse to a system where PTCP occur as singularities be avoided?
Here we examine a proposal of Mainwood (2006) which definitely answers the question in the negative and in the next section we propose an answer which is more nuanced.
The definition of a phase transition provided by Mainwood (ibid, p. 28) can 74 be described in the following way. For a statistical mechanical system $S_N$ of size $N$ with partition function $Z_2(\zeta_1, \zeta_2, N)$, the free energy $\Phi_2(\zeta_1, \zeta_2, N)$ is given by (3.2) and satisfies (2.5) and (2.6). 75 Suppose that the thermodynamic limit exists, with $\phi_2(\zeta_1, \zeta_2)$ the free-energy density of the system $S_\infty$. Then:
Definition 2 $(\zeta_1, \zeta_2)$ is a point with a particular criticality for $S_N$ iff $(\zeta_1, \zeta_2)$ is a point where $S_\infty$ has a singularity associated with this same criticality.
And Mainwood (ibid, p. 29) asserts that: 76 "Rather surprisingly, using this definition it is possible to hold on to all of Callender's four statements [(given above as CP-I to CP-IV)] without contradiction; though only in a Pickwickian sense - it is a "trick" possible only due to his choice of wording. Namely, the singularity referred to in [CP-III] is one not in the partition function [of $S_N$] but in [the partition function of $S_\infty$]."
If this is regarded as a positive point in favour of Mainwood's definition, the overall conclusion seems to be more mixed. Mainwood 'worries' that: 77
(1) The definition means that a phase transition can be predicted in a finite system, however small it might be (ibid, p. 32).
(2) "While there exist standard procedures for taking the thermodynamic limit, . . . these procedures are human inventions, and choices could have been made differently. . . . The definition of a phase transition thus seems arbitrary in a disastrous sense: we can choose whether one is occurring or not by modelling it differently, or taking the limit according to a different scheme" (ibid, p. 31).
(3) "[T]he facts we need to decide whether or not [a physical system] is undergoing a phase transition should be physical facts, about actual states of affairs . . . They should not exist only in an idealized model on a theoretician's blackboard" (ibid, p. 29).
74 With some changes of notation to give conformity with our usage.
75 We have chosen the system with two independent couplings for convenience.
76 In relation to both this assertion and CP-III, the following quibble might not be out of place. The thermodynamic limit is taken for thermodynamic functions which are approximately extensive for large systems and become extensive in the thermodynamic limit (Griffiths 1964). The partition function is not of this sort, as one can see by using a little 'reverse engineering' to define the partition function of $S_\infty$ as $Z_\infty(\zeta_1, \zeta_2, N) := \exp\{-N \phi_2(\zeta_1, \zeta_2)\}$. Apart from the retained dependence on $N$, a singularity which is an infinity of $\phi_2(\zeta_1, \zeta_2)$ would be a zero of $Z_\infty(\zeta_1, \zeta_2, N)$. In fact this brings to the fore a problem with CP-III. Phase transitions do not correspond to points "when the partition function has a singularity". For (say) a lattice system the partition function is, for a finite system, a polynomial whose zeros give singularities of the free energy, none of which lies on the positive real axis. In the thermodynamic limit a phase transition corresponds to a point of accumulation of zeros on the real axis. The quibble is resolved by replacing 'partition function' by 'free energy' in CP-III and Mainwood's assertion.
77 It is convenient to take his worries in reverse order.
Although Mainwood adds (1) as a final difficulty it is probably the one which would first spring to mind, since the definition would imply a phase transition in an Ising model of four spins in a square at the critical temperature given by Onsager's solution. Mainwood thinks that "this bullet can and should be bitten" (ibid, p. 32), but the consequences are not, we think, ones which would recommend themselves to any working physicists; not to put too fine a point on it, they would bring chaos to discussions of critical phenomena. The tractable alternative, also suggested by Mainwood, is to restrict the definition to large systems. 78 This would seem to us to be an inevitable step, but it also has consequences which we discuss in more detail below.
At one level both (2) and (3) are examples of the standard concern with respect to modelling, namely that we may have a poor model, one which does not give results that agree with experiment. And Mainwood's response to this is, as would be expected, that we should find a better model. But worry (3) also contains a second element, namely that his definition contains the use of a counterfactual, an idealized infinite model. His argument here is more complex and draws on a strong parallel with Lewis's (1986) analysis of counterfactuals. On this basis he argues that "it is the character of [the real finite system] that determines the nature of the infinite system that we then consider. When we draw conclusions about the nature of the phase transitions, they are conclusions about the character of [the real finite system], but by reference to the infinite model we can express them in a concise and illuminating form" (ibid, p. 30).
However we have worries of our own which do not seem to concern Mainwood. These can best be described by considering the transfer matrix treatment in Sect. 3.3, where, if we restrict attention to the two-dimensional square-lattice spin-1 taken as incipient singularities 80 and for increasing NV show good agreement with the Onsager result, which is the case NV = ∞.
However, the prescription to be applied by the Mainwood proposal is that their critical temperatures, for all NV, are the Onsager value. This would seem to us to reverse the order in which physicists work. We think it is probably true to say that, with notable exceptions like Kadanoff (2009, 2013b), physicists involved in model calculations do not consider whether their interest is in very large systems or infinite systems. Their concern is whether a phase transition occurs. If they suppose that it does, one tool 81 to determine its location is to use transfer matrix calculations (Runnels and Combs 1966; Runnels et al. 1967; Bellemans and Nigam 1967; Lavis 1976). The method is to determine incipient singularities for as large a vertical width of system as possible as an estimate for the transition temperature for a very large/infinite width. Here one cannot use Mainwood's prescription to assign the infinite-width result to the finite-width systems, since the former is not known. 82 When, as in the case of the zero-field spin-1/2 Ising model, the infinite-width result is known exactly or has been determined to a good approximation by series methods, the motivation for determining finite-width results is to test the efficacy of the method, or to cross-check with other results.
In his discussion of Mainwood's proposal Butterfield (2011b, p. 1130) states it in a more restricted form. Again using our notation this is:
Definition 3 A phase transition occurs in $S_N$ iff $S_\infty$ has non-analyticities.
This Mainwood-Butterfield proposal has the advantage that it doesn't project a result from the infinite system onto finite systems of any size (or maybe onto just large-size systems). However, given that it asserts the existence of a phase transition in a finite system of any size N, where does this occur? At the maximum of one of the response functions (heat capacity or susceptibility/compressibility), or by extraction from the behaviour of the ratio of the two largest eigenvalues of the transfer matrix? These will all give different results, as will also the results of taking the limits in different ways and for differing numbers of dimensions, all of which in turn will differ with N. If all these values are taken to be estimates of some 'true' value, will this be N-dependent or the same for all N, including presumably N = ∞, when we would be back with the problems of Mainwood's original proposal?

Phase Transitions in Large Systems: Our Proposal
As we shall see, our discussion in previous sections of the structure of thermodynamics and of statistical mechanics in general, and of PTCP in particular, will allow us to paint a more nuanced and quantitative picture of their relationship than that provided by previous approaches. In particular we are concerned with the role in that relationship played by large finite systems. Mainwood suggests that we 'bite the bullet' by countenancing the possibility of phase transitions in small systems. However, we suggest that he is proposing to bite the wrong bullet. The one which should be bitten is the need for a criterion giving a demarcation in system size between small systems and large systems, and our proposal, which uses the discussion of finite-size scaling in Sect. 3.4.2, is intended to encompass this need.
80 As we have indicated in Sect. 3.3, maxima in other response functions and also the behaviour of the ratio of the two largest eigenvalues of the transfer matrix can also be used as identifiers of incipient singularities.
81 Among others, including duality transformations and series expansions.
82 Although one can, in some cases, prove the existence of a phase transition, even if the transition temperature is not known (Peierls 1936; Heilmann 1972, 1980; Heilmann and Huckaby 1979).
Thermodynamics, on the one hand, characterises PTCP in terms of singularities of thermodynamic functions, which may occur at special values of externally controllable parameters. This characterisation appears, at first sight, to be warranted by the phenomenology of phase transitions as they are observed in nature: apparent discontinuities of thermodynamic functions at first-order phase transitions, and apparent algebraic singularities of thermodynamic functions including divergent response functions at second-order phase transitions. In statistical mechanics, on the other hand, singularities of thermodynamic functions can emerge only in the limit of infinite system size. As realistic systems are clearly of finite size, this creates an internal inconsistency in the list P-I to P-VI of propositions given above, if indeed the characterisation of PTCP as they occur in nature in terms of singularities (that is proposition P-IB) is accepted.
Our aim now is to present an argument, based on the account of finite-size scaling in Sect. 3.4.2, which shows that this inconsistency can be resolved within statistical mechanics and in a fully quantitative manner. In Sect. 3.4.2, and also here, discussion is restricted to a system with a thermal coupling θT and a magnetic coupling θH, in the cases where (i) it is fully-finite with thickness ℵ and (ii) it is fully-infinite with ℵ = ∞. In case (ii) on the zero-field axis H = 0, θH = 0 there is a critical temperature T = Tc with θT = 0 where response functions are singular. There is no singularity in the finite system but maxima appear in the response functions. We now summarize the relevant conclusions of finite-size scaling:
FSS-I: In the thermodynamic limit ℵ → ∞ when θT is small, but not infinitesimal, the asymptotic form for the susceptibility at T = Tc, given by (3.53), has a singular component with exponent γ, but amplitudes which, by virtue of the presence of an irrelevant field θ⋆, are dependent on θT.
FSS-II: As θT → 0, the influence of θ⋆ becomes negligible and the susceptibility exhibits a pure power-law singularity at T = Tc as described by (3.55).
FSS-III: When ℵ is finite there is no singular behaviour and two temperatures are defined: 83 the shift temperature T(ℵ) where the susceptibility has a maximum and the rounding temperature T(ℵ) at which the profile of the susceptibility in the finite system begins to diverge from that in the infinite system.
This renormalization group scaling approach to the description of critical phenomena thus explains, in a quantitative way, how singularities that might occur in infinite systems are smoothed out by finite-size effects. This, being fully in line with the fundamental observation that statistical mechanical systems of finite size cannot exhibit any singularities, resolves the inconsistency in the list of propositions P-I to P-VI.
In particular FSS-IV gives a quantitative measure of the deviations of critical phenomena, as observed in finite systems, from the behaviour expected for infinite system size. From (3.51), deviations from critical behaviour characteristic of the infinite system will be observable in a narrow region around the infinite system critical point. This, however, is precisely the region where one would stand a chance of observing asymptotic singular behaviour, as only in this region is the influence of irrelevant scaling fields on PTCP expected to be sufficiently small. In order to observe asymptotic critical singularities it is thus required that |θT| be sufficiently small to keep corrections to asymptotic critical singularities due to irrelevant scaling fields under control, but also not too small, in order to prevent finite-size corrections from becoming significant. As the range of θT within which finite-size corrections dominate critical behaviour shrinks with system size ℵ like $\aleph^{-1/\nu}$, one has to choose systems sufficiently large in a quantitatively well-defined sense in order to be able to observe asymptotic critical singularities characteristic of the respective universality class of a system. In the context of the list P-I to P-VI of propositions, it is important to realise that the characterisation of PTCP in terms of singularities of thermodynamic functions constitutes an extrapolation of empirical observations, as properly establishing the existence of a discontinuity of a thermodynamic function would require experimental control of infinite precision, while establishing a divergence of a response function would require an actual measurement of an infinite quantity. Neither requirement can conceivably be met in any realistic experiment.
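The interplay of peak height, shift and rounding can be illustrated with a generic finite-size scaling ansatz. The sketch below is not a reproduction of (3.51)-(3.53): it assumes the standard form χ = ℵ^(γ/ν) f(θT ℵ^(1/ν)), uses the exact two-dimensional Ising exponents γ = 7/4 and ν = 1, and adopts a purely hypothetical smooth scaling function f whose tail reproduces the bulk power law. The names `scaling_function`, `susceptibility`, `peak` and the constant `X0` are illustrative assumptions.

```python
GAMMA = 1.75  # two-dimensional Ising susceptibility exponent
NU = 1.0      # two-dimensional Ising correlation-length exponent
X0 = -1.0     # hypothetical position of the peak in the scaled variable


def scaling_function(x):
    """Hypothetical smooth scaling function with the bulk tail |x|^(-GAMMA)."""
    return (1.0 + (x - X0) ** 2) ** (-GAMMA / 2.0)


def susceptibility(theta, size):
    """Assumed ansatz chi = size^(GAMMA/NU) * f(theta * size^(1/NU))."""
    return size ** (GAMMA / NU) * scaling_function(theta * size ** (1.0 / NU))


def peak(size, points=2001, span=5.0):
    """Locate the susceptibility maximum on a grid whose width scales as size^(-1/NU)."""
    unit = size ** (-1.0 / NU)
    thetas = [(-span + 2.0 * span * i / (points - 1)) * unit for i in range(points)]
    theta_max = max(thetas, key=lambda t: susceptibility(t, size))
    return theta_max, susceptibility(theta_max, size)


for size in (16, 32, 64, 128):
    theta_max, chi_max = peak(size)
    print(f"size = {size:4d}  peak position = {theta_max:+.5f}  peak height = {chi_max:.2f}")
```

Under these assumptions the maximum grows like ℵ^(γ/ν) while its position and width shrink like ℵ^(−1/ν), and at fixed θT the bulk power law |θT|^(−γ) is recovered once ℵ is large enough. No single size shows a singularity; the incipient singularity is visible only in the comparison between sizes.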
Given that realistic systems contain $O(10^{23})$ constituents, the linear dimension ℵ of such systems, measured in terms of atomic distances, is very large and the temperature range over which finite-size corrections to singular behaviour would manifest themselves will be very small. It is thus understandable that such effects have been beyond experimental resolution. 84 On the other hand, in computer simulations of statistical mechanical systems, one can handle only relatively small systems, and finite-size roundings of critical singularities are therefore quite prominent. In such situations such roundings, as predicted (and captured) by finite-size scaling, are indeed observed and routinely used to extract asymptotic critical exponents from finite-size data (Binder 1981). The renormalization group and its formulation of finite-size scaling theory thus predict, in a quantitative way, both the emergence of critical singularities, described as pure power-law singularities sufficiently close to an infinite system critical point, 85 and their shifting and rounding in systems of finite size.
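An order-of-magnitude estimate makes the contrast between laboratory samples and simulations concrete. The sketch assumes a three-dimensional system in the Ising universality class with ν ≈ 0.63, a cubic sample so that ℵ is the cube root of the number of constituents, and a non-universal amplitude set to one. All three are assumptions made for illustration, and the function names are hypothetical.

```python
NU_3D = 0.63  # approximate three-dimensional Ising correlation-length exponent


def linear_size(constituents, dimension=3):
    """Linear dimension, in atomic distances, of a cubic sample (assumed geometry)."""
    return constituents ** (1.0 / dimension)


def rounding_window(size, nu=NU_3D, amplitude=1.0):
    """Reduced-temperature range ~ amplitude * size^(-1/nu) where finite-size effects dominate."""
    return amplitude * size ** (-1.0 / nu)


macroscopic = rounding_window(linear_size(1e23))  # laboratory-sized sample
simulated = rounding_window(32)                   # a small simulated lattice of linear size 32
print(f"macroscopic sample: |theta_T| of order {macroscopic:.1e}")
print(f"simulated lattice:  |theta_T| of order {simulated:.1e}")
```

With these inputs the window for a macroscopic sample is many orders of magnitude narrower than for the simulated lattice, which is consistent with rounding being prominent in simulations and invisible in the laboratory.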
According to our definition of an incipient singularity (Def. 1, above) such will occur in a finite system at certain values of its external parameters, if at those values thermodynamic functions exhibit properties that have no finite limits as the system size ℵ is increased. This could be a steep increase in the slope of magnetization as a function of the external field across the zero-field axis at low temperatures, as shown in Fig. 9, which is indicative of the possibility of a first-order transition in the infinite system. Or it could be the size-dependent height of the maximum of a response function as shown in (3.52) with ω > 0, which is indicative of the possibility of a second-order transition in the infinite system. However it is important to note that an assertion of the occurrence of an incipient singularity in a finite system can never be made with absolute certainty by looking at the behaviour of a single system of any fixed finite size, but only by comparing the behaviour of systems of different sizes. That said, our investigations have now provided us with a well-defined notion of a large system:
Definition 4 For a system to be counted as large it must be big enough to exhibit a range of values of a thermodynamic variable (for example, the temperature) within which the following two phenomena can both be avoided: (i) the corrections to scaling (due to the existence of non-zero irrelevant scaling fields), the avoidance of which requires the system to be close to an incipient singularity, (ii) the noticeable finite-size corrections in a close neighbourhood of an incipient singularity (due to a finite value of ℵ), the avoidance of which requires the system to be sufficiently far away from an incipient singularity.
Although, as we saw above, these two conditions pull in opposite directions this tension will become less acute as the system size increases. For such systems incipient singularities will be observable in a range of temperatures (or couplings), which are described by the asymptotic critical exponents of infinite systems. These exponents describe incipient singularities which will never fully materialize in a system of finite extent. They do, however, provide an economy of description, and lead to a classification of systems according to their universality class, as described earlier.
Quite often the full complexity of the crossover between behaviour described by asymptotic critical exponents and finite-size rounding of thermodynamic functions is far beyond the capabilities of available analytic tools. Taking the thermodynamic limit in a statistical mechanical analysis of a system is also often 86 the only way to carry the calculation through to its end.
The renormalization group approach to PTCP actually plays a dual role in the analysis of critical phenomena. 87 On the one hand it provides micro-reductive methods, firmly embedded in the arsenal of techniques of statistical mechanics, to evaluate critical exponents for given statistical mechanical systems, albeit in most cases only approximately. On the other hand it embodies a new way of looking at such systems, by describing statistical properties of systems at different length scales. It is this radically new way of analysing systems which allows it to put systems with different microscopic properties into a common context, which in turn leads to the identification of fixed points and their basins of attraction as universality classes, thereby revolutionizing the analysis of critical phenomena.
It is perhaps appropriate to add a final twist. Asymptotic critical exponents characterising singularities at phase transitions as they would occur in infinite systems, including exponents that describe corrections to scaling due to irrelevant scaling fields, are obtained from the eigenvalues of a renormalization group transformation that is linearized in the vicinity of (one of) its fixed points. They are thus obtainable without ever touching or contemplating systems of infinite size! As we have seen in our discussion above, these critical exponents also govern the way in which finite-size corrections to critical phenomena manifest themselves. In some sense, therefore, it would be fair to say that critical exponents are bona fide properties of finite systems, rather than, as mostly discussed, simply properties of potentially infinite systems.
The aim of our analysis has been to eliminate some of the confusion that has characterised much of the discussion surrounding PTCP in the philosophical (and physics) literature. To summarize our position:
• It cannot be denied that phase transitions occur in nature. (P-IA is accepted).
• The assertion that they are characterized by singularities is an unwarranted extrapolation of empirical findings. (P-IB is rejected). (Asserting the existence of a singularity in an experimental result requires infinitely precise experimental control, or an actual 'measurement of the infinite', which is clearly infeasible.)
• Within thermodynamics, there is no choice but to describe phase transitions in terms of singularities. (That is, P-IIA and P-IIB are valid statements about the structure of thermodynamics). Equations of state either have unique solutions, in which case there is no phase transition, or they may exhibit bifurcations in their solution manifolds, in which case singularities and discontinuities arise.
• Phase transitions, as they occur in nature, are correctly described by statistical mechanics, the renormalization group and finite-size scaling. Thermodynamics, on the contrary, is fundamentally incapable of an adequate description as it is, from the outset, conceived as a theory of infinitely large systems. (P-IIIA is accepted but P-IIIB is rejected).
• Investigating systems in the limit of infinite system size provides added value in that it allows one to (i) identify exact asymptotic power laws, which the incipient singularities would follow if system sizes could be taken arbitrarily large, (ii) provide a classification of systems according to their universality class.
Figure 1 is a diagrammatic attempt to encapsulate the relationship between thermodynamics including scaling theory, and the Gibbsian version of statistical mechanics including various approaches to PTCP: the use of the thermodynamic limit, the renormalization group and phase transitions in finite systems. Apart from the formal links spelled out as messages FSM-1, ..., FSM-4 from statistical mechanics to thermodynamics and the connecting relationships FTD-1, ..., FTD-3, provided by thermodynamics to statistical mechanics, there is another element of collaborative interaction, as shown in Fig. 1, in the direction from statistical mechanics to thermodynamics; specifically from the renormalization group to scaling theory. This has two aspects, substantiation and enrichment:
• The Kadanoff scaling relationship (2.25) is introduced as a hypothesis, which is substantiated as (3.44) in renormalization group theory.
• Scaling about an arbitrary origin in a critical region with relevant and irrelevant directions is a consequence of scaling theory. This picture is enriched in renormalization group theory, where scaling origins are not arbitrary, but fixed points of the flow determined by the recurrence relationships, and corresponding to different universality classes. Relevant and irrelevant directions correspond to directions in which a fixed point is repulsive and attractive to the flow. Following a trajectory as it approaches one fixed point, but is finally repulsed towards another, is an example of crossover between different types of critical behaviour, that is between different universality classes.

After-Thoughts on Reduction and Emergence
Having spelled out a picture of the anatomy of thermodynamics and statistical mechanics, as well as the relationships between their different parts, we can now ask what consequences this has for our understanding of reduction and emergence as regards PTCP. The literature on reduction and emergence is vast, even when restricted to the specific area of PTCP. So reviewing and discussing this entire literature is beyond the scope of this work, 88 with a more extensive treatment being the subject of a future paper. Our aim in this section is simply to sketch the main contours of the lie of the land in the light of the picture we have developed, hoping that this will serve as a springboard for further discussions.
To aid our account, we introduce the following terminology. Let TC and TF be two theories, where 'C' stands for 'coarse', meaning less detailed, and 'F' stands for 'finer', meaning more detailed. 89 Intuitively, TC is the theory that is supposed to be reduced to TF. In the terminology that has become standard in the philosophical literature on the topic, TF is supposed to be the reducing theory and TC is supposed to be the reduced theory. We say 'supposed to be' because this is what reductionists would expect. The question is whether this expectation bears out, and if so in what sense of reduction.
88 For surveys of the various positions in the discussion concerning reduction see Hüttemann and Love (2015) and Van Riel and Van Gulick (2019). For surveys of the discussions of emergence see Humphreys (2015) and the contributions of Gibb et al. (2019). For an overview of discussions of phase transitions see Shech (2018).
89 Butterfield (2011a, p. 928) replaces 'C' by 't' to stand for top, tangible or tainted and 'F' by 'b' to stand for bottom, basic or best.
Accounts of reduction might be divided into two broad families, called limit reduction and deductive reduction. 90 We now have a look at each in turn and consider whether they can account for the relation between TC and TF that emerges from our account.

Limit Reduction
The core idea of limit reduction is that TC reduces to TF if the former turns out to be a regular limit of the latter. An example of such a reduction is letting the parameter c, the speed of light in the special theory of relativity, tend toward infinity and thereby recovering classical Newtonian mechanics (Nickles 1973). 91 In general, let us call the relevant parameter α; the limit, denoted as limα, can be toward any value of α, the most frequent cases being α → 0 and α → ∞. Batterman (2020) adds the further requirement that the limit be regular, which means that the relevant formulae in TF approach the relevant formulae in TC smoothly as the parameter approaches the relevant limit value. 92 Taking these elements together yields the following:
Definition 5 Limit Reduction: TC limit-reduces to TF iff limα TF = TC and the limit is regular.
This definition plays an important role in the discussion about the reduction of PTCP because TC is commonly associated with thermodynamics and TF with statistical mechanics. The failure of the limit to be regular as the number of microsystems tends to infinity is then seen as an indication that reduction fails.
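What a regular limit looks like in the relativistic example of Definition 5 can be checked numerically. The sketch below compares the relativistic kinetic energy (γ − 1)mc² with the Newtonian ½mv² as c grows; the choice of kinetic energy as the formula to compare, the unit mass and unit speed, and the function names are all assumptions made for illustration.

```python
import math


def relativistic_kinetic_energy(mass, speed, c):
    """(gamma - 1) m c^2, arranged to avoid subtracting nearly equal numbers."""
    beta2 = (speed / c) ** 2
    root = math.sqrt(1.0 - beta2)
    gamma_minus_one = beta2 / (root * (1.0 + root))  # algebraically equal to 1/root - 1
    return gamma_minus_one * mass * c * c


def newtonian_kinetic_energy(mass, speed):
    """Classical kinetic energy m v^2 / 2."""
    return 0.5 * mass * speed ** 2


def relative_gap(mass, speed, c):
    """Fractional difference between the relativistic and Newtonian formulae."""
    newtonian = newtonian_kinetic_energy(mass, speed)
    return (relativistic_kinetic_energy(mass, speed, c) - newtonian) / newtonian


for c in (10.0, 100.0, 1000.0):
    print(f"c = {c:7.1f}  relative gap = {relative_gap(1.0, 1.0, c):.3e}")
```

The gap closes smoothly, in proportion to (v/c)², with no abrupt change at any stage of the limit. This is the behaviour that the argument above takes to be absent when the number of microsystems tends to infinity at a phase transition.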
How does this argument play out in our scheme? To answer this question we first need to identify certain elements in Fig. 1 with TC and TF. There are two possibilities:
(a) Work within the renormalization group SM4. In this case, as described in Sect. 3.4.1, the limiting process is implemented by the renormalization transformation which applies a succession of reductions in the number of lattice sites. This reduces the fluctuations (and correlation length) away from critical regions, but leaves the essential statistical mechanical structure intact.
(b) Apply the infinite system limit SM2 → SM3. Away from critical regions this removes fluctuations in the uncontrolled extensive variables, but leaves the microstructure and the probability distribution intact.
However, neither of these is a reduction to a version of thermodynamics. Both (a) and (b) are procedures lying entirely within statistical mechanics. That having been said, (b) is probably the closest to the above idea of reduction. However, while it uses the thermodynamic limit, that limit does not take the system to a thermodynamic system, but to an infinite statistical mechanical system (SM3). To arrive at thermodynamics it is necessary to conflate SM3 with TD3. While SM3, like TD3, contains the singular characteristics deemed necessary (by some) for the occurrence of phase transitions, it also has a microstructure which is lacking in TD3.
90 The distinction goes back to Nickles (1973). Batterman (2020) calls them the "physicist's sense of reduction" and the "philosopher's sense of reduction". We avoid this terminology because, as we will argue, physicists do present deductive reductions.
91 We note that the terminology of what reduces to what varies. Nickles says that in taking the limit TF reduces to TC; in keeping with the terminology previously introduced, we say that TC reduces to TF. Nothing in what follows depends on this purely terminological matter.
92 See Berry (2002) and for discussions of singular and regular limits see Butterfield (2011b) and Nguyen and Frigg (2020).
So there is no part of Fig. 1 which involves the kind of limit that would ground a limit reduction. However, far from being a problem, this is simply irrelevant to the issue of the reduction of PTCP. As we have indicated in Sect. 4 the role of the thermodynamic limit is, in the first instance, to provide a condition for maxima in response functions to be incipient singularities; some finite systems do not show PTCP no matter how large they become. In the second instance it provides the critical exponents that can be regarded as properties of the real system. Limits and renormalization group techniques are classification tools that enable us to separate phase transitions into different universality classes.

Deductive Reduction
This notion of reduction is closely associated with Nagel. The broad idea is that TC is reduced to TF if the laws of TC are deducible from the laws of TF and some auxiliary assumptions. A mature formulation of this idea, known as the Generalised Nagel-Schaffner Model of Reduction, is as follows: 93
Definition 6 Deductive Reduction: TC reduces to TF iff there is a corrected version T⋆C of TC such that:
(i) Connectability: If TC contains terms that do not appear in TF, then for every such term there is a bridge law connecting it to a term in TF.
(ii) Derivability: Given the associations in (i), T⋆C is derivable from TF plus bridge laws and, possibly, some auxiliary assumptions.
(iii) Strong Analogy: TC and T⋆C are strongly analogous to one another.
As a simple example, consider the derivation of the perfect gas law PV = NT (given as the second of equations (2.8)) from the kinetic theory of gases. Here the perfect gas law is TC and the kinetic theory is TF. TC contains the term 'temperature', which is not in TF. The bridge law T := 2Uε/(3N) (which is the first of equations (2.8)), with the internal energy U identified as the expectation value of the kinetic energy of the gas, connects this term to TF. T⋆C is the version of the perfect gas law in which, subject to the physical constraints on the system, P, V, and T are variables that can fluctuate (something they cannot do in TC). T⋆C and TC are strongly analogous in that fluctuations are small (to the point of being negligible) in contexts in which TC is applied.
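The sense in which the fluctuating and non-fluctuating versions of the gas law are strongly analogous can be illustrated by sampling. The sketch below draws Maxwell-Boltzmann velocities and forms an instantaneous 'kinetic temperature' 2U/(3N) from the sampled kinetic energy U. It assumes units with Boltzmann's constant and the particle mass equal to one, and sets any further scale factor in the bridge law of (2.8) to one; these choices and the function names are assumptions for illustration only.

```python
import math
import random


def kinetic_temperature(n_particles, temperature, rng):
    """Sample Maxwell-Boltzmann velocities (unit mass, k_B = 1) and return 2U/(3N)."""
    sigma = math.sqrt(temperature)  # standard deviation of each velocity component
    kinetic_energy = 0.0
    for _ in range(3 * n_particles):
        v = rng.gauss(0.0, sigma)
        kinetic_energy += 0.5 * v * v
    return 2.0 * kinetic_energy / (3.0 * n_particles)


def spread(n_particles, temperature, rng, repeats=200):
    """Standard deviation of the instantaneous kinetic temperature over many samples."""
    samples = [kinetic_temperature(n_particles, temperature, rng) for _ in range(repeats)]
    mean = sum(samples) / repeats
    return math.sqrt(sum((s - mean) ** 2 for s in samples) / repeats)


rng = random.Random(0)  # fixed seed so that the run is reproducible
for n in (10, 100, 1000):
    print(f"N = {n:5d}  sample T = {kinetic_temperature(n, 2.0, rng):.3f}  "
          f"spread = {spread(n, 2.0, rng):.4f}")
```

The expectation value reproduces the thermodynamic temperature, while the spread of the fluctuating quantity falls off roughly as the inverse square root of N, so for a macroscopic gas the difference between the two descriptions is negligible in practice.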
The introduction of T⋆C is a concession to practice. Ideally one would be able to derive TC from TF, but that is usually not possible. So one rests content with deriving a theory T⋆C that is not identical with, but strongly analogous to, TC. What does it mean to be strongly analogous? Schaffner (1976) blocks the worry that an appeal to strong analogy is an entry ticket to 'anything goes' by imposing the following two conditions: (a) T⋆C corrects TC in that T⋆C makes more accurate predictions than TC. (b) TC is explained by TF through T⋆C being a deductive consequence of TF and T⋆C being strongly analogous to TC.
93 The original reference is the first (1961) edition of Nagel (1979), of which Schaffner (1976) provides a reformulation. An alternative account by Butterfield (2011a,b) uses the notion of a definitional extension. Our presentation here follows that of Dizadji-Bahmani et al. (2010).
With this in place, we can now ask whether the above schema indicates that a deductive reduction is taking place. For this we first need to know which theories are in play: what is reduced to what? Since we are interested in a reduction of PTCP, we should focus on a version of thermodynamics with PTCP in it. So we set TC := TD3. Then it might seem tempting to choose SM3 as the reducing theory because, in Fig. 1, TD3 'communicates' with SM3. This, however, would be the wrong choice. What we are interested in is the reduction of thermodynamics to a fundamental theory of large systems, and this is SM2. This is because SM2 contains the fundamental principles of statistical mechanics with the only added assumption being that systems are large; so TF := SM2 is appropriate. SM3, by contrast, contains a limit assumption which does not belong in the fundamental theory. So the task we set ourselves here is to check whether the reduction of TD3 to SM2 fits the mould of deductive reduction. We shall argue that it does and, to this end, we now consider how the elements (i)-(iii) of Def. 6 play out in Fig. 1.
For (i) connectivity requires a number of bridge laws. We have avoided this designation for the relationships FTD--1, FTD--2 and FTD--3 in Fig. 1, preferring to call them 'inter-theory connections'. However, we shall now consider the possibility that they can assume the role of bridge laws as required in the present context. The paradigmatic example of a bridge law in the philosophical literature is provided, as indicated above, by the perfect gas. There the bridge law identifies the temperature in statistical mechanics using the underlying identification of the expectation value of the kinetic energy of the gas with its internal energy. But, on closer examination, this example glosses over two other identifications, of volume and pressure. 94 In a perfect gas contained in a cylinder closed by a movable piston, the piston position will fluctuate; that is to say, from a statistical mechanical point of view, the volume of the gas is a fluctuating quantity. So, just as the internal energy must be identified with the expectation value of the kinetic energy, the thermodynamic volume must be identified with the expectation value of the statistical mechanical volume. Other instances of the same kind are provided by other systems and they are all covered by FTD--3, which in the current context plays the role of a bridge law. In the case of the perfect fluid, the identification of the internal energy with the expectation value of the kinetic energy, and of the thermodynamic volume with the expectation value of the statistical mechanical volume, is sufficient to provide a bridge for temperature, pressure and entropy (via the Sackur-Tetrode formula) and consequently for all other thermodynamic variables, as described by the connecting relationships FTD--1 and FTD--2. These could, therefore, be regarded as consequences of the underlying bridge law FTD--3, rather than as bridge laws in their own right.
In more complicated situations, where there is a need to connect a larger set of thermodynamic and statistical mechanical variables, it is a reasonable economy to regard them, together with FTD--3, as comprising an exhaustive set of bridge laws.
94 And also of the number of particles of the gas if it is enclosed in a permeable container.
For (ii) by definition T ⋆ C is a corrected version of TC that can be derived from TF plus bridge laws. In the current context T ⋆ C is a version of TD3 in which the relevant quantities are allowed to fluctuate, and the fluctuations show roughly the pattern given in SM2 (but without T ⋆ C containing any of the microstructure of matter specified in statistical mechanics). It is obvious that T ⋆ C thus defined is a deductive consequence of SM2: it is obtained simply by applying the bridge laws to SM2. 95
For (iii) we need to show that T ⋆ C and TC stand in the proper strong analogy relationship. In effect the derivation of SM3 from SM2 through the thermodynamic limit and the fact that SM3 corresponds to TD3 amounts to saying that there is a strong analogy between SM2 and TD3. However, a more detailed analysis is useful and for this we check whether Schaffner's two criteria are satisfied:
For (a) the messages FSM--1 and FSM--2 are relevant. FSM--1 asserts that uncontrolled thermodynamic variables fluctuate with variances of O(N) related to response functions. This means that the variances of the corresponding density variables are O(1/N). That these fluctuations are small for large systems is related to, but not exactly equivalent to, the fact, asserted in FSM--2, that extensivity is an approximate property of large systems. So T ⋆ C modifies TC by replacing equality in the basic relationship with approximate equality, valid when the system is large. It also contains fluctuation-response function relationships between fluctuations, which are recognised in T ⋆ C but not in TC, and response functions, which appear in both. Thus T ⋆ C makes more adequate predictions than TC because real systems do show fluctuations.
For (b) the way that TC := TD3 is explained by TF := SM2 follows straightforwardly from Sect. 4 once the bridge laws are accepted and we have in place the definition of an incipient singularity (Def. 1).
Maxima in response functions are identified as incipient singularities if they map into real singularities in the thermodynamic limit, which is the step from SM2 to SM3. And, as we have already noted, TD3 communicates with SM3 in the sense that it communicates its understanding of the singularities in SM3 to TD3.
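The content of FSM--1 used under (a) can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it assumes a zero-field Ising ring with first-neighbour coupling J, units with k_B = 1, and brute-force enumeration of all microstates. The function names `ring_energy`, `energy_moments` and `heat_capacity` are hypothetical. It checks the standard canonical relation Var(E) = T² C between an energy fluctuation and a response function, and that the variance of the energy density E/N shrinks as N grows.

```python
import itertools
import math


def ring_energy(spins, J=1.0):
    """Energy of a zero-field Ising ring: -J times the sum over first-neighbour pairs."""
    n = len(spins)
    return -J * sum(spins[i] * spins[(i + 1) % n] for i in range(n))


def energy_moments(n, T, J=1.0):
    """Return (<E>, Var(E)) for an n-site ring at temperature T (k_B = 1).

    All 2**n microstates are enumerated, so this is only usable for small n.
    """
    z = e1 = e2 = 0.0
    for spins in itertools.product((-1, 1), repeat=n):
        e = ring_energy(spins, J)
        w = math.exp(-e / T)  # Boltzmann weight of this microstate
        z += w
        e1 += e * w
        e2 += e * e * w
    mean = e1 / z
    return mean, e2 / z - mean * mean


def heat_capacity(n, T, J=1.0, h=1e-4):
    """Heat capacity C = d<E>/dT, estimated by a central finite difference."""
    up, _ = energy_moments(n, T + h, J)
    down, _ = energy_moments(n, T - h, J)
    return (up - down) / (2.0 * h)


if __name__ == "__main__":
    T = 1.5
    for n in (6, 8, 10, 12):
        _, var = energy_moments(n, T)
        c = heat_capacity(n, T)
        # Var(E) and T^2 C agree; Var(E/N) = Var(E)/N^2 decreases with N.
        print(n, var, T * T * c, var / n**2)
```

In this toy case the variance of the extensive quantity grows with N while that of the density falls, which is the pattern that T ⋆ C recognises and TC does not.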
From the above we conclude that TD3 reduces to SM2 in the sense of deductive reduction. However, the structure of Fig. 1 prompts a consideration of the possibility of a further relationship of deductive reduction higher in the figure. In particular, do TC := TD4 and TF := SM4 satisfy the required conditions? 96 It is straightforward to see that connectability and derivability, where T ⋆ C is a version of TD4 that has certain of the features of SM4 built into it, are satisfied as before. Scaling in TD4 is a phenomenological means of capturing the structure of the way thermodynamic functions in critical regions depend on variables (in the form of homogeneous functions of controllable variables). It can in a sense be regarded as being built from renormalization group theory with the scaffolding removed. This is what we referred to above as the substantiation of scaling theory by the renormalization group. On the other hand, the values of critical exponents and the interpretation of the origin of scaling as the fixed point of a semi-group transformation are absent from TD4 but present in SM4. In that sense the latter provides an explanation or enrichment of the former.
95 Terminological note: the term 'corrected', which is customary in the discussion of reduction, is somewhat ill-chosen, because it might suggest that TD3 is in some way faulty, which it is not. It is in fact one of the most successful and enduring models in physics. The term 'corrected' here should be read in an unemphatic (and non-pejorative) way, as simply indicating that conditions (a) and (b), listed above, are met.
96 Replacing SM4 with SM5 would rather complicate the situation, since TD4 has extensivity in all d dimensions, whereas this is the case for only d dimensions (which includes the fully-finite case d = 0) in SM5.
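As an illustration of what 'homogeneous functions of controllable variables' amounts to, the following sketch writes out a scaling hypothesis of the usual generalized-homogeneous form. The notation is an assumption made here for illustration and may differ from that used for TD4 elsewhere in the paper: φ is the singular part of a free-energy density, θ₁ and θ₂ are two variables that vanish at the critical point, y₁ and y₂ are positive exponents and λ > 0 is arbitrary.

```latex
% Assumed notation (not taken from the paper): \phi, \theta_1, \theta_2, y_1, y_2, \lambda.
\begin{align}
  \phi\bigl(\lambda^{y_1}\theta_1,\, \lambda^{y_2}\theta_2\bigr)
    &= \lambda^{d}\, \phi(\theta_1, \theta_2), \\
  % Choosing \lambda = |\theta_2|^{-1/y_2} in the first line gives the scaling form:
  \phi(\theta_1, \theta_2)
    &= |\theta_2|^{d/y_2}\,
       \phi\!\left(\frac{\theta_1}{|\theta_2|^{y_1/y_2}},\, \pm 1\right).
\end{align}
```

At the phenomenological level y₁ and y₂ are simply parameters to be fitted. In a renormalization group treatment they would instead arise from the linearized transformation about a fixed point, which is the sense in which the values of the exponents belong to SM4 rather than TD4.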

Emergence
In the case of emergence things are even more difficult than with reduction. As Humphreys notes in a recent review of the field, not only is there no unified framework or account of emergence, there is not even a generally agreed set of core examples of emergent phenomena on which a discussion could build (Humphreys 2015). Our aim here is not, therefore, to comprehensively review the field; we rather discuss some senses of emergence that have played a role in the debate and assess whether, in the light of our analysis, PTCP are emergent in these senses.
For Butterfield, whose view of reduction is essentially Nagelian, there is no conflict between reduction and emergence. The view that reduction and emergence are compatible is based on an understanding of emergence as there being "properties or behaviour of a system which are novel and robust relative to some appropriate comparison class" (2011a, p. 921, orig. emph.). He adds the comment that this is intended to cover the case where a system consists of parts, where the idea is that a composite system's "properties and behaviour are novel and robust compared to those of its component systems, especially its microscopic or even atomic component" (op. cit.). We agree that, thus understood, there is emergence in the large but finite systems we are studying and PTCP can be regarded as both emergent and reduced. Illustrative of this is the transfer matrix approach where maxima in response functions and the correlation length (or critical properties if d > dLC), calculated for a lattice which is infinite in d dimensions, converge towards the critical properties of a (d + 1)-dimensional system as the size in that dimension is increased. This account affords an understanding of dimensional crossover between universality classes, with the 'gradual emergence' of critical behaviour. Humphreys (2015) introduces the triplet of conceptual emergence, ontological emergence and epistemological emergence, which we now consider:
(1) We have conceptual emergence "when a reconceptualization of the objects and properties of some domain is required in order for effective representation, prediction, and explanation to take place" (op. cit. p. 762).
This is close to Butterfield's notion of reduction, and there is emergence in this sense because various notions that are not native to statistical mechanics have been introduced into the theory in order to deal with PTCP, both through inputs from thermodynamics (FTD--1, FTD--2 and FTD--3) and through the introduction of the notion of a large system at level SM2. As we have argued in Sect. 4 and in our discussion of transfer matrix methods, it is precisely in such large systems that PTCP are manifested in the form of incipient singularities.
(2) Ontological emergence amounts to the following: "A ontologically emerges from B when the totality of objects, properties, and laws present in B are insufficient to determine A" (op. cit. p. 762). As we have seen in Sect. 4, the properties of a system's micro-constituents together with the laws that govern them are sufficient to determine PTCP; in fact they can be shown to happen in finite systems. So PTCP are not ontologically emergent.
(3) Epistemological emergence is present when the limitations in our knowledge prevent us from predicting the relevant phenomenon. As Humphreys puts it, A epistemically emerges from B "when full knowledge of the domain to which B belongs is insufficient to allow a prediction of A at the time associated with B" (op. cit. p. 762). This is also the notion of emergence that Morrison appeals to when she notes that "what is truly significant about emergent phenomena is that we cannot appeal to microstructures in explaining or predicting these phenomena, even though they are constituted by them" (Morrison 2012, p. 143). 97 We submit that PTCP are not epistemically emergent because, as we have seen in Sect. 4, they in fact can be deduced and predicted from the underlying micro-theory. What is important here is that PTCP appear in finite systems.
Batterman's account of emergence (Batterman 2011) centres around the application of the renormalization group. As we have seen in Sect. 3.5.2, he (and Kadanoff) regards the use of the renormalization group as a wholly different type of approach to PTCP from which novel properties emerge, in particular the fixed points of the renormalization transformation, which allocate the universality classes. We agree with this except for two reservations: (i) Batterman takes the thermodynamic limit as an essential feature of this method. As we have indicated in Sect. 3.5.2, we do not regard this as being necessary. (ii) There is nothing automatic about setting up a renormalization group analysis of a system. It does not arise in a straightforward algorithmic way from the basic structure of statistical mechanics. Indeed physical insight is required both in the choice of the lattice scaling N → N and of the weight function. These must be compatible with the nature of the ordered state and the critical phenomena to be explored. The recurrence relationships are determined by these choices, and the fixed points 'emerge' as properties of the recurrence relationships. These in turn have exponents which give the universality classes of the various critical regions. As we have already indicated, most renormalization schemes involve some degree of approximation, with a consequent variation in fixed points and their exponents. 98 However, weight-function dependent variations can also occur even when no approximation is involved. An example of this is the one-dimensional Ising model with the scheme described in Sect. 3.4.3 with λ := 2, but with J < 0, that is, the antiferromagnetic case. In principle one expects a fixed point associated with antiferromagnetism but, although the free-energy density is correctly computed, the fixed point is missing. For this to appear, as is shown by Nelson and Fisher (1975), one needs to take λ := 3; that is, blocks of three sites.
That, in general, different fixed points and hence different universality classes emerge from different choices of lattice scaling and weight function for the same system means that this is a qualified type of emergence.
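The dependence on block size in the one-dimensional example can be made concrete with a short sketch. It assumes plain decimation (summing over the interior sites of each block) for the zero-field chain, which may differ in detail from the weight function of Sect. 3.4.3. For that scheme the exact recurrence for the reduced coupling K = J/(k_B T) with blocks of b sites is tanh K' = (tanh K)^b. The names `decimate` and `flow` are hypothetical.

```python
import math


def decimate(K, b):
    """One decimation step for the zero-field 1D Ising chain with blocks of b sites.

    Uses the exact recurrence tanh K' = (tanh K)**b.  For very large |K|,
    tanh K rounds to +/-1 in floating point and atanh would fail, so the
    sketch is intended for moderate couplings only.
    """
    return math.atanh(math.tanh(K) ** b)


def flow(K, b, steps):
    """Return the sequence K, K', K'', ... obtained by iterating decimate."""
    trajectory = [K]
    for _ in range(steps):
        K = decimate(K, b)
        trajectory.append(K)
    return trajectory


if __name__ == "__main__":
    K0 = -1.0  # antiferromagnetic coupling (J < 0)
    print("b = 2:", flow(K0, 2, 4))  # sign is lost after the first step
    print("b = 3:", flow(K0, 3, 4))  # sign is preserved along the flow
```

With b = 2 an antiferromagnetic coupling is mapped to a ferromagnetic one at the first step, so the flow never sees the region K < 0 in which an antiferromagnetic fixed point would lie. With b = 3 the sign of K is preserved, which is consistent with the need for blocks of three sites noted above.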
Finally, emergence is often characterised as the failure of reduction (Kim 1999, p. 21). That is, reduction and emergence are taken to be mutually exclusive and a property is emergent only if it fails to be reducible. PTCP are not emergent in this sense because, as we have seen above, they are reducible in the sense of a deductive reduction.

Conclusions
We have presented a picture of the way that thermodynamics and statistical mechanics coexist and collaborate within the envelope of thermal physics. We showed that the relationship between the two developments, represented by the columns in Fig. 1, depends, on the one hand, on inter-theory connecting relationships from thermodynamics to statistical mechanics, one of which, FTD--3, can, in the context of deductive reduction, be regarded as a bridge law, with the remaining two, FTD--1 and FTD--2, being consequences of FTD--3. On the other hand, from statistical mechanics to thermodynamics, there is also a sequence of 'messages' that are effectively warnings about the idealized nature of thermodynamics.
We address the problem that real systems are finite, and singular behaviour associated with PTCP can occur only in infinite systems, using finite-size scaling and a clear specification of a large system. This enables us to develop a picture of the way that PTCP in finite systems can be defined in terms of incipient singularities. Within this picture the role of the infinite system is threefold: (a) the existence of a critical region in the thermodynamic limit is a necessary condition for there to be a region of incipient singularity in the real finite system, (b) it provides one (but not the only) way to determine quantitative properties of the real system, like the values of critical exponents, and (c) it simplifies calculations. In these senses the infinite system is an indispensable, idealized approximation to the real finite system.
The usual arguments for limit reduction are based on an unwarranted conflation between a thermodynamic system with critical behaviour (TD3) and an infinite statistical mechanical system (SM3). On the other hand, the arguments for the deductive reduction of TD3 to the statistical mechanics of a large system (SM2) are valid. We also argue that PTCP are neither ontologically nor epistemologically emergent, but they are conceptually emergent. Rather less frequently remarked upon are the ways that statistical mechanics both substantiates and enriches the picture of PTCP in thermodynamics.
we are concerned not only with its dependence on the couplings near to a critical region but also with its asymptotic form for a pair of widely separated lattice sites. However, the result
$$\Gamma(\mathbf{r}; \zeta_1, \zeta_2) = \frac{f_d(|\mathbf{r}|/\xi)}{|\mathbf{r}|^{d-2+\eta}}, \tag{A.11}$$
from Ginzburg-Landau theory (see, for example, Lavis 2015, Chap. 5), in which dependence on the couplings is mediated through the correlation length, is believed to have wide applicability.

B The Ising Model
For simplicity we consider a d-dimensional hypercubic lattice $\mathcal{N}_d$ of N sites with periodic boundary conditions. At $\mathbf{r} \in \mathcal{N}_d$ there is a microsystem $\sigma(\mathbf{r})$ with values ±1 99 so that the microstate of the system is $\boldsymbol{\sigma} := \{\sigma(\mathbf{r})\}$ and the Hamiltonian is
$$\widehat{H}(\boldsymbol{\sigma}) = -J \sum_{\text{n.n.}} \sigma(\mathbf{r})\,\sigma(\mathbf{r}') - H \sum_{\mathbf{r}} \sigma(\mathbf{r}),$$
where the two sums are over all first-neighbour pairs of sites and over all sites of the lattice respectively, J being an energy parameter, so that J > 0 corresponds to ferromagnetic behaviour, where the states of all first-neighbour pairs of sites are aligned, and J < 0 corresponds to antiferromagnetic behaviour, where the states of all first-neighbour pairs are anti-aligned; 100 H is a magnetic field. It will be noted that these are the couplings introduced for the simple magnetic system in Sect. 2.4 except that there we used ε instead of J and it was assumed that ε > 0. This two-state model, which was first solved for d = 1 by Ising (1925), 101 is usually now called the spin-1/2 Ising model. 102 On the basis of this solution, and a wealth of other results for larger dimensions (including the exact zero-field solution for a square lattice by Onsager (1944)), the ferromagnetic phase diagram in the space of the temperature T and the field H is known to have the form shown in Fig. 10, where, for d = 1, Tc = 0, for d = 2, Tc = 2.2692 J and, for d = 3, Tc = 4.5108 J. Apart from in the one-dimensional case, where the critical point is at zero temperature (see Sect. 3.4.3(a)), there is a line of first-order transitions along the interval [0, Tc) of the zero-field axis across which the magnetization M changes between equal and opposite values. The universality class of the second-order transition at the critical point depends on the dimension of the system. The critical exponents for d = 2 are α = 0 (a logarithmic singularity), β = 1/8, γ = 7/4, δ = 15. For d = 3 the exponents are obtained to a high level of accuracy by series methods with α = 0.11008, β = 0.326419, γ = 1.237075 and δ = 4.78984. When d ≥ dUC = 4 the critical exponents take their classical values α = 0, β = 1/2, γ = 1 and δ = 3. 103
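The numerical values quoted above can be cross-checked with a few lines of code. This is a sketch only. It relies on two standard scaling laws that are not stated in this appendix, α + 2β + γ = 2 and γ = β(δ − 1), and on the closed form k_B Tc/J = 2/ln(1 + √2) for the square lattice that follows from Onsager's solution. The names `EXPONENTS`, `scaling_law_residuals` and `onsager_tc` are hypothetical.

```python
import math
from fractions import Fraction

# Exponent sets as quoted in the text for the spin-1/2 Ising model.
EXPONENTS = {
    "d=2": {"alpha": Fraction(0), "beta": Fraction(1, 8),
            "gamma": Fraction(7, 4), "delta": Fraction(15)},
    "d=3": {"alpha": 0.11008, "beta": 0.326419,
            "gamma": 1.237075, "delta": 4.78984},
    "classical": {"alpha": Fraction(0), "beta": Fraction(1, 2),
                  "gamma": Fraction(1), "delta": Fraction(3)},
}


def scaling_law_residuals(e):
    """Return the residuals of alpha + 2 beta + gamma = 2 and gamma = beta (delta - 1)."""
    first = e["alpha"] + 2 * e["beta"] + e["gamma"] - 2
    second = e["beta"] * (e["delta"] - 1) - e["gamma"]
    return first, second


def onsager_tc():
    """Square-lattice critical temperature in units of J/k_B: 2 / ln(1 + sqrt(2))."""
    return 2.0 / math.log(1.0 + math.sqrt(2.0))


if __name__ == "__main__":
    for label, exps in EXPONENTS.items():
        print(label, scaling_law_residuals(exps))
    print("Tc (d = 2):", onsager_tc())
```

The d = 2 and classical sets satisfy both laws exactly, and the quoted d = 3 values satisfy them to within the precision given.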