Basics of stellar modelling
Stellar models are generally calculated under a number of simplifying approximations, of varying justification. In most cases rotation and other effects causing departures from spherical symmetry are neglected and hence the star is regarded as spherically symmetric. Also, with the exception of convection, hydrodynamical instabilities are neglected, while convection is treated in a highly simplified manner. The mass of the star is assumed to be constant, so that no significant mass loss is included. In contrast to these simplifications of the ‘macrophysics’ the microphysics is included with considerable, although certainly inadequate, detail. In recent calculations effects of diffusion and settling are typically included, at least in computations of solar models. The result of these approximations is what is often called a ‘standard solar model’, although still obviously depending on the assumptions made in the details of the calculation (footnote 5). Even so, such models computed independently, with recent formulations of the microphysics, give rather similar results. In this paper I generally restrict the discussion to standard models, although discussing the effects of some of the generalizations. It might be noted that the present Sun is in fact one case where the standard assumptions may have some validity: at least the Sun rotates sufficiently slowly that direct dynamical effects of rotation are likely to be negligible. On the other hand, rotation was probably faster in the past and the loss and redistribution of angular momentum may well have led to instabilities and hence mixing affecting the present composition profile.
With the assumption of spherical symmetry the model is characterized by the distance r to the centre. Hydrostatic equilibrium requires a balance between the pressure gradient and gravity which may then be written as
$$\begin{aligned} {\mathrm{d}p \over \mathrm{d}r} = - {G m \rho \over r^2}, \end{aligned}$$
(1)
where p is pressure, \(\rho \) is density, m is the mass of the sphere contained within r, and G is the gravitational constant. Also, obviously,
$$\begin{aligned} {\mathrm{d}m \over \mathrm{d}r} = 4 \pi r^2 \rho . \end{aligned}$$
(2)
The energy equation relates the energy generation to the energy flow and the change in the internal energy of the gas:
$$\begin{aligned} {\mathrm{d}L \over \mathrm{d}r} = 4 \pi r^2 \left[ \rho \epsilon - \rho {\mathrm{d}\over \mathrm{d}t }\left( {e \over \rho }\right) + {p \over \rho }{\mathrm{d}\rho \over \mathrm{d}t }\right] ; \end{aligned}$$
(3)
here L is the energy flow through the surface of the sphere of radius r, \(\epsilon \) is the rate of nuclear energy generation (footnote 6) per unit mass and unit time, e is the internal energy per unit volume and t is time (footnote 7). The gradient of temperature T is determined by the requirements of energy transport, from the central regions where nuclear reactions take place to the surface where the energy is radiated. The temperature gradient is conventionally written in terms of \(\nabla = \mathrm{d}\ln T / \mathrm{d}\ln p\) as
$$\begin{aligned} {\mathrm{d}T \over \mathrm{d}r} = \nabla {T \over p} {\mathrm{d}p \over \mathrm{d}r}. \end{aligned}$$
(4)
The form of \(\nabla \) depends on the mode of energy transport; for radiative transport in the diffusion approximation
$$\begin{aligned} \nabla = \nabla _{\mathrm{rad}}\equiv {3 \over 16 \pi a {\tilde{c}} G} {\kappa p \over T^4}{L(r) \over m (r)}, \end{aligned}$$
(5)
where \(\kappa \) is the opacity, a is the radiation energy density constant and \({\tilde{c}}\) is the speed of light. Finally, we need to consider the rate of change of the composition, which controls stellar evolution. In a main-sequence star such as the Sun the dominant effect is the burning of hydrogen; however, we must also take into account the changes in composition resulting from diffusion and settling. The rate of change of the abundance \(X_i\) by mass of element i is therefore given by
$$\begin{aligned} {\partial X_i \over \partial t} = {{\mathcal {R}}}_i + {1 \over r^2 \rho } {\partial \over \partial r} \left[ r^2 \rho \left( D_i {\partial X_i \over \partial r} + V_i X_i \right) \right] , \end{aligned}$$
(6)
where \({{\mathcal {R}}}_i\) is the rate of change resulting from nuclear reactions, \(D_i\) is the diffusion coefficient and \(V_i\) is the settling velocity.
To these basic equations we must add the treatment of the microphysics. This is discussed in Sect. 2.3 below.
I have so far ignored the convective instability. This sets in if the density decreases more slowly with position than for an adiabatic change, i.e.,
$$\begin{aligned} {\mathrm{d}\ln \rho \over \mathrm{d}\ln p} < {1 \over \varGamma _1}, \end{aligned}$$
(7)
where \(\varGamma _1 = (\partial \ln p / \partial \ln \rho )_{\mathrm{ad}}\), the derivative being taken for an adiabatic change. In stellar modelling this condition is often replaced by
$$\begin{aligned} {\mathrm{d}\ln T \over \mathrm{d}\ln p} \equiv \nabla > \nabla _{\mathrm{ad}} \equiv \left( \mathrm{d}\ln T \over \mathrm{d}\ln p \right) _{\mathrm{ad}}, \end{aligned}$$
(8)
which is equivalent in the case of a uniform composition (footnote 8). Thus a layer is convectively unstable if the radiative gradient \(\nabla _{\mathrm{rad}}\) (cf. Eq. 5) exceeds \(\nabla _{\mathrm{ad}}\). In this case convective motion sets in, with hotter gas rising and cooler gas sinking, both contributing to the energy transport towards the surface. The structure of the convective flow should clearly be such that the combined radiative and convective energy transport at any point in the convection zone matches the luminosity. The conditions in stellar interiors are such that complex, possibly turbulent, flows are expected over a broad range of scales (e.g., Schumacher and Sreenivasan 2020). Also, the convective flux at a given location obviously represents conditions over a range of positions in the star, sampled by a moving convective eddy, so that convective transport is intrinsically non-local. As a related issue, motion is inevitably induced outside the immediate unstable region, also potentially affecting the energy transport and structure, although this is often ignored. However, in computations of stellar evolution these complexities are almost always reduced to a grossly simplified local description which allows the computation of the average temperature gradient in terms of local conditions, as
$$\begin{aligned} \nabla = \nabla _{\mathrm{conv}}(\rho , T, L, \ldots ), \end{aligned}$$
(9)
applied in regions of convective instability (see Sect. 2.5).
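As a rough illustration of how the instability criterion is applied in practice, the following Python sketch evaluates the radiative gradient of Eq. (5) and compares it with the adiabatic gradient. The function names, the physical constants and the sample point are assumptions made for this example; the default \(\nabla _{\mathrm{ad}} = 2/5\) is the fully ionized ideal-gas value, and the test in this form presumes a uniform composition.

```python
import math

# CGS constants; standard values, not taken from the text.
G = 6.67408e-8           # gravitational constant [cm^3 g^-1 s^-2]
A_RAD = 7.5657e-15       # radiation energy density constant a [erg cm^-3 K^-4]
C_LIGHT = 2.99792458e10  # speed of light [cm s^-1]


def nabla_rad(kappa, p, T, L, m):
    """Radiative gradient of Eq. (5) from the local opacity, pressure,
    temperature, luminosity L(r) and enclosed mass m(r), all in CGS units."""
    return 3.0 * kappa * p * L / (16.0 * math.pi * A_RAD * C_LIGHT * G * T**4 * m)


def is_convectively_unstable(kappa, p, T, L, m, nabla_ad=0.4):
    """Return True where nabla_rad exceeds nabla_ad (uniform composition assumed)."""
    return nabla_rad(kappa, p, T, L, m) > nabla_ad


# Hypothetical point loosely resembling conditions near the base of the
# solar convection zone; the numbers are illustrative only.
point = dict(p=5.6e13, T=2.2e6, L=3.846e33, m=1.94e33)
for kappa in (10.0, 20.0, 40.0):
    print(kappa, nabla_rad(kappa, **point), is_convectively_unstable(kappa, **point))
```

Since \(\nabla _{\mathrm{rad}}\) is proportional to \(\kappa \), the loop shows directly how an increase in opacity can push a layer across the stability boundary.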
The equations are supplemented by boundary conditions. The centre, which is a regular singular point, can be treated through a series expansion in r. For example, it follows from Eq. (2) for the mass and Eq. (1) of hydrostatic support that
$$\begin{aligned} m = {4 \over 3} \pi \rho _{\mathrm{c}} r^3 + \cdots , \quad p = p_{\mathrm{c}} - {2 \over 3} \pi G \rho _{\mathrm{c}}^2 r^2 + \cdots , \end{aligned}$$
(10)
where \(\rho _{\mathrm{c}}\) and \(p_{\mathrm{c}}\) are the central density and pressure. A discussion of the expansions to second significant order in r, and techniques for incorporating them in the central boundary conditions, was given by Christensen-Dalsgaard (1982). At the surface, the model must include the stellar atmosphere. Since this requires a more complex description of radiative transfer than provided by the diffusion approximation (Eq. 5), separately calculated detailed atmospheric models are often matched to the interior solution, thus effectively providing the surface boundary condition. Simpler alternatives are discussed in Sect. 2.4.
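The leading terms of the central expansion can be checked numerically. Inserting \(m = (4/3) \pi \rho _{\mathrm{c}} r^3\) from Eq. (2) into Eq. (1) and integrating gives a pressure deficit \((2/3) \pi G \rho _{\mathrm{c}}^2 r^2\). The sketch below, a minimal illustration with hypothetical function names and representative (assumed) central values, confirms that these terms satisfy hydrostatic equilibrium.

```python
import math

G = 6.67232e-8  # gravitational constant adopted in the text [cm^3 g^-1 s^-2]


def central_expansion(r, rho_c, p_c):
    """Leading terms of m(r) and p(r) near the centre (CGS units)."""
    m = 4.0 / 3.0 * math.pi * rho_c * r**3
    p = p_c - 2.0 / 3.0 * math.pi * G * rho_c**2 * r**2
    return m, p


def hydrostatic_residual(r, rho_c, p_c, h=1.0e-4):
    """Relative mismatch between a central-difference dp/dr and -G m rho / r^2."""
    _, p_plus = central_expansion(r * (1.0 + h), rho_c, p_c)
    _, p_minus = central_expansion(r * (1.0 - h), rho_c, p_c)
    dp_dr = (p_plus - p_minus) / (2.0 * h * r)
    m, _ = central_expansion(r, rho_c, p_c)
    rhs = -G * m * rho_c / r**2
    return abs(dp_dr - rhs) / abs(rhs)


# Representative central values, assumed for illustration only.
print(hydrostatic_residual(r=1.0e8, rho_c=150.0, p_c=2.4e17))
```

The residual is limited only by rounding, since the central difference of a quadratic is exact; in a real model the neglected higher-order terms set the radius out to which the expansion may be used.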
The equations and boundary conditions are most often solved using finite-difference methods, by what in the stellar-evolution community is known as the Henyey technique (e.g., Henyey et al. 1959, 1964; see footnote 9). This was discussed in some detail by Clayton (1968) and Kippenhahn et al. (2012). The presence of the time dependence, in the energy equation and the description of the composition evolution, is an additional complication. The detailed implementation in the Aarhus STellar Evolution Code (ASTEC), used in the following to compute examples of solar models, was described by Christensen-Dalsgaard (2008).
An important issue is the question of numerical accuracy, in the sense of providing an accurate solution to the problem, given the assumptions about micro- and macrophysics. It is evident that the accuracy must be substantially higher than the effects of, for example, those potential errors in the physics which are investigated through comparisons between the models and observations. Ab initio analyses of the computational errors are unlikely to be useful, given the complexity of the equations. As discussed in the Appendix, computations with differing spatial and temporal resolution provide estimates of the intrinsic precision of the calculation. Additional tests, which may also uncover errors in programming, are provided by comparisons between independently computed models, with carefully controlled identical physics (e.g., Gabriel 1991; Christensen-Dalsgaard and Reiter 1995; Lebreton et al. 2008; Monteiro 2008).
Basic properties of the Sun
The Sun is unique amongst stars in that its global parameters can be determined with high precision. From planetary motion the product \(G M_\odot \) of the gravitational constant and the solar mass is known with very high accuracy, as \(1.32712438 \times 10^{26} \,\mathrm{cm}^3 \,\mathrm{s}^{-2}\). Even though G is the least precisely determined of the fundamental constants this still allows the solar mass to be determined with a precision far exceeding the precision of the determination of other stellar masses. The 2014 recommendations of CODATA (footnote 10; Mohr et al. 2016) give a value \(G = (6.67408 \pm 0.00031) \times 10^{-8}\,\,\mathrm{cm}^3\,\,\mathrm{g}^{-1}\,\,\mathrm{s}^{-2}\), corresponding to \(\,M_\odot = 1.98848 \times 10^{33} \,\mathrm{g}\). However, the solar mass has traditionally been taken to be \(\,M_\odot = 1.989 \times 10^{33} \,\mathrm{g}\), corresponding to \(G = 6.672320 \times 10^{-8}\,\,\mathrm{cm}^3\,\,\mathrm{g}^{-1}\,\,\mathrm{s}^{-2}\); in the calculations reported in the present paper I use the latter values of \(\,M_\odot \) and G, even though these are not entirely consistent with the CODATA 2014 recommendations. I note that Christensen-Dalsgaard et al. (2005) found that variations to G and \(\,M_\odot \), keeping their product fixed, had very small effects on the resulting solar models.
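The arithmetic linking these values is simple enough to verify directly. The following sketch (the variable and function names are chosen for this example) divides the quoted product \(G M_\odot \) by each choice of G or \(M_\odot \), and propagates the CODATA uncertainty in G to the mass.

```python
GM_SUN = 1.32712438e26  # G * M_sun from planetary motion [cm^3 s^-2], as quoted


def solar_mass(G):
    """Solar mass [g] implied by a given gravitational constant [cm^3 g^-1 s^-2]."""
    return GM_SUN / G


G_CODATA, SIGMA_G = 6.67408e-8, 0.00031e-8

m_codata = solar_mass(G_CODATA)
# To first order the relative error of M_sun equals that of G.
sigma_m = m_codata * SIGMA_G / G_CODATA
g_traditional = GM_SUN / 1.989e33  # G consistent with the traditional mass

print(f"M_sun (CODATA 2014 G) = {m_codata:.5e} +/- {sigma_m:.1e} g")
print(f"G for M_sun = 1.989e33 g: {g_traditional:.6e} cm^3 g^-1 s^-2")
```

The uncertainty in the mass is thus entirely inherited from G, which is why the product rather than either factor is the natural quantity to hold fixed.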
The angular diameter of the Sun can be determined with very substantial precision, although the level in the solar atmosphere to which the value refers obviously has to be carefully specified. From such measurements, and the known mean distance between the Earth and the Sun, the solar photospheric radius, referring to the point where the temperature equals the effective temperature, has been determined as \((6.95508 \pm 0.00026) \times 10^{10} \,\mathrm{cm}\) by Brown and Christensen-Dalsgaard (1998); this was adopted by Cox (2000). Haberreiter et al. (2008) obtained the value \((6.95658 \pm 0.00014) \times 10^{10} \,\mathrm{cm}\), which within errors is consistent with the value of Brown and Christensen-Dalsgaard (1998). However, most solar modelling has used the older value \(\,R_\odot = 6.9599 \times 10^{10} \,\mathrm{cm}\) (Auwers 1891), as quoted, for example, by Allen (1973); thus, for most of the models presented here I use this value.
From bolometric measurements of the solar ‘constant’ from space the total solar luminosity can be determined, given the Sun-Earth distance, if it is assumed that the solar flux is independent of latitude; although no evidence has been found to question this assumption, it is perhaps of some concern that measurements of the solar irradiance have only been made close to the ecliptic plane. An additional complication is provided by the variation in solar irradiance with phase in the solar cycle of around 0.1%, peak to peak (for a review, see Fröhlich and Lean 2004); since the cause of this variation is uncertain it is difficult to estimate the appropriate luminosity corresponding to equilibrium conditions. The value \(\,L_\odot = 3.846 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\) (obtained from the average irradiance quoted by Willson 1997) has often been used and will generally be applied here. However, Kopp et al. (2016) recently obtained a revised irradiance, as an average over solar cycle 23, leading to \(\,L_\odot = 3.828 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\).
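The conversion from irradiance to luminosity under the assumption of an isotropic flux can be sketched as follows. The irradiance of 1361 W m\(^{-2}\) used below is the IAU nominal total solar irradiance and the astronomical unit is the IAU defined value; neither number is quoted in the text above, and the function names are hypothetical.

```python
import math

AU_CM = 1.495978707e13  # astronomical unit [cm], IAU defined value (assumption)


def luminosity_from_irradiance(S, d=AU_CM):
    """Luminosity [erg/s] from irradiance S [erg s^-1 cm^-2] at distance d [cm],
    assuming the flux is independent of latitude."""
    return 4.0 * math.pi * d**2 * S


def irradiance_from_luminosity(L, d=AU_CM):
    """Inverse relation: irradiance [erg s^-1 cm^-2] implied by a luminosity."""
    return L / (4.0 * math.pi * d**2)


S_NOMINAL = 1361.0 * 1.0e3  # 1361 W m^-2 expressed in erg s^-1 cm^-2
L_nominal = luminosity_from_irradiance(S_NOMINAL)
print(f"L = {L_nominal:.4e} erg/s")

# Irradiance implied by the older luminosity value quoted in the text, in W m^-2.
print(irradiance_from_luminosity(3.846e33) / 1.0e3)
```

Because the relation is linear, the 0.1% peak-to-peak variation over the solar cycle translates into the same fractional ambiguity in the inferred luminosity.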
The solar radius and luminosity are often used as units in characterizing other stars, although with some uncertainty about the precise values that are used. In 2015 this led to Resolution B3 of the International Astronomical Union (footnote 11; see Mamajek et al. 2015; Prša et al. 2016), defining the nominal solar radius \({{\mathcal {R}}}_\odot ^N = 6.957 \times 10^8 \,\mathrm{m}\), suitably rounded from the value obtained by Haberreiter et al. (2008), and the nominal solar luminosity \({{\mathcal {L}}}_\odot ^N = 3.828 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\) from Kopp et al. (2016).
The solar age \(t_\odot \) can be estimated from radioactive dating of meteorites combined with a model of the evolution of the solar system, relating the formation of the meteorites to the arrival of the Sun on the main sequence. Detailed discussions of meteoritic dating were provided by Wasserburg, in Bahcall and Pinsonneault (1995), and by Connelly et al. (2012). Wasserburg found \(t_\odot = (4.570 \pm 0.006) \times 10^9\) years, with very similar although more accurate values obtained by Connelly et al. Uncertainties in the modelling of the early solar system obviously affect how this relates to solar age. For simplicity, in the following I identify this age with the time since the arrival of the Sun on the main sequence (footnote 12). Despite the remaining uncertainty this still provides an independent measure of a stellar age of far better accuracy than is available for any other star.
The solar surface abundance can be determined from spectroscopic analysis (for reviews, see Asplund 2005; Asplund et al. 2009). Additional information about the primordial composition of the solar system, and hence likely the Sun, is obtained from analysis of meteorites. A major difficulty is the lack of a reliable determination from spectroscopy of the solar helium abundance. Lines of helium, an element then not known from the laboratory, were first detected in the solar spectrum (footnote 13); however, these lines are formed under rather uncertain, and very complex, conditions in the upper solar atmosphere, making an accurate abundance determination from the observed line strengths infeasible; the same is true of other noble gases, with neon being a particularly important example. For those elements with lines formed in deeper parts of the atmosphere the spectroscopic analysis yields reasonably precise abundance determinations (e.g., Allende Prieto 2016); however, given that the helium abundance is unknown these are only relative, typically specified as a fraction of the hydrogen abundance. Detailed analyses were provided by Anders and Grevesse (1989) and Grevesse and Noels (1993), the latter leading to a commonly used present ratio \(Z_{\mathrm{s}}/X_{\mathrm{s}}= 0.0245\) between the surface abundances \(X_{\mathrm{s}}\) and \(Z_{\mathrm{s}}\) by mass of hydrogen and elements heavier than helium, respectively. Also, for most refractory elements there is good agreement between the solar abundances and those inferred from primitive meteorites. A striking exception is the abundance of lithium which has been reduced in the solar photosphere by a factor of around 150, relative to the meteoritic abundance (Asplund et al. 2009).
This is presumably the result of lithium destruction by nuclear reaction, which would take place to the observed extent over the solar lifetime at a temperature of around \(2.5 \times 10^6 \,\mathrm{K}\), indicating that matter currently at the solar surface has been mixed down to this temperature. On the other hand, the abundance of beryllium, which would be destroyed at temperatures above around \(3.5 \times 10^6 \,\mathrm{K}\), has apparently not been significantly reduced relative to the primordial value (Balachandran and Bell 1998; Asplund 2004), so that significant mixing has not reached this temperature. These abundance determinations obviously provide interesting constraints on mixing processes in the solar interior during solar evolution (see Sect. 5.3).
Since 2000 major revisions of solar abundance determinations have been carried out, through the use of three-dimensional (3D) hydrodynamical simulations of the solar atmosphere (Nordlund et al. 2009, see also Sect. 2.5). This resulted in a substantial decrease in the inferred abundances of, in particular, oxygen, carbon and nitrogen (for a summary, see Asplund et al. 2009), resulting in \(Z_{\mathrm{s}}/X_{\mathrm{s}}= 0.0181\). The resulting decrease in the opacity in the radiative interior has substantial consequences for solar models and their comparison with helioseismic results; I return to this in Sect. 6.
Observations of the solar surface show that the Sun is rotating differentially, with an angular velocity that is highest at the equator. This was evident already quite early from measurements of the apparent motion of sunspots across the solar disk (Carrington 1863), and has been observed also in the Doppler velocity of the solar atmosphere. In an analysis of an extended series of Doppler measurements, Ulrich et al. (1988) obtained the surface angular velocity \(\varOmega \) as
$$\begin{aligned} {\varOmega \over 2 \pi } = (451.5 - 65.3 \cos ^2 \theta - 66.7 \cos ^4 \theta ) \, \mathrm{nHz} \end{aligned}$$
(11)
as a function of co-latitude \(\theta \), corresponding to rotation periods of 25.6 d at the equator and 31.7 d at a latitude of \(60^\circ \).
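The quoted rotation periods follow from Eq. (11) by inverting the rotation frequency. In the short sketch below the function names are assumptions, and the leading coefficient is taken as 451.5 nHz, the value that reproduces the periods of 25.6 d and 31.7 d given above.

```python
import math

SECONDS_PER_DAY = 86400.0


def rotation_frequency_nhz(colatitude_deg, a=451.5, b=65.3, c=66.7):
    """Surface rotation frequency Omega / 2 pi [nHz] as a function of co-latitude."""
    cos2 = math.cos(math.radians(colatitude_deg)) ** 2
    return a - b * cos2 - c * cos2**2


def rotation_period_days(colatitude_deg, **coefficients):
    """Rotation period [d] corresponding to the frequency above."""
    frequency_hz = rotation_frequency_nhz(colatitude_deg, **coefficients) * 1.0e-9
    return 1.0 / frequency_hz / SECONDS_PER_DAY


# Co-latitude 90 deg is the equator; latitude 60 deg is co-latitude 30 deg.
for colatitude in (90.0, 30.0):
    print(colatitude, round(rotation_period_days(colatitude), 1))
```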
As discussed in Sect. 5.1, helioseismology has provided very detailed information about the properties of the solar interior. Here I note that the depth of the solar convection zone has been determined as 0.287R, with errors as small as 0.001R (e.g., Christensen-Dalsgaard et al. 1991; Basu and Antia 1997). Also, the effect of helium ionization on the sound speed in the outer parts of the solar convection zone allows a determination of the solar envelope helium abundance \(Y_{\mathrm{s}}\), although with some sensitivity to the equation of state; the results are close to \(Y_{\mathrm{s}} = 0.25\) (e.g., Vorontsov et al. 1991; Basu 1998).
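Combining the helioseismic helium abundance with a spectroscopic ratio \(Z_{\mathrm{s}}/X_{\mathrm{s}}\) fixes the absolute surface composition through the normalization \(X + Y + Z = 1\). The following sketch makes this explicit; the function name is hypothetical, and \(Y_{\mathrm{s}} = 0.25\) is used as the approximate value quoted above rather than a precise determination.

```python
def surface_composition(y_s, z_over_x):
    """Return (X_s, Z_s) from the helium abundance and the ratio Z_s / X_s,
    using X + Y + Z = 1, i.e. X (1 + Z/X) = 1 - Y."""
    x_s = (1.0 - y_s) / (1.0 + z_over_x)
    z_s = z_over_x * x_s
    return x_s, z_s


Y_S = 0.25  # approximate helioseismic envelope helium abundance
for label, ratio in (("Grevesse and Noels (1993)", 0.0245),
                     ("Asplund et al. (2009)", 0.0181)):
    x_s, z_s = surface_composition(Y_S, ratio)
    print(f"{label}: X_s = {x_s:.4f}, Z_s = {z_s:.4f}")
```

The revision of \(Z_{\mathrm{s}}/X_{\mathrm{s}}\) from 0.0245 to 0.0181 thus corresponds to a reduction of roughly a quarter in the surface heavy-element abundance at fixed \(Y_{\mathrm{s}}\).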
Microphysics
Within the framework of ‘standard solar models’ most of the complexity in the calculation lies in the determination of the microphysics, and hence very considerable effort has gone into calculations of the relevant physics. In comparing the resulting models with observations, particularly helioseismic inferences, to test the validity of these physical results one must, however, obviously keep in mind potential errors in the approximations defining the standard models.
In this section I provide a relatively brief discussion of the various formulations that have been used for the physics. To illustrate some of the effects comparisons are made based on the structure of the present Sun discussed in more detail in Sect. 4 below. A detailed discussion of the physics of stellar interiors was provided by Cox and Giuli (1968) and updated by Weiss et al. (2004); for a concise review of the treatment of the equation of state and opacity, see Däppen and Guzik (2000).
Equation of state
The thermodynamic properties of stellar matter, defined by the equation of state, play a crucial role in stellar modelling. This directly involves the relation between pressure, density, temperature and composition. In addition, the adiabatic compressibility \(\varGamma _1\) affects the adiabatic sound speed (cf. Eq. 55) and hence the oscillation frequencies of the star, whereas other thermodynamic derivatives are important in the treatment of convective energy transport.
The treatment of the equation of state involves the determination of all relevant thermodynamic quantities, for example defined as functions of \((\rho , T,\{X_i\})\), where \(X_i\) are the abundances of the relevant elements; the composition is often characterized by the abundances X, Y and Z by mass of hydrogen, helium and heavier elements with, obviously, \(X + Y + Z = 1\). This should take into account the interaction between the different constituents of the gas, including partial ionization. Also, pressure and internal energy from radiation must be included, although they play a comparatively minor role in the Sun. An important constraint on the treatment is that it be thermodynamically consistent such that all thermodynamic relations are satisfied between the computed quantities (e.g., Däppen 1993). Thus it would not, for example, be consistent to add the contribution of Coulomb effects to pressure and internal energy without making corresponding corrections to other quantities, including the thermodynamical potentials that control the ionization.
A particular problem concerns ionization in the solar core. As pointed out by, e.g., Christensen-Dalsgaard and Däppen (1992) straightforward application of the Saha equation would predict a substantial degree of recombination of hydrogen at the centre of the Sun, yet the volume available to each hydrogen nucleus does not allow this. In fact, ionization must be largely controlled by interactions between the constituents of the gas, not included in the Saha equation, and often somewhat misleadingly denoted pressure ionization. These effects are taken into account in formulations of the equation of state at various levels of detail, generally showing that ionization is almost complete in the solar core. The simplest approach, which is certainly not thermodynamically consistent, is to enforce full ionization above a certain density or pressure.
A simple approximation to the solar equation of state is that of a fully ionized ideal gas, according to which
$$\begin{aligned} p \simeq {k_{\mathrm{B}}\rho T \over \mu m_{\mathrm{u}}}, \quad \nabla _{\mathrm{ad}}\simeq 2/5, \quad \varGamma _1 \simeq 5/3; \end{aligned}$$
(12)
here \(k_{\mathrm{B}}\) is Boltzmann’s constant, \(m_{\mathrm{u}}\) is the atomic mass unit and \(\mu \) is the mean molecular weight which can be approximated by
$$\begin{aligned} \mu = {4 \over 3 + 5 X - Z}. \end{aligned}$$
(13)
However, departures from this simple relation must obviously be taken into account in solar modelling. The most important of these is partial ionization, particularly relatively near the surface where hydrogen and helium ionize. Figure 1 shows the fractional ionization in a model of the present Sun. As discussed in Sect. 5.1.2 the effects of the ionization of helium on \(\varGamma _1\) provide a strong diagnostic of the solar envelope helium abundance.
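For orientation, the fully ionized ideal-gas approximation of Eqs. (12) and (13) is easily evaluated. The sketch below uses hypothetical function names and representative central values (density, temperature and composition) that are assumptions for illustration, not results of a model calculation.

```python
K_B = 1.380649e-16    # Boltzmann's constant [erg K^-1]
M_U = 1.66053907e-24  # atomic mass unit [g]


def mean_molecular_weight(X, Z):
    """Mean molecular weight of a fully ionized gas, Eq. (13)."""
    return 4.0 / (3.0 + 5.0 * X - Z)


def ideal_gas_pressure(rho, T, X, Z):
    """Pressure [dyn cm^-2] of a fully ionized ideal gas, Eq. (12)."""
    return K_B * rho * T / (mean_molecular_weight(X, Z) * M_U)


# Limiting cases: pure ionized hydrogen and pure ionized helium.
print(mean_molecular_weight(1.0, 0.0), mean_molecular_weight(0.0, 0.0))

# Representative (assumed) conditions near the solar centre.
print(f"{ideal_gas_pressure(rho=150.0, T=1.57e7, X=0.34, Z=0.02):.3e}")
```

The limiting values 1/2 and 4/3 provide a quick check on Eq. (13): ionized hydrogen contributes two particles per atomic mass unit, ionized helium three particles per four.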
Other effects are smaller but highly significant, particularly given the high precision with which the solar interior can be probed with helioseismology. Radiation pressure, \(p_{\mathrm{rad}} = 1/3 a T^4\), and other effects of radiation are small but not entirely negligible. Coulomb interactions between particles in the gas need to be taken into account; a measure of their importance is given by
$$\begin{aligned} \varGamma _{\mathrm{e}} = {e^2 \over d_{\mathrm{e}} k_{\mathrm{B}}T }, \quad \hbox {with} \quad d_{\mathrm{e}} = \left( {3 \over 4 \pi n_{\mathrm{e}}} \right) ^{1/3}, \end{aligned}$$
(14)
which determines the ratio between the average Coulomb and thermal energy of an electron; here e is the charge of an electron, and \(d_{\mathrm{e}}\) is the average distance between the electrons, \(n_{\mathrm{e}}\) being the electron density per unit volume. Also, in the core effects of partial electron degeneracy must be included; the importance of degeneracy is measured by
$$\begin{aligned} \zeta _{\mathrm{e}} = \lambda _{\mathrm{e}}^3 n_{\mathrm{e}} = {4 \over \sqrt{\pi }} F_{1/2} (\psi ) \simeq 2 e^\psi , \end{aligned}$$
(15)
where
$$\begin{aligned} \lambda _{\mathrm{e}} = {h \over (2 \pi m_{\mathrm{e}} k_{\mathrm{B}}T)^{1/2}} \end{aligned}$$
(16)
is the de Broglie wavelength of an electron, h being Planck’s constant and \(m_{\mathrm{e}}\) the mass of an electron. In Eq. (15) \(\psi \) is the electron degeneracy parameter and \(F_\nu (\psi )\) is the Fermi integral,
$$\begin{aligned} F_\nu (y) = \int _0^\infty {x^\nu \over 1 + \exp (x-y)} \mathrm{d}x. \end{aligned}$$
(17)
The last approximation in Eq. (15) is valid for small degeneracy, \(\psi \ll -1\); in this case the correction to the electron pressure \(p_{\mathrm{e}}\), relative to the value for an ideal non-degenerate electron gas, is
$$\begin{aligned} {p_{\mathrm{e}} \over n_{\mathrm{e}} k_{\mathrm{B}}T} -1 \simeq 2^{-5/2} e^\psi \simeq 2^{-7/2} \zeta _{\mathrm{e}} \end{aligned}$$
(18)
(see also Chandrasekhar 1939). Finally, the mean thermal energy of an electron is not negligible compared with the rest-mass energy of the electron near the solar centre, so relativistic effects should be taken into account; their importance is measured by
$$\begin{aligned} x_{\mathrm{e}} = {k_{\mathrm{B}}T \over m_{\mathrm{e}} {\tilde{c}}^2}; \end{aligned}$$
(19)
at the centre of the present Sun \(x_{\mathrm{e}} \simeq 0.0026\). As an important example, the relativistic effects cause a change
$$\begin{aligned} {\delta \varGamma _1 \over \varGamma _1} \simeq - {2 + 2X \over 3 +5 X} x_{\mathrm{e}} \end{aligned}$$
(20)
in \(\varGamma _1\), which is readily detectable from helioseismic analyses (Elliott and Kosovichev 1998).
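The three measures introduced above, \(\varGamma _{\mathrm{e}}\) of Eq. (14), \(\zeta _{\mathrm{e}}\) of Eq. (15) and \(x_{\mathrm{e}}\) of Eq. (19), can be estimated with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the electron density assumes full ionization with \(n_{\mathrm{e}} = \rho (1 + X)/(2 m_{\mathrm{u}})\), the central density, temperature and hydrogen abundance are representative values, and the weak-degeneracy expression of Eq. (18) is only an order-of-magnitude guide where \(\zeta _{\mathrm{e}}\) is not small.

```python
import math

# CGS (Gaussian) constants; standard values, not taken from the text.
K_B = 1.380649e-16         # Boltzmann's constant [erg K^-1]
M_U = 1.66053907e-24       # atomic mass unit [g]
M_E = 9.1093837e-28        # electron mass [g]
H_PLANCK = 6.62607015e-27  # Planck's constant [erg s]
C_LIGHT = 2.99792458e10    # speed of light [cm s^-1]
E_CHARGE = 4.80320471e-10  # electron charge [esu]


def electron_density(rho, X):
    """Electron number density [cm^-3] assuming full ionization."""
    return rho * (1.0 + X) / (2.0 * M_U)


def coulomb_parameter(n_e, T):
    """Gamma_e of Eq. (14): Coulomb energy over thermal energy."""
    d_e = (3.0 / (4.0 * math.pi * n_e)) ** (1.0 / 3.0)
    return E_CHARGE**2 / (d_e * K_B * T)


def degeneracy_parameter(n_e, T):
    """zeta_e = lambda_e^3 n_e of Eqs. (15) and (16)."""
    lambda_e = H_PLANCK / math.sqrt(2.0 * math.pi * M_E * K_B * T)
    return lambda_e**3 * n_e


def degeneracy_pressure_correction(zeta_e):
    """Relative electron-pressure correction of Eq. (18), weak degeneracy."""
    return 2.0 ** -3.5 * zeta_e


def relativistic_parameter(T):
    """x_e of Eq. (19): thermal energy over electron rest-mass energy."""
    return K_B * T / (M_E * C_LIGHT**2)


def delta_gamma1_relativistic(T, X):
    """Relative change in Gamma_1 from Eq. (20)."""
    return -(2.0 + 2.0 * X) / (3.0 + 5.0 * X) * relativistic_parameter(T)


# Representative (assumed) conditions at the centre of the present Sun.
rho_c, T_c, X_c = 150.0, 1.57e7, 0.34
n_e = electron_density(rho_c, X_c)
print("Gamma_e       =", coulomb_parameter(n_e, T_c))
print("zeta_e        =", degeneracy_parameter(n_e, T_c))
print("x_e           =", relativistic_parameter(T_c))
print("dGamma1/Gamma1 =", delta_gamma1_relativistic(T_c, X_c))
```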
The magnitudes of these departures from a simple ideal gas are summarized in Fig. 2, for a standard solar model. Given the precision of helioseismic inferences, none of the effects can be ignored. Coulomb effects are relatively substantial throughout the model, although peaking near the surface. Inclusion of these effects, in the so-called MHD equation of state (see below), was shown by Christensen-Dalsgaard et al. (1988) to lead to a substantial improvement in the agreement between the observed and computed frequencies. Electron degeneracy has a significant effect in the core of the model while, as already noted, relativistic effects for the electrons have been detected in helioseismic inversion (Elliott and Kosovichev 1998).
The computation of the equation of state has been reviewed by Däppen (1993, 2004, 2007, 2010), Christensen-Dalsgaard and Däppen (1992), and Baturin et al. (2013). Extensive discussions of issues related to the equation of state in astrophysical systems were provided by Čelebonović et al. (2004). The procedures can be divided into what has been called the chemical picture and the physical picture. In the former, the gas is treated as a mixture of different components (molecules, atoms, ions, nuclei and electrons) each contributing to the thermodynamical quantities. Approximations to the contributions from these components are used to determine the free energy of the system, and the equilibrium state is determined by minimizing the free energy at given temperature and density, say, under the relevant stoichiometric constraints. The level of complexity and, one may hope, realism of the formulation depends on the treatment of the different contributions to the free energy. In the physical picture, the basic constituents are taken to be nuclei and electrons, and the state of the gas, including the formation of ions and atoms, derives from the interaction between these constituents. In practice, this is dealt with in terms of activity expansions (Rogers 1981), the level of complexity depending on the number of terms included.
A simple form of the chemical picture is the so-called EFF equation of state (Eggleton et al. 1973). This treats ionization with the basic Saha equation, although adding a contribution to the free energy which ensures full ionization at high electron densities. Partial degeneracy and relativistic effects are covered with an approximate expansion. Because of its simplicity it can be included directly in a stellar evolution code and hence it has found fairly widespread use; however, it is certainly not sufficiently accurate to be used for computation of realistic solar models. An extension of this treatment, the CEFF equation of state including in addition Coulomb effects treated in the Debye–Hückel approximation, was introduced by Christensen-Dalsgaard and Däppen (1992). A comprehensive equation of state based on the chemical treatment has been provided in the so-called MHD (footnote 14) equation of state (Mihalas et al. 1988, 1990; Däppen et al. 1988; Nayfonov et al. 1999). This includes a probabilistic treatment of the occupation of states in atoms and ions (Hummer and Mihalas 1988), based on the perturbations caused by surrounding neutral and charged constituents of the gas, and including excluded-volume effects. Also, Coulomb effects and effects of partial degeneracy are taken into account. The MHD treatment and other physically realistic equations of state are too complex (so far) to be included directly into stellar evolution codes. Instead, they are used to set up tables which are then interpolated to obtain the quantities required in the evolution calculation. Thus both the table properties and the interpolation procedures become important for the accuracy of the representation of the physics. Issues of interpolation were addressed by Baturin et al. (2019).
The physical treatment of the equation of state, for realistic stellar mixtures, has been developed by the OPAL group at the Lawrence Livermore National Laboratory, in what they call the ACTEX equation of state (for ACTivity EXpansion), in connection with the calculation of opacities. For this purpose it has obviously been necessary to extend the treatment to include also a determination of atomic energy levels and their perturbations from the surrounding medium. The result is often referred to as the OPAL tables. Extensive tables, in the following OPAL 1996, were initially provided by Rogers et al. (1996), with later updates presented by Rogers and Nayfonov (2002).
Interestingly, relativistic effects were ignored in the original formulations of both the MHD and the OPAL tables, while they were included, in approximate form, in the simple formulation of Eggleton et al. (1973). Following the realization by Elliott and Kosovichev (1998), based on helioseismology, that this was inadequate, updated tables taking these effects into account have been produced by Gong et al. (2001b) and Rogers and Nayfonov (2002). The latter tables, with additional updates, are known as the OPAL 2005 equation-of-state tables (footnote 15) and are seeing widespread use.
To illustrate the effects of using the different formulations, Figs. 3 and 4 (footnote 16) show relative differences in p and \(\varGamma _1\) for various equations of state at the conditions in a model of the present Sun, using the OPAL 1996 equation of state as reference. It is clear that the inclusion of Coulomb effects in CEFF captures a substantial part of the inadequacies of the simple EFF formulation, although the remaining differences are certainly very significant. In the bottom panel of Fig. 4 it should be noticed that the MHD and OPAL 1996 formulations share the lack of proper treatment of relativistic effects and hence have very similar behaviour of \(\varGamma _1\) at the highest temperatures. This is corrected in both CEFF and OPAL 2005 which therefore show very similar departures from OPAL 1996 at high temperature. A detailed comparison between the MHD and OPAL formulations was carried out by Trampedach et al. (2006).
Further developments of the MHD equation of state have been undertaken to emulate aspects of the OPAL equation of state in a flexible manner, allowing the calculation of extensive consistent and physically more realistic tables (Liang 2004; Däppen and Mao 2009), or developing a similar emulation in the simpler CEFF equation of state, which might enable bypassing the table calculations (Lin and Däppen 2010). A comprehensive update of the MHD equation of state is being prepared by R. Trampedach. The implementation of these developments in solar and stellar model calculations will be very interesting.
An independent development of an equation of state in the chemical picture has been carried out in the so-called SAHA-S formulation (Gryaznov et al. 2004; Baturin et al. 2013, 2017).Footnote 17 Results for this equation of state are shown in Fig. 4 with the blue long-dashed curve. Apart from a rather stronger variation in \(\varGamma _1\) in the atmosphere due to the wide variety of molecular species included, the SAHA-S formulation is clearly quite similar to OPAL 2005. Also, Alan W. Irwin has developed the FreeEOS formulation,Footnote 18 based on free-energy minimization (see Cassisi et al. 2003a), which allows rapid calculation of an equation of state that closely matches the OPAL equation of state.
Opacity
In stellar interiors, the diffusion approximation for radiative transfer, implied by Eq. (5), is adequate, and the opacity is determined as the Rosseland mean opacity,
$$\begin{aligned} \kappa ^{-1} \equiv \kappa _{\mathrm{R}}^{-1} = {\pi \over a {\tilde{c}}T^3} \int _0^\infty \kappa _\nu ^{-1} {\mathrm{d}B_\nu \over \mathrm{d}T} \mathrm{d}\nu \end{aligned}$$
(21)
(Rosseland 1924), where \(\kappa _\nu \) is the monochromatic opacity at (radiation) frequency \(\nu \) and \(B_\nu \) is the Planck function. The computation of stellar opacities is generally so complicated that opacities have to be obtained in stellar modelling through interpolation in tables. The computation of the tables includes contributions of transitions between the different levels of the atoms and ions in the gas, including as far as possible the effects of level perturbations; an extensive review of opacity calculations was provided by Pain et al. (2017). The thermodynamic state of the gas, including the degrees of ionization and the distribution amongst the levels, is an important ingredient in the calculation; indeed, both the MHD and the OPAL equations of state were developed as bases for new opacity calculations. Within the convection zone, solar structure is essentially independent of opacity, since the temperature gradient is nearly adiabatic. Below the convection zone the opacity is dominated by heavy elements; hence it is sensitive not only to the total heavy-element abundance Z but also to the relative distribution of the individual elements. This is illustrated in Fig. 5 showing the sensitivity of the opacity to variations in the dominant contributions to the heavy elements. Evidently iron is an important contribution to the opacity, particularly in the solar core, but other elements such as oxygen, neon and silicon also play major roles. Modelling the solar atmosphere requires low-temperature opacities, including effects of molecules; in the calculation of the structure of calibrated solar models the resulting uncertainties are largely suppressed by changes in the treatment of convection (cf. Fig. 28).
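As an illustration of how Eq. (21) weights the monochromatic opacity, the following minimal sketch evaluates the Rosseland mean numerically in terms of the dimensionless frequency \(x = h\nu /(k_{\mathrm{B}} T)\). The function names, the frequency grid and the toy opacity laws are assumptions made for this example only; the normalization \(\pi /(a {\tilde{c}} T^3)\) is replaced by a numerical integral of the weight function, which is equivalent.

```python
import math

def rosseland_weight(x):
    """Shape of dB_nu/dT as a function of x = h*nu/(k_B*T), up to a constant factor."""
    ex = math.exp(x)
    return x**4 * ex / (ex - 1.0) ** 2

def rosseland_mean(kappa_of_x, x_min=1e-3, x_max=40.0, n=20000):
    """Harmonic mean of kappa_nu weighted by dB_nu/dT, using the trapezoidal rule."""
    dx = (x_max - x_min) / n
    num = 0.0  # accumulates the integral of weight / kappa
    den = 0.0  # accumulates the integral of weight (normalization)
    for i in range(n + 1):
        x = x_min + i * dx
        w = rosseland_weight(x) * (0.5 if i in (0, n) else 1.0)
        num += w / kappa_of_x(x)
        den += w
    return den / num

# Toy opacity laws (hypothetical, arbitrary units).
kappa_grey = rosseland_mean(lambda x: 2.5)
kappa_window = rosseland_mean(lambda x: 1.0 if 2.0 < x < 4.0 else 100.0)
```

For a frequency-independent opacity the mean simply returns the input value. For the toy opacity with a transparent window near the peak of the weight function, the mean lies far closer to the window value than to the high opacity outside it, showing that the harmonic mean is dominated by the frequencies at which the gas is most transparent.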
Early models used for helioseismic analysis generally used the Cox and Stewart (1970) and Cox and Tabor (1976) tables. An early inference of the solar internal sound speed (Christensen-Dalsgaard et al. 1985) showed that the solar sound speed was higher below the convection zone than the sound speed of a model using the Cox and Tabor (1976) tables, prompting the suggestion that the opacity had to be increased by around 20% at temperatures higher than \(2 \times 10^6 \,\mathrm{K}\). This followed an earlier plea by Simon (1982) for a reexamination of the opacity calculations in connection with problems in the interpretation of double-mode Cepheids and in the understanding of the excitation of oscillations in \(\beta \) Cephei stars; it was subsequently demonstrated by Andreasen and Petersen (1988) that agreement between observed and computed period ratios for double-mode \(\delta \) Scuti stars and Cepheids could be obtained by a substantial opacity increase, by a factor of 2.7, in the range \(\log T = 5.2{-}5.9\).
These results motivated a reanalysis of the opacities by the Livermore group, who pointed out (Iglesias et al. 1987) that the contribution from line absorption in metals had been seriously underestimated in earlier opacity calculations. This work resulted in the OPAL tables (e.g., Iglesias and Rogers 1991; Iglesias et al. 1992; Rogers and Iglesias 1992, 1994, in the following OPAL92). Owing to the inclusion of numerous transitions in iron-group elements and a better treatment of the level perturbations and associated line broadening these new calculations did indeed show very substantial opacity increases, qualitatively matching the requirements from the helioseismic sound-speed inference; also, this led largely to agreement with evolution models of the period ratios for RR Lyrae and Cepheid double-mode pulsators (e.g., Cox 1991; Moskalik et al. 1992; Kanbur and Simon 1994) and to opacity-driven instability in the \(\beta \) Cephei models (e.g., Cox et al. 1992; Kiriakidis et al. 1992; Moskalik and Dziembowski 1992). These results are excellent examples of stellar pulsations, and in particular helioseismology, providing input to the understanding of basic physical processes.
The OPAL tables, with further developments (e.g., Iglesias and Rogers 1996, in the following OPAL96),Footnote 19 have seen widespread use in solar and stellar modelling. In parallel with the OPAL calculations, independent calculations were carried out within the Opacity Project (OP) (Seaton et al. 1994), with results in good agreement with those of OPAL96 at relatively low density and temperature, although larger discrepancies were found under conditions relevant to the solar radiative interior (Iglesias and Rogers 1995). More recent updates to the OP opacities, in the following OP05, have decreased these discrepancies substantially, to a level of 5–10% (Seaton and Badnell 2004; Badnell et al. 2005).Footnote 20 A recent effort is under way at the CEA, France, resulting in the so-called OPAS tablesFootnote 21 (Blancard et al. 2012; Mondet et al. 2015). Also, the Los Alamos group has updated their calculations, in the OPLIB tables (Colgan et al. 2016).Footnote 22 A review of these recent opacity results was provided by Turck-Chièze et al. (2016), while Fig. 6 shows a comparison of the opacity values in a model of the present Sun.
The opacity tables discussed so far typically include few or no molecular lines. Thus the opacity at low temperature (often taken to be below \(10^4 \,\mathrm{K}\)) must be obtained from separate tables, suitably matched to the opacity at higher temperature. Tables provided by Kurucz (1991) and Alexander and Ferguson (1994) have often been used. A set of tables with a more complete equation of state and improved treatment of grains was provided by Ferguson et al. (2005).
I note that the potential uncertainties in the opacity calculations have gained renewed interest in connection with the apparent discrepancies between helioseismic inferences and solar models computed with revised inferences of solar surface composition. I return to this in Sect. 6.4.
Energy generation
The basic energy generation in the Sun takes place through hydrogen fusion to helium which may be schematically written as
$$\begin{aligned} 4 {{}^{1}\mathrm{H}}\rightarrow {{}^{4}\mathrm{He}}+ 2 \mathrm{e}^+ + 2 \nu _{\mathrm{e}}. \end{aligned}$$
(22)
Here the emission of the two positrons results from the required conversion of two protons to neutrons, as also implied by conservation of charge in the process, and the two electron neutrinos ensure conservation of lepton number. Evidently the positrons are immediately annihilated by two electrons, resulting in further release of energy. Thus the net reaction can formally be regarded as the fusion of four hydrogen atoms into a helium atom; this is convenient from the point of view of calculating the energy release based on tables of atomic masses. The result is that each reaction in Eq. (22) generates 26.73 MeV. However, the neutrinos have a negligible probability for interaction with matter in the Sun, and hence the energy contributed to the neutrinos must be subtracted to obtain the energy generation rate \(\epsilon \) actually available to the Sun. Thus \(\epsilon \) depends on the energy of the emitted neutrinos and hence on the details of the reactions resulting in the net reaction in Eq. (22). As discussed in Sect. 5.2 detection of the emitted neutrinos provides a crucial confirmation of the presence of nuclear reactions in the solar core and a probe of the properties of the neutrinos.
The detailed properties of nuclear reactions in stellar interiors have been discussed by, for example, Clayton (1968). Reactions require tunneling through the potential barrier resulting from the Coulomb repulsion between the two nuclei. Thus to a first approximation reactions between more highly charged nuclei are expected to have a lower probability. Also, the temperature dependence of the reactions depends strongly on the charges of the reacting nuclei. The dependence on temperature of the reaction rate \(r_{12}\) between two nuclei 1 and 2 is often approximated as \(r_{12} \propto T^n\), where
$$\begin{aligned} n = {\eta -2 \over 3}, \quad \eta = 42.487 ({{\mathcal {Z}}}_1^2 {{\mathcal {Z}}}_2^2 {{\mathcal {A}}})^{1/3} T_6^{-1/3}; \end{aligned}$$
(23)
here \({{\mathcal {Z}}}_1 e\) and \({{\mathcal {Z}}}_2 e\) are the charges of the two nuclei, \({{\mathcal {A}}} = {{\mathcal {A}}}_1 {{\mathcal {A}}}_2/( {{\mathcal {A}}}_1 + {{\mathcal {A}}}_2)\) is the reduced mass of the nuclei in atomic mass units, \({{\mathcal {A}}}_1\) and \({{\mathcal {A}}}_2\) being the masses of the nuclei, and \(T_6 = T/(10^6 \,\mathrm{K})\).Footnote 23 However, the specific properties of the interacting nuclei also play a major role for the reaction rate. Furthermore, the conversion of protons into neutrons and the production of neutrinos involve the weak interaction which takes place with comparatively low probability. This has a strong effect on the rates of reactions where this conversion takes place.
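The temperature exponents quoted later in this section (roughly \(T^4\) for the \({{}^{1}\mathrm{H}}+ {{}^{1}\mathrm{H}}\) reaction and roughly \(T^{20}\) for proton capture on \({{}^{14}\mathrm{N}}\)) can be checked with a short sketch. It assumes the standard Gamow-peak form in which the charge product enters squared, \(\eta = 42.487 ({{\mathcal {Z}}}_1^2 {{\mathcal {Z}}}_2^2 {{\mathcal {A}}}/T_6)^{1/3}\), which is the form that reproduces those exponents. The function names, the use of integer mass numbers for the nuclear masses and the choice \(T_6 = 15\) as representative of the solar core are assumptions of this example.

```python
def reduced_mass(a1, a2):
    """Reduced mass of the two nuclei in atomic mass units."""
    return a1 * a2 / (a1 + a2)

def temperature_exponent(z1, z2, a1, a2, t6):
    """Exponent n in r_12 ~ T^n, with the charge product entering squared."""
    eta = 42.487 * (z1**2 * z2**2 * reduced_mass(a1, a2) / t6) ** (1.0 / 3.0)
    return (eta - 2.0) / 3.0

T6_CORE = 15.0  # assumed representative core temperature in units of 10^6 K

n_pp = temperature_exponent(1, 1, 1.0, 1.0, T6_CORE)     # 1H + 1H
n_n14 = temperature_exponent(7, 1, 14.0, 1.0, T6_CORE)   # 14N + 1H
```

With these inputs `n_pp` comes out close to 4 and `n_n14` close to 20.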
The net reaction in Eq. (22) obviously has to take place through a number of intermediate steps. The dominant series of reactions starts directly with the fusion of two hydrogen nuclei; the full sequence of reactions isFootnote 24
$$\begin{aligned} {{}^{1}\mathrm{H}}({{}^{1}\mathrm{H}}, \mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{2}\mathrm{D}}({{}^{1}\mathrm{H}}, \gamma )\,{{}^{3}\mathrm{He}}({{}^{3}\mathrm{He}},2 {{}^{1}\mathrm{H}})\, {{}^{4}\mathrm{He}}. \end{aligned}$$
(24)
This sequence of reactions is known as the PP-I chain and clearly corresponds to Eq. (22). The average energy of the neutrinos lost in the first reaction in the chain is 0.263 MeV. Thus the effective energy production for each resulting \({{}^{4}\mathrm{He}}\) is 26.21 MeV.
Two alternative chains, PP-II and PP-III, continue with the fusion of \({{}^{3}\mathrm{He}}\) and \({{}^{4}\mathrm{He}}\) after the production of \({{}^{3}\mathrm{He}}\):
$$\begin{aligned} \begin{aligned} {{}^{3}\mathrm{He}}({{}^{4}\mathrm{He}}, \gamma )\,&{{}^{7}\mathrm{Be}}(\mathrm{e^-}, \nu _{\mathrm{e}})\,{{}^{7}\mathrm{Li}}({{}^{1}\mathrm{H}}, {{}^{4}\mathrm{He}})\,{{}^{4}\mathrm{He}}\qquad \qquad \,\, (\hbox {PP-II}) \\&\,\,\Downarrow \\&{{}^{7}\mathrm{Be}}({{}^{1}\mathrm{H}}, \gamma )\, {{}^{8}\mathrm{B}}(,\mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{8}\mathrm{Be}}(, {{}^{4}\mathrm{He}})\,{{}^{4}\mathrm{He}}\qquad (\hbox {PP-III}) \end{aligned} \end{aligned}$$
(25)
Here the total average neutrino losses per produced \({{}^{4}\mathrm{He}}\) are 1.06 MeV and 7.46 MeV, respectively. At the centre of the present Sun the contributions of the PP-I, PP-II and PP-III reactions to the total energy generation by the PP chains, excluding neutrinos, are 23, 77 and 0.2%, respectively; owing to a much higher temperature sensitivity of the PP-II and PP-III chains the corresponding contributions to the solar luminosity are 77, 23 and 0.02%. However, even though insignificant for the energy generation, the PP-III chain is very important for the study of neutrino emission from the Sun due to the high energies of the neutrinos emitted in the decay of \({{}^{8}\mathrm{B}}\).
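The energy bookkeeping for the three chains can be reproduced with a few lines of arithmetic. The sketch below computes the total energy release of Eq. (22) from the atomic masses of \({{}^{1}\mathrm{H}}\) and \({{}^{4}\mathrm{He}}\) and subtracts the average neutrino losses quoted in the text; for PP-I the loss is twice 0.263 MeV, since two \({{}^{1}\mathrm{H}}+ {{}^{1}\mathrm{H}}\) reactions are needed for each \({{}^{4}\mathrm{He}}\) produced. The atomic masses and the conversion factor are standard tabulated values inserted for this example, and the names are illustrative.

```python
U_TO_MEV = 931.494   # energy equivalent of one atomic mass unit, in MeV
M_H1 = 1.00782503    # atomic mass of 1H, in u
M_HE4 = 4.00260325   # atomic mass of 4He, in u

def q_total():
    """Energy released by fusing four hydrogen atoms into one helium atom (MeV)."""
    return (4.0 * M_H1 - M_HE4) * U_TO_MEV

# Average neutrino energy lost per 4He produced (MeV), as quoted in the text.
NEUTRINO_LOSS = {"PP-I": 2 * 0.263, "PP-II": 1.06, "PP-III": 7.46}

def effective_energy(chain):
    """Energy per 4He that remains available to the star (MeV)."""
    return q_total() - NEUTRINO_LOSS[chain]

available = {chain: effective_energy(chain) for chain in NEUTRINO_LOSS}
```

The total comes out at about 26.73 MeV and the PP-I value at about 26.2 MeV, consistent with the numbers given above. The PP-III chain delivers substantially less energy to the star, because of the high energy carried away by the \({{}^{8}\mathrm{B}}\) neutrinos.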
Of the reactions in the PP chains the initial reaction, fusing two hydrogen nuclei, has by far the lowest rate per pair of reacting nuclei. This is a result of the effect of the weak interaction in the conversion of a proton into a neutron, coupled with the penetration of the Coulomb barrier.Footnote 25 Thus the overall rate of the chains is controlled by this reaction; since the charges of the interacting nuclei are relatively low, it has a modest temperature sensitivity, approximately as \(T^4\) [cf. Eq. (23)]. The distribution of the reactions between the different branches depends on the branching ratios at the reactions destroying \({{}^{3}\mathrm{He}}\) and \({{}^{7}\mathrm{Be}}\); as a result PP-II and in particular PP-III become more important with increasing temperature, with important consequences for the neutrino spectrum of the Sun.
In principle, the full reaction network should be considered as a function of time, to follow the changing abundances resulting from the nuclear reactions. In practice the relevant reaction timescales for the reactions involving \({{}^{2}\mathrm{D}}\), \({{}^{7}\mathrm{Be}}\) and \({{}^{7}\mathrm{Li}}\) are so short that the reactions can be assumed to be in equilibrium under solar conditions (e.g., Clayton 1968); the resulting equilibrium abundances are minute.Footnote 26 On the other hand, the timescales for the reactions involving \({{}^{3}\mathrm{He}}\) are comparable with the timescale of solar evolution, at least in the outer parts of the core; thus the calculation should follow the detailed evolution with time of the \({{}^{3}\mathrm{He}}\) abundance. The resulting abundance profile in a model of the present Sun is illustrated in Fig. 7; below the maximum \({{}^{3}\mathrm{He}}\) has reached nuclear equilibrium, with an abundance that increases with decreasing temperature. The location of this maximum moves further out with increasing age. It was found by Christensen-Dalsgaard et al. (1974) that the establishment of this \({{}^{3}\mathrm{He}}\) profile caused instability to a few low-degree g modes early in the evolution of the Sun.
The primordial abundances of light elements, as inferred from solar-system abundances, are crucial constraints on models of the Big Bang (e.g. Geiss and Gloeckler 2007). This includes the abundances of \({{}^{2}\mathrm{D}}\) and \({{}^{3}\mathrm{He}}\), with \({{}^{2}\mathrm{D}}\) burning (cf. Eq. 24) taking place at sufficiently low temperature that the primordial \({{}^{2}\mathrm{D}}\) has largely been converted to \({{}^{3}\mathrm{He}}\). The \({{}^{3}\mathrm{He}}/{{}^{4}\mathrm{He}}\) ratio can be determined from the solar wind; the resulting value can probably be taken as representative for matter in the solar convection zone and hence provides a constraint on the extent to which the convection zone has been enriched by \({{}^{3}\mathrm{He}}\) resulting from hydrogen burning. This was used by, for example, Schatzman et al. (1981), Lebreton and Maeder (1987) and Vauclair and Richard (1998) to constrain the extent of turbulent mixing beneath the convection zone. Heber et al. (2003) investigated the time variation in the \({{}^{3}\mathrm{He}}/{{}^{4}\mathrm{He}}\) ratio from analysis of lunar regolith samples. After correction for secondary processes, using the presumed constant \({}^{20}\mathrm{Ne}/{}^{22}\mathrm{Ne}\) as reference, they deduced that the \({{}^{3}\mathrm{He}}/{{}^{4}\mathrm{He}}\) ratio has been approximately constant over roughly the past 4 Gyr, with an average value for the ratio of number densities of \((4.47 \pm 0.13) \times 10^{-4}\). This provides a further valuable constraint on the mixing history below the solar convection zone.Footnote 27
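A second set of processes resulting in the net reaction in Eq. (22) involves successive reactions with isotopes of carbon, nitrogen and oxygen:
$$\begin{aligned} \begin{aligned} {{}^{12}\mathrm{C}}({{}^{1}\mathrm{H}}, \gamma )\,{{}^{13}\mathrm{N}}(,\mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{13}\mathrm{C}}({{}^{1}\mathrm{H}}, \gamma )\,{{}^{14}\mathrm{N}}({{}^{1}\mathrm{H}}, \gamma )\,{{}^{15}\mathrm{O}}(,\mathrm{e^+}\nu _{\mathrm{e}})\,&{{}^{15}\mathrm{N}}({{}^{1}\mathrm{H}}, {{}^{4}\mathrm{He}})\,{{}^{12}\mathrm{C}} \\&\,\,\Downarrow \\&{{}^{15}\mathrm{N}}({{}^{1}\mathrm{H}}, \gamma )\,{{}^{16}\mathrm{O}}({{}^{1}\mathrm{H}}, \gamma )\,{{}^{17}\mathrm{F}}(,\mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{17}\mathrm{O}}({{}^{1}\mathrm{H}}, {{}^{4}\mathrm{He}})\,{{}^{14}\mathrm{N}} \end{aligned} \end{aligned}$$
(26)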
This CNO cycle is obviously a catalytic process, with the net result of converting hydrogen into helium. The reaction with the lowest rate in this cycle is proton capture on \({{}^{14}\mathrm{N}}\) which therefore controls the overall rate of the cycle; this leads to a temperature dependence of roughly \(T^{20}\) under solar conditions, owing to the high nuclear charge of nitrogen [cf. Eq. (23)]. As a result, the CNO cycle is significant mainly very near the solar centre, and its importance increases rapidly with increasing age of the model, due to the increase in core temperature (cf. Fig. 8a). Owing to the strong temperature dependence it is strongly concentrated near the centre, as illustrated in Fig. 8b. Thus, although in the present Sun the central contribution to the energy-generation rate is 11%, the CNO cycle only contributes 1.3% to the luminosity. As a consequence of the \({}^{14}\mathrm{N}\) bottleneck in the CN cycles almost all the initial carbon is converted into nitrogen by the reactions. An additional side branch mainly serves to convert oxygen into nitrogen; under the conditions leading up to the present Sun this is relatively unimportant, causing an increase in the central abundance of \({{}^{14}\mathrm{N}}\) by around 12% in the present Sun, relative to the initial abundance.
The computation of nuclear reaction rates requires nuclear parameters, determined from experiments or, in the case of the \({{}^{1}\mathrm{H}}+ {{}^{1}\mathrm{H}}\) reaction, from theoretical considerations. In addition to affecting the energy-generation rate the details of the reactions have a substantial effect on the branching ratios in the PP chains and hence on the production rate of the high-energy \({{}^{8}\mathrm{B}}\) neutrinos. The reaction rate, averaged over the thermal energy distribution of the nuclei, is typically expressed as a function of temperature in terms of a factor describing the penetration of the Coulomb barrierFootnote 28 and a correction factor provided as an expansion in temperature. A substantial number of compilations of data for nuclear reactions have been made, starting with the classical, and much used, sets by Fowler et al. (1967, 1975). Bahcall and Pinsonneault (1995) provided an updated set of parameters specifically for the computation of solar models. Two extensive and commonly used compilations of parameters have been provided by Adelberger et al. (1998) and Angulo et al. (1999). Revised parameters for the important reaction \({{}^{14}\mathrm{N}}({{}^{1}\mathrm{H}}, \gamma )\,{{}^{15}\mathrm{O}}\), which controls the overall rate of the CNO cycle, have been obtained (Formicola et al. 2004; Angulo et al. 2005), reducing the rate by a factor of almost 2. An updated set of nuclear parameters specifically for solar modelling was provided by Adelberger et al. (2011), including also the revised rates for \({{}^{14}\mathrm{N}}({{}^{1}\mathrm{H}}, \gamma )\,{{}^{15}\mathrm{O}}\).
The nuclear reactions take place in a plasma, with charged particles that modify the interaction between the nuclei. A classical and widely used treatment of this effect was developed by Salpeter (1954), with a mean-field treatment of the plasma in the Debye–Hückel approximation; this shows that the nuclei are surrounded by clouds of electrons which partly screen the Coulomb repulsion between the nuclei and hence increase the reaction rate. Following criticism of Salpeter’s result by Shaviv and Shaviv (1996), Brüggen and Gough (1997, 2000) made a more careful analysis of the thermodynamical assumptions underlying the derivation, confirming Salpeter’s result and in the second paper extending it to take into account quantum-mechanical exclusion and polarization of the screening cloud; in the solar case, however, such effects are largely insignificant. On the other hand, the mean-field approximation may be questionable in cases, such as the solar core, where the average number of electrons within the radius of the screening cloud is very small. This has given rise to extensive discussions of dynamic effects in the screening (e.g. Shaviv and Shaviv 2001). Bahcall et al. (2002) argued that such effects, and other claims of problems with the Salpeter formulation, were irrelevant. However, molecular-dynamics simulations of stellar plasma strongly suggest that dynamical effects may in fact substantially influence the screening (Shaviv 2004a, b). Further investigations along these lines are clearly needed. Thus it is encouraging that Mussack et al. (2007) started independent molecular-dynamics simulations. Initial results by the group (Mao et al. 2009) confirmed the earlier conclusions by Shaviv; a more detailed analysis by Mussack and Däppen (2011) found evidence for a slight reduction in the reaction rate as a result of plasma effects. Interestingly, Weiss et al. (2001) noted that the solar structure as inferred from helioseismology (cf. Sect. 5.1.2) can be used to constrain the departures from the simple Salpeter formulation; in particular, they found that a model computed assuming no screening was inconsistent with the helioseismically inferred sound speed. These issues clearly need further investigations.
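To give a sense of the size of the static screening effect, the following rough sketch evaluates the weak-screening enhancement factor in the Salpeter form, \(f = \exp [{{\mathcal {Z}}}_1 {{\mathcal {Z}}}_2 e^2/(\lambda _{\mathrm{D}} k_{\mathrm{B}} T)]\), where \(\lambda _{\mathrm{D}}\) is the Debye length. Several simplifications are assumed here and are not taken from the text: a fully ionized mixture of hydrogen and helium only, non-degenerate electrons, and illustrative central values of density, temperature and composition. The function names are hypothetical and the constants are in cgs units.

```python
import math

E_CHARGE = 4.80320e-10  # elementary charge, esu
K_B = 1.380649e-16      # Boltzmann constant, erg/K
M_U = 1.66054e-24       # atomic mass unit, g

def debye_length(rho, temp, x_h, y_he):
    """Debye length (cm) for fully ionized H/He with non-degenerate electrons."""
    n_nucleon = rho / M_U
    n_e = n_nucleon * (x_h + 0.5 * y_he)        # electron number density
    # Sum of n_i * Z_i^2 over ions: X * 1^2 + (Y / 4) * 2^2, plus the electrons.
    sum_nz2 = n_nucleon * (x_h + y_he) + n_e
    return math.sqrt(K_B * temp / (4.0 * math.pi * E_CHARGE**2 * sum_nz2))

def salpeter_factor(z1, z2, rho, temp, x_h, y_he):
    """Weak-screening enhancement of the reaction rate between charges z1 and z2."""
    lam_d = debye_length(rho, temp, x_h, y_he)
    return math.exp(z1 * z2 * E_CHARGE**2 / (lam_d * K_B * temp))

# Assumed, illustrative conditions near the solar centre (not from the text).
RHO_C, T_C, X_C, Y_C = 150.0, 1.57e7, 0.34, 0.64

f_pp = salpeter_factor(1, 1, RHO_C, T_C, X_C, Y_C)    # 1H + 1H
f_n14 = salpeter_factor(7, 1, RHO_C, T_C, X_C, Y_C)   # 14N + 1H
```

Under these assumptions the enhancement of the \({{}^{1}\mathrm{H}}+ {{}^{1}\mathrm{H}}\) rate is a few per cent, while the effect is considerably larger for \({{}^{14}\mathrm{N}}+ {{}^{1}\mathrm{H}}\), where the weak-screening assumption itself becomes marginal.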
Diffusion and settling
As indicated in Eq. (6) the temporal evolution of stellar internal abundances must take into account effects of diffusion and settling. Crudely speaking, settling due to gravity and thermal effects tends to establish composition gradients; diffusion, described by the diffusion coefficient \(D_i\), tends to smooth out such gradients, including those that are established through nuclear reactions. A brief review of these processes was provided by Michaud and Proffitt (1993). They were discussed in some detail already by Eddington (1926); he concluded that they might lead to unacceptable changes in surface composition unless suppressed by processes that redistributed the composition, such as circulation.
A brief review of diffusion was provided by Thoul and Montalbán (2007). The basic equations describing the microscopic motion of matter in a star are the Boltzmann equations for the velocity distribution of each type of particle. The treatment of diffusion and settling in stars has generally been based on approximate solutions of the Boltzmann equations presented by Burgers (1969). This results in a set of equations for momentum, energy and mass conservation for each species which can be solved numerically to obtain the relevant quantities such as \(D_i\) and \(V_i\) in Eq. (6). The equations depend on the collisions between particles in the gas, greatly complicated by the long-range nature of the Coulomb force between charged particles (electrons and ions); these are typically described in terms of coefficients based on the screened Debye–Hückel potential, mentioned above in connection with Coulomb effects in the equation of state and electron screening in nuclear reactions, and depending on the ionization state of the ions. As emphasized initially by Michaud (1970) the gravitational force on the particles may be modified by radiative effects, depending on the detailed ionization and excitation state of the individual species and hence varying strongly between different elements or with position in the star.Footnote 29 It should be noted that the typical diffusion and settling timescales, although possibly short on a stellar evolution timescale, are generally much longer than the timescales associated with large-scale hydrodynamical motions. Thus regions affected by such motion, particularly convection zones, can generally be assumed to be fully mixed; in the solar case microscopic diffusion and settling are only relevant beneath the convective envelope. Formally, hydrodynamical mixing can be incorporated by maintaining Eq. (6) but with a very large value of \(D_i\) (e.g., Eggleton 1971).
Michaud and Proffitt (1993) presented relatively simple approximations to the diffusion and settling coefficients for hydrogen as well as for heavy elements regarded as trace elements (see also Christensen-Dalsgaard 2008). These were based on solutions of Burgers’ equations, adjusting coefficients to obtain a reasonable fit to the numerical results. These approximations were also compared with the results of the numerical solutions by Thoul et al. (1994) who in addition presented simpler, and rather less accurate, approximate expressions for the coefficients.
Although diffusion and settling have been considered since the early seventies (e.g., Michaud 1970) to explain peculiar abundances in some stars, it seems that Noerdlinger (1977) was the first to include these effects in solar modelling; indeed, the early estimates by Eddington (1926) suggested that the effects would be fairly small. In fact, including helium diffusion and settling Noerdlinger found a reduction of about 0.023 in the surface helium abundance \(Y_{\mathrm{s}}\), from the initial value. Roughly similar results were obtained by Gabriel et al. (1984) and Cox et al. (1989), the latter authors considering a broad range of elements, while Wambsganss (1988) found a much smaller reduction. Proffitt and Michaud (1991) provided a detailed comparison of these early results, although without explaining the discrepant value found by Wambsganss. Bahcall and Pinsonneault (1992a, b) made careful calculations of models with helium diffusion and settling, using the then up-to-date physics, and emphasizing the importance of calibrating the models to yield the observed present surface ratio \(Z_{\mathrm{s}}/X_{\mathrm{s}}\) between the abundances of heavy elements and hydrogen; they found that the inclusion of diffusion and settling increased the neutrino capture rates from the models by up to around 10%. A careful analysis of the effects of heavy-element diffusion and settling on solar models and their neutrino fluxes was presented by Proffitt (1994).
Gabriel et al. (1984) concluded that the inclusion of helium diffusion and settling had little effect on the oscillation frequencies of the model, while Cox et al. (1989), in their more detailed treatment, actually found that the model with diffusion and settling showed a larger difference between observed and model frequencies than did the model that did not include these effects. However, Christensen-Dalsgaard et al. (1993) showed that the inclusion of helium diffusion and settling substantially decreased the difference in sound speed between the Sun and the model, as inferred from a helioseismic differential asymptotic inversion. Further inverse analyses of observed solar oscillation frequencies have confirmed this result, thus strongly supporting the reality of these effects in the Sun and contributing to making diffusion and settling a part of ‘the standard solar model’ (e.g., Christensen-Dalsgaard and Di Mauro 2007). Further evidence is the difference between the initial helium abundance required to calibrate solar models and the helioseismically inferred envelope helium abundance (see Sect. 5.1.2), which is largely accounted for by the effects of helium settling.
Detailed calculations of atomic data for the OPAL and OP opacity projects (cf. Sect. 2.3.2) have allowed precise calculations of the radiative effects on settling (Richer et al. 1998). As mentioned above such effects are highly selective, affecting different elements differently. As a result, not only does the heavy-element abundance change as a result of settling, but the relative mixture of the heavy elements varies as a function of stellar age and position in the star. As is evident from Fig. 5 this has a substantial effect on the opacities. To take such effects consistently into account the opacities must therefore be calculated from the appropriate mixture at each point in the model, requiring appropriately mixing monochromatic contributions from individual elements and calculating the Rosseland mean (cf. Eq. 21). Such calculations are feasible (Turcotte et al. 1998) although obviously very demanding on computing resources in terms of time and storage. Turcotte et al. (1998) carried out detailed calculations of this nature for the Sun. Here the relatively high temperatures and resulting ionization beneath the convective envelope, where diffusion and settling are relevant, result in modest effects of radiative acceleration and little variation in the relative heavy-element abundances. In fact, Fig. 14 of Turcotte et al. shows that neglecting radiative effects and assuming all heavy elements to settle at the same rate, corresponding to fully ionized oxygen, yield results somewhat closer to the full detailed treatment than does neglecting radiative effects and taking partial ionization fully into account. The rather reassuring conclusion is that, as far as solar modelling is concerned, the simple procedure of treating all heavy elements as one is adequate (see also Turcotte and Christensen-Dalsgaard 1998). This simpler approach, neglecting radiative effects, is in fact what is used for the models presented here.
The timescale of diffusion and settling, defined by Eq. (6), increases with increasing density and hence with depth beneath the stellar surface, as illustrated in Fig. 9. Since the convective envelope is fully mixed, the relevant timescale controlling the efficiency of diffusion is the value just below the convective envelope. In the solar case this is of order \(10^{11}\) years, resulting in a modest effect of diffusion over the solar lifetime. In somewhat more massive main-sequence stars, however, with thinner outer convection zones, the time scale is short compared with the evolution timescale; in the case illustrated for a \(2 \,M_\odot \) star, for example, it is around \(5 \times 10^6\) years. Thus settling has a dramatic effect on the surface abundance unless counteracted by other effects (Vauclair et al. 1974). This leads to a strong reduction in the helium abundance, likely eliminating instability due to helium driving in stars that might otherwise be expected to be pulsationally unstable (Turcotte et al. 2000). Also, differential radiative acceleration leads to a surface mixture of the heavy elements very different from the solar mixture, which is indeed observed in ‘chemically peculiar stars’, as already noted by Michaud (1970). Richer et al. (2000) pointed out that to match the observed abundances even in these cases compensating effects had to be included to reduce the effects of settling; they suggested either sub-surface turbulence, increasing the reservoir from which settling takes place, or mass loss bringing fresh material less affected by settling to the surface. An interesting analysis of these processes in controlling the observed abundances of Sirius was presented by Michaud et al. (2011). 
To obtain ‘normal’ composition in such stars, processes of this nature reducing the effects of settling are a fortiori required;Footnote 30 since most main-sequence stars somewhat more massive than the Sun rotate relatively rapidly, circulation or hydrodynamical instabilities induced by rotation are likely candidates (e.g., Zahn 1992, see also Sect. 7). Deal et al. (2020) investigated the combined effects of rotation and radiatively affected diffusion in main-sequence stars and found that this could account for the observed surface abundances for stars with masses below \(1.3\,M_\odot \). For more massive stars additional mixing processes appeared to be required. It should also be noted that such hydrodynamical models of the evolution of rotation are unable to account for the rotation observed in the solar interior (see Sect. 5.1.4). A complete model of the transport of composition and angular momentum in stellar interiors remains to be found.
The near-surface layer
The treatment of the outermost layers of the model is complicated and affected by substantial physical uncertainties. In the atmosphere the diffusion approximation for radiative transport, implicit in Eq. (5), is no longer valid; here the full radiative-transfer equations need to be considered, including the details of the frequency dependence of absorption and emission. Such detailed stellar atmosphere models are available and can in principle be incorporated in the full solar model (e.g., Kurucz 1991, 1996; Gustafsson et al. 2008). However, additional complications arise from the effects of convection which induce motion in the atmosphere as well as strong lateral inhomogeneities in the thermal structure. Also, observations of the solar atmosphere strongly indicate the importance of non-radiative heating processes in the upper parts of the atmosphere, likely caused by acoustic or magnetic waves, or other forms of magnetic energy dissipation, for which no reliable models are available. The thermal structure just beneath the photosphere is strongly affected by the transition to convective energy transport, which determines the temperature gradient \(\nabla = \nabla _{\mathrm{conv}}\). Also, in this region convective velocities are a substantial fraction of the speed of sound, leading to significant momentum transport by convection described as a ‘turbulent pressure’, but most often ignored in the model calculations.
From the point of view of the global structure of the Sun, these near-surface problems are of lesser importance. In most of the convection zone the temperature gradient is very nearly adiabatic, \(\nabla \simeq \nabla _{\mathrm{ad}}\) (see also Fig. 12). Thus the structure is essentially determined by the (constant) value of the specific entropy \(s_{\mathrm{conv}}\); in other words, the variations of the thermodynamical quantities within this part of the convection zone lie on an adiabat. In fact, if the further approximation of a fully ionized ideal gas is made, such as is roughly valid except in the outer few per cent of the solar radius, \(\nabla _{\mathrm{ad}}\simeq 2/5\), \(\mathrm{d}\ln p / \mathrm{d}\ln \rho \simeq 5/3\), and the relation between pressure and density can be approximated by
$$\begin{aligned} p = K \rho ^\gamma , \end{aligned}$$
(27)
with \(\gamma = 5/3\). In this case, therefore, the properties of the convection zone are characterized by the adiabatic constant K. Such an approximation was generally used in early calculations of solar models (e.g., Schwarzschild et al. 1957). The structure of the convection zone determines its radial extent and hence affects the radius of the model. In the solar case the radius is known observationally with high precision; the adiabat of the adiabatic part of the convection zone [i.e., the value of K in the approximation in Eq. (27)] must therefore be chosen such that the model has the observed radius. This is part of the calibration of solar models (see Sect. 2.6).
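As a minimal illustration of how a single constant labels the adiabat in Eq. (27), the following sketch fixes K from one pressure-density pair and then evaluates the pressure elsewhere on the same adiabat. The function names and the reference values are hypothetical and chosen only for illustration; they are not taken from a solar model.

```python
GAMMA = 5.0 / 3.0  # adiabatic exponent of a fully ionized ideal gas


def adiabatic_constant(p, rho, gamma=GAMMA):
    """Return K in p = K * rho**gamma from one (p, rho) pair on the adiabat."""
    return p / rho**gamma


def pressure_on_adiabat(rho, k, gamma=GAMMA):
    """Pressure at density rho on the adiabat labelled by K (Eq. 27)."""
    return k * rho**gamma


# Hypothetical reference point in cgs units (illustrative round numbers).
p_ref, rho_ref = 1.0e12, 1.0e-2
k = adiabatic_constant(p_ref, rho_ref)

# Doubling the density along the adiabat raises the pressure by 2**(5/3).
p_double = pressure_on_adiabat(2.0 * rho_ref, k)
```

Changing K shifts the whole pressure-density relation, which is why the calibration to the observed radius amounts to selecting one adiabat.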
From this point of view the details of the treatment of the near-surface layers serve to determine \(s_{\mathrm{conv}}\) (or K). This is obtained from the specific entropy at the bottom of the atmosphere through the change in entropy resulting from integrating \(\nabla - \nabla _{\mathrm{ad}}\) over the significantly superadiabatic part of the convection zone. The treatment of convection typically involves parameters that can be adjusted to control the adiabat and hence the radius of the model; given such calibration to solar radius, the structure of the deeper parts of the model is largely insensitive to the details of the treatment of the atmosphere and the convective gradient (for an example, see Fig. 31 below).
I note that although the detailed modelling of the near-surface layers has only a modest effect on the internal properties of calibrated solar models, it has a substantial effect on the computed oscillation frequencies, which may affect the analysis of observed frequencies (see Sect. 5.1.1). Also, in computations of other stars no similar calibration based on the observed properties is generally possible. It is customary to apply solar-calibrated convection properties in these cases; although this is clearly not a priori justified, some support at least for only modest variations relative to the Sun over a substantial range of stellar parameters has been found from hydrodynamical simulations of near-surface convection (cf. Fig. 11).
Although the atmospheric structure can be implemented in terms of reasonably realistic models of the solar atmosphere, the usual procedure in modelling solar evolution is to base the atmospheric properties on a simple relation between temperature and optical depth \(\tau \), \(T = T(\tau )\); here \(\tau \) is defined by
$$\begin{aligned} {\mathrm{d}\tau \over \mathrm{d}r} = - \kappa \rho , \end{aligned}$$
(28)
with \(\tau = 0\) at the top of the atmosphere. This \(T(\tau )\) relation is often expressed in the form
$$\begin{aligned} T^4 = {3 \over 4} T_{\mathrm{eff}}^4 [\tau + q(\tau )], \end{aligned}$$
(29)
defining the (generalized) Hopf function q.Footnote 31 Given \(T(\tau )\), and the equation of state and opacity as functions of density and temperature, the atmospheric structure can be obtained by integrating the equation of hydrostatic support, which may be written as
$$\begin{aligned} {\mathrm{d}p \over \mathrm{d}\tau } = {g \over \kappa }, \end{aligned}$$
(30)
where the gravitational acceleration g can be taken to be constant, at least for main-sequence stars such as the Sun. This defines the photospheric pressure, e.g. at the point where \(T = T_{\mathrm{eff}}\), the effective temperature, and hence the outer boundary condition for the integration of the full equations of stellar structure.Footnote 32 The \(T(\tau )\) relation can be obtained from fitting to more detailed theoretical atmospheric models, as done, for example, by Morel et al. (1994), who used the Kurucz (1991) models. Alternatively, a fit to a semi-empirical model of the solar atmosphere can be used, such as the Krishna Swamy fit (Krishna Swamy 1966) or the Harvard-Smithsonian Reference Atmosphere (Gingerich et al. 1971). As an example, the Vernazza et al. (1981) Model C \(T(\tau )\) relation is shown in Fig. 10; here is also shown the result of using the following approximation for the Hopf function in Eq. (29):
$$\begin{aligned} q(\tau ) = 1.036 -0.3134 \exp (-2.448 \tau ) -0.2959 \exp (-30 \tau ). \end{aligned}$$
(31)
The approximation provides a reasonable fit to the observationally inferred temperature structure in that part of the atmosphere which dominates the determination of the photospheric pressure.
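The procedure of Eqs. (29)–(31) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any particular stellar-evolution code: the function names are hypothetical, the opacity is passed in as a user-supplied function of pressure and temperature, and the demonstration uses a constant toy opacity purely so that the result can be checked analytically. The effective temperature and surface gravity are approximate solar values.

```python
import math


def hopf_q(tau):
    """Fitted generalized Hopf function of Eq. (31)."""
    return 1.036 - 0.3134 * math.exp(-2.448 * tau) - 0.2959 * math.exp(-30.0 * tau)


def temperature(tau, t_eff):
    """T(tau) from Eq. (29)."""
    return t_eff * (0.75 * (tau + hopf_q(tau))) ** 0.25


def photospheric_tau(tol=1.0e-12):
    """Optical depth where T = T_eff, i.e. tau + q(tau) = 4/3, by bisection."""
    lo, hi = 0.0, 2.0  # tau + q(tau) - 4/3 changes sign on this interval
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid + hopf_q(mid) < 4.0 / 3.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def integrate_pressure(opacity, g, t_eff, tau_start, p_start, tau_end, n_steps=2000):
    """Integrate dp/dtau = g / kappa (Eq. 30) with classical fourth-order Runge-Kutta.

    opacity(p, T) is a user-supplied function; g is taken to be constant.
    """
    def rhs(tau, p):
        return g / opacity(p, temperature(tau, t_eff))

    h = (tau_end - tau_start) / n_steps
    tau, p = tau_start, p_start
    for _ in range(n_steps):
        k1 = rhs(tau, p)
        k2 = rhs(tau + 0.5 * h, p + 0.5 * h * k1)
        k3 = rhs(tau + 0.5 * h, p + 0.5 * h * k2)
        k4 = rhs(tau + h, p + h * k3)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        tau += h
    return p


T_EFF = 5772.0    # K, approximate solar effective temperature
G_SURF = 2.74e4   # cm s^-2, approximate solar surface gravity
KAPPA_TOY = 0.5   # cm^2 g^-1, constant toy opacity (assumption, for checking only)

tau_ph = photospheric_tau()
tau_start = 1.0e-4
p_start = G_SURF * tau_start / KAPPA_TOY  # p = g * tau / kappa for constant opacity
p_ph = integrate_pressure(lambda p, T: KAPPA_TOY, G_SURF, T_EFF,
                          tau_start, p_start, tau_ph)
```

With a realistic opacity table in place of the toy function, `p_ph` would play the role of the photospheric pressure used as outer boundary condition; the starting values at small optical depth would then need more care than in this constant-opacity case.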
\(T(\tau )\) relations based on a solar \(q(\tau )\) are often used for general modelling of stars, even though the atmospheric structure may have substantial variations with stellar properties. An interesting alternative is to determine \(q(\tau )\), as a function of stellar parameters, from averaged hydrodynamical simulations of the stellar near-surface layers (e.g. Trampedach et al. 2014b). An example based on a simulation for the present Sun is also shown in Fig. 10.
Treatment of convection
A detailed review of observational and theoretical aspects of solar convection was provided by Nordlund et al. (2009), while Rincon and Rieutord (2018) focused on the largest clearly observed scale of convection on the solar surface, the supergranulation. Further details, including the treatment of convection in a time-dependent environment such as a pulsating star, were reviewed by Houdek and Dupret (2015). As discussed below, extensive hydrodynamical simulations have been carried out of the near-surface convection in the Sun and other stars. However, direct inclusion of these simulations in stellar evolution calculations is impractical, owing to the computational expense; thus we must rely on simpler procedures. It is obviously preferable to have a physically motivated description of convection; as discussed above (see also Sect. 2.6), solar modelling requires one or more parameters which can be used to adjust the specific entropy in the adiabatic part of the convection zone and hence the radius of the model. In stellar modelling convection is typically treated by means of some variant of mixing-length model (e.g. Biermann 1932; Vitense 1953; Böhm-Vitense 1958); a more physically-based derivation of the description was provided by Gough (1977a, b), in terms of the linear growth and subsequent dissolution of unstable modes of convection. In the commonly used physical description of this prescriptionFootnote 33 (for further details, see Kippenhahn et al. 2012) convection is described by the motion of blobs over a distance \(\ell \), after which the blob is dissolved in the surroundings, giving up its excess heat. If the temperature difference between the blob and the surroundings is \(\varDelta T\) and the typical speed of the blob is \(v\), the convective flux is of order \(F_{\mathrm{con}}\sim v c_p \rho \varDelta T\), where \(c_p\) is the specific heat at constant pressure. 
Assuming, for simplicity, that the motion of the blob takes place adiabatically, \(\varDelta T \sim \ell T (\nabla - \nabla _{\mathrm{ad}}) /H_p\), where \(H_p = - (\mathrm{d}\ln p/\mathrm{d}r)^{-1}\) is the pressure scale height. Also, the speed of the element is determined by the work of the buoyancy force \(- \varDelta \rho g\) on the element, where \(\varDelta \rho \sim - \rho \varDelta T/T\) is the density difference between the blob and the surroundings, assuming the ideal gas law and pressure equilibrium between the blob and the surroundings. This gives \(\rho v^2 \sim - \ell g \varDelta \rho \sim \rho \ell ^2 g (\nabla - \nabla _{\mathrm{ad}})/H_p\). Thus we finally obtainFootnote 34
$$\begin{aligned} F_{\mathrm{con}}\sim \rho c_p T {\ell ^2 g^{1/2} \over H_p^{3/2}} (\nabla - \nabla _{\mathrm{ad}})^{3/2}. \end{aligned}$$
(32)
To this must be added the radiative flux
$$\begin{aligned} F_{\mathrm{rad}}= {4 a {\tilde{c}}T^4 \over 3 \kappa \rho } {\nabla \over H_p} \end{aligned}$$
(33)
(cf. Eq. 5); the total flux \(F = F_{\mathrm{con}}+ F_{\mathrm{rad}}\) must obviously match \(L/(4 \pi r^2)\), for equilibrium. This condition determines the temperature gradient in this description.
This description obviously depends on the choice of \(\ell \); this is typically also regarded as a measure of the size of the convective elements. An almost universal, if not particularly strongly physically motivated, choice of \(\ell \) is to take it as a multiple of the pressure scale height,
$$\begin{aligned} \ell = \alpha _{\mathrm{ML}}H_p. \end{aligned}$$
(34)
From Eq. (32) it is obvious that \(F_{\mathrm{con}}\) then scales as \(\alpha _{\mathrm{ML}}^2\). Adjusting \(\alpha _{\mathrm{ML}}\) therefore modifies the convective efficacy and hence the superadiabatic gradient \(\nabla - \nabla _{\mathrm{ad}}\) required to transport the energy, thus fixing the specific entropy in the deeper parts of the convection zone. This in turn affects the structure of the convection zone, including its radial extent, and hence the radius of the star. As discussed in Sect. 2.6 the requirement that models of the present Sun have the correct radius is typically used to determine a value of \(\alpha _{\mathrm{ML}}\), which is then often used for the modelling of other stars.
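The determination of the temperature gradient from the flux condition, and its dependence on \(\alpha _{\mathrm{ML}}\), can be illustrated with the following sketch. It takes Eqs. (32)–(34) literally, with all factors of order unity set to one, and solves \(F_{\mathrm{con}}+ F_{\mathrm{rad}}= F\) for \(\nabla \) by bisection. The function name and the numerical values of the local state are hypothetical round numbers in cgs units, not taken from a solar model, so only the qualitative behaviour is meaningful.

```python
import math

A_RAD = 7.5657e-15       # radiation density constant, erg cm^-3 K^-4
C_LIGHT = 2.99792458e10  # speed of light, cm s^-1


def mlt_gradient(flux, rho, temp, c_p, g, h_p, kappa, nabla_ad, alpha_ml):
    """Solve F_con + F_rad = flux for nabla, using Eqs. (32)-(34)
    with all order-unity factors set to one."""
    # F_rad = b_rad * nabla (Eq. 33)
    b_rad = 4.0 * A_RAD * C_LIGHT * temp**4 / (3.0 * kappa * rho * h_p)
    nabla_rad = flux / b_rad  # gradient required if radiation carried all the flux
    if nabla_rad <= nabla_ad:
        return nabla_rad      # convectively stable: purely radiative transport

    # F_con = a_con * (nabla - nabla_ad)**1.5 (Eq. 32 with ell = alpha_ML * H_p)
    ell = alpha_ml * h_p
    a_con = rho * c_p * temp * ell**2 * math.sqrt(g) / h_p**1.5

    # The total flux is too small at nabla_ad and too large at nabla_rad.
    lo, hi = nabla_ad, nabla_rad
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        total = a_con * (mid - nabla_ad) ** 1.5 + b_rad * mid
        if total < flux:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical near-surface conditions (cgs units, illustrative only).
state = dict(flux=6.3e10, rho=3.0e-7, temp=1.0e4, c_p=1.0e9, g=2.74e4,
             h_p=2.0e7, kappa=100.0, nabla_ad=0.4)

nabla_low_alpha = mlt_gradient(alpha_ml=1.0, **state)
nabla_high_alpha = mlt_gradient(alpha_ml=2.0, **state)
```

In this sketch the larger value of \(\alpha _{\mathrm{ML}}\) yields the smaller superadiabatic gradient, in line with the scaling of \(F_{\mathrm{con}}\) as \(\alpha _{\mathrm{ML}}^2\).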
In practice, further details are added. These involve a more complete thermodynamical description, the inclusion of factors of order unity in the relation for the average velocity and energy flux and expressions for the heat loss from the convective element. Although not of particular physical significance, the choice made for these aspects obviously affects the final expressions and must be taken into account in comparisons between different calculations, particularly when it comes to the value of \(\alpha _{\mathrm{ML}}\) required to calibrate the model. A detailed description of a commonly used formulation was provided by Böhm-Vitense (1958). It was pointed out by Gough and Weiss (1976) (see also Sect. 2.4) that solar models, with the appropriate calibration of the relevant convection parameters to obtain the proper radius, are largely insensitive to the details of the treatment of convection, although the specific values of \(\alpha _{\mathrm{ML}}\) may obviously differ. It is important to keep this in mind when comparing independent solar and stellar models. As an additional point I note that the preceding description is entirely local: it is assumed that \(F_{\mathrm{con}}\) is determined by conditions at a given point in the model, leading effectively to a relation of the form (9).
The motion of the convective elements also leads to transport of momentum which, when averaged, appears as a contribution to hydrostatic support in the form of a turbulent pressure of order
$$\begin{aligned} p_{\mathrm{t}}\sim \rho v^2 \sim {\rho \ell ^2 g \over H_p} (\nabla - \nabla _{\mathrm{ad}}) \; . \end{aligned}$$
(35)
Correspondingly, hydrostatic equilibrium, Eq. (1), is expressed in terms of \(p = p_{\mathrm{g}} + p_{\mathrm{t}}\), where \(p_{\mathrm{g}}\) is the thermodynamic pressure. On the other hand, the superadiabatic gradient \(\nabla - \nabla _{\mathrm{ad}}\) in Eqs. (32) and (35) is essentially a thermodynamic property and hence is determined by the gradient in \(p_{\mathrm{g}}\) or, if expressed in terms of p and \(p_{\mathrm{t}}\), the gradient of \(p_{\mathrm{t}}\). Consequently, including \(p_{\mathrm{t}}\) consistently in Eq. (1) increases the order of the system of differential equations within the convection zone, leading to severe numerical difficulties at the boundaries of the convection zone where the order changes (e.g., Stellingwerf 1976; Gough 1977b). A detailed analysis of the resulting singular points at the convection-zone boundaries was carried out by Gough (1977a). As a result, although the effect of the turbulent pressure on the hydrostatic structure has been included in some calculations based on a local treatment of convection (e.g., Henyey et al. 1965; Kosovichev 1995), \(\nabla - \nabla _{\mathrm{ad}}\) has generally been determined from the total pressure, thus avoiding the difficulties at the boundaries of the convection zone, but introducing some inconsistency (e.g. Baker and Gough 1979).
It is obvious that the local treatment of convection is an approximation, even in the simple physical picture employed here: a convective element senses conditions over a range of depths in the Sun during its motion; similarly, the convective flux at a given location must arise from an ensemble of convective elements originating at different depths. This indicates the need for a non-local description of convection, involving some averaging over the travel of a convective element and the elements contributing to the flux. Noting the similarity to the non-local nature of radiative transfer Spiegel (1963) proposed an approximation to this averaging akin to the Eddington approximation, leading to a set of local differential equations, albeit of higher order, to describe the convective properties (see also Gough 1977a). This was implemented by Balmforth and Gough (1991) and Balmforth (1992).Footnote 35 An advantage of the non-local formulation is that it bypasses the singularities caused by a consistent treatment of turbulent pressure in a local convection formulation; interestingly, Balmforth (1992) showed that the common inconsistent local treatment has a non-negligible effect on the properties of the model, compared with the local limit of the non-local treatment.
Alternative formulations for the convective properties have been developed on the basis of statistical descriptions of turbulence, thus including the full spectrum of convective eddies (e.g., Xiong 1977, 1989; Canuto and Mazzitelli 1991; Canuto et al. 1996) (for a more detailed discussion of such Reynolds stress models, see Houdek and Dupret 2015). Even so, the descriptions typically contain an adjustable parameter, most commonly related to a length scale, allowing the calibration of the surface radius of solar models.
A more physical description of convection is possible through numerical simulation (see Nordlund et al. 2009; Freytag et al. 2012). In practice this is restricted to fairly limited regions near the stellar surface, and even then requires simplified descriptions of the behaviour on scales smaller than the numerical grid.Footnote 36 Detailed modelling, including radiative effects in the stellar atmosphere, has been carried out by, for example, Stein and Nordlund (1989, 1998) and Wedemeyer et al. (2004). This also includes treatments of the equation of state and opacity which are consistent with global stellar models and hence immediately allow comparison with such models. Magic et al. (2013) and Trampedach et al. (2013) presented extensive grids of simulations for a range of stellar parameters, covering the main sequence and the lower part of the red-giant branch.
The simulations provide an alternative to the usual simplified stellar atmosphere models, which are assumed to be time independent and homogeneous in the horizontal direction. A very interesting aspect is that spectral line profiles calculated from the simulations and suitably averaged are in excellent agreement with observations, without the conventional ad hoc inclusion of additional line broadening through ‘microturbulence’ (e.g., Asplund et al. 2000). Also, the simulations provide a very good fit to the observed solar limb darkening, i.e., the variation across the solar disk of the intensity (Pereira et al. 2013).
The simulations of solar near-surface convection typically extend sufficiently deeply to cover that part of the convection zone where the temperature gradient is substantially superadiabatic (see Fig. 12). Thus they essentially define the specific entropy of the adiabatic part of the convection zone and hence fix the depth of the convection zone. Rosenthal et al. (1999) utilized this by extending an averaged simulation by means of a mixing-length envelope. Interestingly, they found that the resulting convection-zone depth was essentially consistent with the depth inferred from helioseismology (cf. Sect. 5.1.2), thus indicating that the simulation had successfully matched the actual solar adiabat.
As a generalization of these investigations, the simulations can be included in stellar modelling through grids of atmosphere models or suitable parameterization of simple formulations. A convenient procedure is to determine an effective mixing-length parameter \(\alpha _{\mathrm{ML}}(T_{\mathrm{eff}}, g)\) as a function of effective temperature and surface gravity, such as to reproduce the entropy of the adiabatic part of the convection zone (e.g., Ludwig et al. 1999, 2008; Trampedach et al. 1999, 2014a; Magic et al. 2015). It should be noted that since \(\alpha _{\mathrm{ML}}\) determines the entropy jump from the atmosphere to the interior of the convection zone, this calibration is intimately tied to the assumed atmospheric structure, e.g., specified by a \(T(\tau )\) relation also obtained from the simulations (Trampedach et al. 2014b). As an example, Fig. 11 shows the calibrated \(\alpha _{\mathrm{ML}}\) obtained by Trampedach et al. (2014a), as a function of \(T_{\mathrm{eff}}\) and \(\log g\). Interestingly, the variation of \(\alpha _{\mathrm{ML}}\) is modest in the central part of the diagram, along the evolution tracks of stars close to solar. Preliminary evolution calculations using these calibrations were carried out by Salaris and Cassisi (2015) and Mosumgaard et al. (2017, 2018). A similar analysis based on the calibration of the mixing-length parameter was carried out by Spada et al. (2018). As an alternative to using the fitted mixing length, Jørgensen et al. (2017) developed a method to include in stellar modelling the averaged structure of the near-surface layers obtained by interpolating in a grid of simulations. This was used by Jørgensen et al. (2018) to calculate a solar-evolution model incorporating such averaged structure in all models along the evolution track; similarly, Mosumgaard et al. (2020) calculated stellar evolution tracks for a range of masses, including the interpolated simulations along the evolution.
Apart from the calibration to match the solar radius (cf. Sect. 2.6) tests of the mixing-length parameter and its possible dependence on stellar properties can be carried out by comparing observations and models of red giants, whose effective temperature depends on the assumed \(\alpha _{\mathrm{ML}}\) (Salaris et al. 2002). A recent analysis was carried out by Tayar et al. (2017) based on APOGEE and Kepler observations, comparing with models computed with the YREC code (van Saders and Pinsonneault 2012). The model fits indicated a significant dependence on stellar metallicity, with \(\alpha _{\mathrm{ML}}\) increasing with increasing metallicity. Interestingly, calibrations based on 3D simulations (Magic et al. 2015) did not show this trend, nor did the results obtained by Tayar et al. match the values obtained by Trampedach et al. (2014a), shown in Fig. 11. However, it should be recalled that the effect of \(\alpha _{\mathrm{ML}}\) on stellar structure depends on other parameters in the mixing-length treatment, as well as on the assumed atmospheric structure and physics of the near-surface layers. Thus comparison of numerical values of \(\alpha _{\mathrm{ML}}\) or trends with, e.g., metallicity requires some care; the discrepancies may be caused by differences in other aspects of the modelling. In fact, in a detailed analysis Salaris et al. (2018), carefully taking into account the other uncertainties in the modelling of the near-surface layers, were unable to reproduce the results of Tayar et al. (2017); on the other hand, they did find some issues when \(\alpha \)-enhanced stars were included in the sample.
A comparison between different formulations of near-surface convection is provided in Fig. 12, in a format introduced by Gough and Weiss (1976). The complete solar models, corresponding to Model S of Christensen-Dalsgaard et al. (1996), have been calibrated to the same solar radius (cf. Sect. 2.6) through the adjustment of suitable parameters; this yields a depth of the convection zone which is essentially consistent with the helioseismically determined value. Evidently, regardless of the convection treatment the region of substantial superadiabatic gradient \(\nabla - \nabla _{\mathrm{ad}}\) is confined to the near-surface layers, as would also be predicted from the simple analysis given above (cf. Eq. 32). Using the Canuto and Mazzitelli (1991) formulation leads to a rather higher and sharper peak in the superadiabatic gradient than for the Böhm-Vitense (1958) mixing-length formulation. On the other hand, it is striking that the detailed behaviour of the averaged superadiabatic gradient resulting from the Trampedach et al. (2013) simulation is in reasonable agreement with the results of the calibrated mixing-length treatment. As already noted, it also appears to lead to the correct adiabat in the deeper parts of the convection zone.
Physically realistic simulations of near-surface convection have been carried out extending over 96 Mm in the horizontal direction, thus for the first time also including the scale of supergranules, and to a depth of 20 Mm, around 10% of the convection zone (Stein et al. 2006, 2009; Nordlund and Stein 2009).Footnote 37 Simulations have also been carried out which cover the bulk of the convection zone, but excluding the near-surface region: it is very difficult to include the very disparate range of temporal and spatial scales needed to cover the entire convection zone. Also, the microphysics of such simulations are typically somewhat simplified. On the other hand, the simulations take rotation into account, in an attempt to model the transport of angular momentum and hence the source of the surface differential rotation (cf. Eq. 11) and the variation of rotation within the convection zone (see also Sect. 5.1.4). A detailed review of these simulations was provided by Miesch (2005). As an example of their relation to global solar structure, Fig. 12 includes the average superadiabatic gradient from such a simulation, appropriately located relative to the global models. Apart from boundary effects the simulation is clearly in relatively good agreement with the simplified treatment, in particular confirming that this part of the model is very nearly isentropic.
An interesting issue was raised by Hanasoge et al. (2012) concerning the validity of the deeper convection simulations: based on local helioseismology (see Gizon and Birch 2005) using the time-distance technique they obtained estimates of the convective velocity one or two orders of magnitude lower than obtained in the simulations, or indeed predicted from the simple estimate in Eq. (32). This was questioned by Greer et al. (2015), who in an analysis using the ring-diagram technique obtained results similar to those of the simulations. However, Hanasoge et al. (2020) showed, using a helioseismic technique based on coupling of mode eigenfunctions, that large-scale turbulence in the Sun is strongly suppressed compared with the results of global numerical simulations. Thus there is increasing observational evidence for possible limitations in our understanding of the dynamics of convection in the Sun, in particular at larger scales, where there is essentially no observational evidence for structured flows, unlike what is seen in global simulations of the solar convection zone (for a review, see Miesch 2005). A review of the helioseismic inferences of solar convection was provided by Hanasoge et al. (2016). Simulations by Cossette and Rast (2016) indicated that supergranules might be the largest coherent scales of convection, with energy transport in the deeper, essentially adiabatically stratified, parts of the convection zone being dominated by colder compact downflowing plumes. For a recent short review on solar convection, see Rast (2020).
Calibration of solar models
The Sun is unique amongst stars in that we have accurate determinations of its mass, radius and luminosity and an independent and relatively precise measure of its age from age determinations of meteorites (see Sect. 2.2). It is obvious that solar models should satisfy these constraints, as well as other observed properties of the Sun, particularly the present ratio between the abundances of heavy elements and hydrogen. Ideally, the constraints would provide tests of the models; in practice, the modelling includes a priori three unknown parameters which must be adjusted to match the observed properties: the initial hydrogen and heavy-element abundances \(X_0\) and \(Z_0\) and a parameter characterizing the efficacy of convection (see Sect. 2.5). This adjustment constitutes the calibration of solar models.
Some useful understanding of the sensitivity of the models to the parameters can be obtained from simple homology arguments (e.g., Kippenhahn et al. 2012). According to these, the luminosity approximately scales with mass and composition as
$$\begin{aligned} L \propto Z^{-1} (1 + X)^{-1} M^{5.5} \mu ^{7.5}, \end{aligned}$$
(36)
assuming Kramers opacity, with \(\kappa \propto Z (1 + X) \rho T^{-3.5}\), and with \(\mu \) given by Eq. (13). Obviously, the strong sensitivity to the average mean molecular weight means that relatively modest changes in the helium abundance can lead to the correct luminosity.
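The sensitivity expressed by Eq. (36) can be made concrete with a small numerical sketch. It assumes that \(\mu \) takes the standard form for a fully ionized gas, \(\mu = 4/(3 + 5X - Z)\), which I take to be what Eq. (13) expresses; the function names and the composition values are hypothetical and serve only as an illustration of the homology scaling, not as a prediction of a full model.

```python
def mean_molecular_weight(x, z):
    """Mean molecular weight of a fully ionized gas (assumed form of Eq. 13)."""
    return 4.0 / (3.0 + 5.0 * x - z)


def luminosity_ratio(x_new, z_new, x_ref, z_ref, mass_ratio=1.0):
    """L_new / L_ref from the homology scaling of Eq. (36)."""
    mu_ratio = (mean_molecular_weight(x_new, z_new)
                / mean_molecular_weight(x_ref, z_ref))
    return ((z_ref / z_new) * (1.0 + x_ref) / (1.0 + x_new)
            * mass_ratio**5.5 * mu_ratio**7.5)


# Lower X by 0.01 at fixed Z and mass, i.e. raise the helium abundance by 0.01.
ratio = luminosity_ratio(0.70, 0.02, 0.71, 0.02)
```

In this scaling an increase of 0.01 in the helium abundance raises the luminosity by several per cent, almost entirely through the factor \(\mu ^{7.5}\).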
As discussed above, the efficacy of convection in the near-surface layers determines the specific entropy in the adiabatic part of the convection zone and hence the structure of the convection zone, thus controlling its extent and hence the radius of the model. (When the composition is fixed by obtaining the correct luminosity the extent of the radiative interior is largely determined.) With increasing efficacy the superadiabatic temperature gradient \(\nabla - \nabla _{\mathrm{ad}}\) required to transport the flux is decreased; hence the temperature in the convection zone is generally lower, the density (at given pressure) therefore higher, and the mass of the convection zone occupies a smaller volume, and hence a smaller extent in radius. Thus the radius of the model decreases with increasing efficacy. The actual reaction of the model is substantially more complex but leads to the same qualitative result.
As discussed in Sect. 2.5, the treatment of convection and hence the properties of the superadiabatic temperature gradient are typically obtained from the mixing-length treatment. According to Eqs. (32) and (34), assuming that \(F_{\mathrm{con}}\) carries most of the flux and is therefore essentially fixed, an increase in \(\alpha _{\mathrm{ML}}\) causes an increase in the convective efficacy and hence a decrease in \(\nabla - \nabla _{\mathrm{ad}}\), corresponding, according to the above argument, to a decrease in the model radius. Thus by adjusting \(\alpha _{\mathrm{ML}}\) a model with the correct radius can be obtained. In other simplified convection treatments, such as that of Canuto and Mazzitelli (1991), a similar efficiency parameter is typically introduced to allow radius calibration. When \(\alpha _{\mathrm{ML}}\) is obtained through fitting to 3D simulations (cf. Fig. 11) there is no a priori guarantee that this yields the value required to obtain the correct solar radius. In this case a correction factor can be applied to achieve the proper solar calibration (Mosumgaard et al. 2017). Of course, if the simulations provide a good representation of the outermost layers of the Sun, as already found to be the case by Rosenthal et al. (1999), this factor would be close to one, as has indeed been found in practice. The same correction factor is then applied when the fit to the 3D simulations is used for more general stellar modelling.
The details of the calibration depend on whether or not diffusion and settling are included. If these effects are ignored the surface composition of the model hardly changes between the zero-age main sequence and the present age of the Sun. Although the present surface abundance \(X_{\mathrm{s}}\) of hydrogen is affected by the calibration of \(X_0\) the range of variation is typically so small that it can be ignored, and the (constant, in space and time) value of the heavy-element abundance, and hence \(Z_0\), is fixed from \(Z_{\mathrm{s}}/X_{\mathrm{s}}\) and some suitable characteristic value of X. On the other hand, if diffusion and settling are included the change in the convection-zone composition must be taken into account and the value of \(Z_0\) must be adjusted to match properly \(Z_{\mathrm{s}}/X_{\mathrm{s}}\).
The formal calibration problem is then, when including diffusion and settling, to determine the set of parameters \(\{p_i\} = \{X_0, Z_0, \alpha _{\mathrm{ML}}\}\) to match the observables \(\{o_k\} = \{L_{\mathrm{s}}, Z_{\mathrm{s}}/X_{\mathrm{s}}, R\}\) to the solar values \(\{o_k^\odot \} = \{L_{\mathrm{s, \odot }}, (Z_{\mathrm{s}}/X_{\mathrm{s}})_\odot , R_{\odot }\}\). (Specifically, R is here taken to be the photospheric radius, defined at the point in the model where \(T = T_{\mathrm{eff}}\), the effective temperature.) This is greatly simplified by the fact that variations in the parameters generally are fairly limited. Thus in practice the corrections \(\{\delta p_i\}\) to the parameters can be found from the errors in the observables, using a fixed set of derivatives, as
$$\begin{aligned} \delta p_i = \sum _k (o_k^\odot -o_k) {\partial p_i \over \partial o_k}, \end{aligned}$$
(37)
where the derivatives \(\{\partial p_i / \partial o_k\}\) are obtained by varying the parameters in turn and inverting the resulting derivative matrix \(\{\partial o_k / \partial p_i\}\). I have found that the following values secure relatively rapid convergence of the iteration:
$$\begin{aligned} \begin{array}{ccc} \displaystyle {\partial \ln \alpha _{\mathrm{ML}}\over \partial \ln L_{\mathrm{s}}} = 1.15 &{} \displaystyle {\partial \ln \alpha _{\mathrm{ML}}\over \partial \ln R} = -4.70 &{} \displaystyle {\partial \ln \alpha _{\mathrm{ML}}\over \partial \ln (Z_{\mathrm{s}}/X_{\mathrm{s}})} = 0.148 \\ \displaystyle {\partial \ln X_0 \over \partial \ln L_{\mathrm{s}}} = -0.137 &{} \displaystyle {\partial \ln X_0 \over \partial \ln R} = -0.087 &{} \displaystyle {\partial \ln X_0 \over \partial \ln (Z_{\mathrm{s}}/X_{\mathrm{s}})} = -0.132 \\ \displaystyle {\partial \ln Z_0 \over \partial \ln L_{\mathrm{s}}} = -0.111 &{} \displaystyle {\partial \ln Z_0 \over \partial \ln R} = 0.275 &{} \displaystyle {\partial \ln Z_0 \over \partial \ln (Z_{\mathrm{s}}/X_{\mathrm{s}})} = 0.864. \end{array} \end{aligned}$$
(38)
These derivatives are incorporated in the ASTEC code (Christensen-Dalsgaard 2008) and allow efficient and automatic calculation of calibrated solar models. In the case where no iteration for \(Z_0\) is carried out the following values have been used:
$$\begin{aligned} \begin{array}{cc} \displaystyle {\partial \ln \alpha _{\mathrm{ML}}\over \partial \ln L_{\mathrm{s}}} = 1.17 &{} \displaystyle {\partial \ln \alpha _{\mathrm{ML}}\over \partial \ln R} = -4.75 \\ \displaystyle {\partial \ln X_0 \over \partial \ln L_{\mathrm{s}}} = -0.154 &{} \displaystyle {\partial \ln X_0 \over \partial \ln R} = -0.045. \end{array} \end{aligned}$$
(39)
Convergence to a relative precision of \(10^{-7}\) is typically obtained in 5–7 iterations.
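The iteration defined by Eq. (37) with the fixed derivatives of Eq. (38) can be sketched as follows. This is a minimal illustration and not the ASTEC implementation: it assumes that the corrections are applied in logarithmic form, consistent with the logarithmic derivatives in Eq. (38), and all function names are hypothetical. In particular, `run_model` stands for a full evolution calculation that returns \((L_{\mathrm{s}}, R, Z_{\mathrm{s}}/X_{\mathrm{s}})\), in the column order of Eq. (38), for given \((\alpha _{\mathrm{ML}}, X_0, Z_0)\).

```python
import math

# Logarithmic derivatives d ln p_i / d ln o_k quoted in Eq. (38).
# Rows: alpha_ML, X_0, Z_0; columns: L_s, R, Z_s/X_s.
DERIV = (
    (1.15, -4.70, 0.148),
    (-0.137, -0.087, -0.132),
    (-0.111, 0.275, 0.864),
)


def corrected_parameters(params, observables, targets, deriv=DERIV):
    """One application of Eq. (37), written in logarithmic form (an assumption):
    delta ln p_i = sum_k ln(o_k_sun / o_k) * d ln p_i / d ln o_k."""
    errors = [math.log(t / o) for t, o in zip(targets, observables)]
    return tuple(
        p * math.exp(sum(d * e for d, e in zip(row, errors)))
        for p, row in zip(params, deriv)
    )


def calibrate(run_model, params, targets, tol=1.0e-7, max_iter=50, deriv=DERIV):
    """Iterate with fixed derivatives until all observables match the targets
    to a relative precision tol. Returns the parameters and the iteration count.

    run_model(params) is a hypothetical stand-in for a full model calculation.
    """
    for iteration in range(max_iter + 1):
        observables = run_model(params)
        if max(abs(o / t - 1.0) for o, t in zip(observables, targets)) < tol:
            return params, iteration
        params = corrected_parameters(params, observables, targets, deriv)
    raise RuntimeError("calibration did not converge")


# Single correction step: a trial model whose luminosity is 1 per cent too low
# (observables given in units of the solar values, so the targets are unity).
trial = (1.8, 0.7, 0.02)  # hypothetical trial (alpha_ML, X_0, Z_0)
updated = corrected_parameters(trial, (0.99, 1.0, 1.0), (1.0, 1.0, 1.0))
```

Because the derivatives are held fixed, the scheme is a quasi-Newton iteration: it converges as long as the stored derivatives remain a reasonable approximation to those of the actual models, which the text above indicates is the case in practice.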