Solar structure and evolution

Abstract

The Sun provides a critical benchmark for the general study of stellar structure and evolution. Also, knowledge about the internal properties of the Sun is important for the understanding of solar atmospheric phenomena, including the solar magnetic cycle. Here I provide a brief overview of the theory of stellar structure and evolution, including the physical processes and parameters that are involved. This is followed by a discussion of solar evolution, extending from the birth to the latest stages. As a background for the interpretation of observations related to the solar interior I provide a rather extensive analysis of the sensitivity of solar models to the assumptions underlying their calculation. I then discuss the detailed information about the solar interior that has become available through helioseismic investigations and the detection of solar neutrinos, with further constraints provided by the observed abundances of the lightest elements. Revisions in the determination of the solar surface abundances have led to increased discrepancies, discussed in some detail, between the observational inferences and solar models. I finally briefly address the relation of the Sun to other similar stars and the prospects for asteroseismic investigations of stellar structure and evolution.

Introduction

The study of stellar properties and stellar evolution plays a central role in astrophysics. Observations of stars determine the chemical composition, age and distance of the varied components of the Milky Way Galaxy and hence form the basis for studies of Galactic evolution. Stellar abundances and their evolution, particularly for lithium, are also a crucial component of the study of Big-Bang nucleosynthesis. Understanding of the pulsational properties of Cepheids underlies their use as distance indicators and hence the basic unit of distance measurement in the Universe. The detailed properties of supernovae are important for the study of element nucleosynthesis, while supernovae of Type Ia are crucial for determining the large-scale properties of the Universe, including the evidence for a dominant component of ‘dark energy’. In all these cases an accurate understanding, and modelling, of stellar interiors and their evolution is required for reliable results.

Modelling stellar evolution depends on a detailed treatment of the physics of stellar interiors. Insofar as the star is regarded as nearly spherically symmetric the basic equations of stellar equilibrium are relatively straightforward (see Sect. 2.1), but the detailed properties, often referred to as microphysics, of matter in a star are extremely complex, yet of major importance to the modelling. This includes the thermodynamical properties, as specified by the equation of state, the interaction between matter and radiation described by the opacity, the nuclear processes generating energy and causing the evolution of the element composition, and the diffusion and settling of elements. Equally important are potential hydrodynamical processes caused by various instabilities which may contribute to the transport of energy and material, hence causing partial or full mixing of given regions in a star. It is obvious that sufficiently detailed observations of stellar properties, and comparison with models, may provide a possibility for testing the physics used in the model calculation, hence allowing investigations of physical processes far beyond the conditions that can be reached in a terrestrial laboratory.

Amongst stars, the Sun obviously plays a very special role, both to our daily life and as an astrophysical object. Its proximity allows very precise, and probably accurate, determination of its global parameters, as well as extremely detailed investigations of phenomena in the solar atmosphere, compared with other stars. Indications are that the Sun is typical for its mass and age (e.g., Gustafsson 1998; Robles et al. 2008; Strugarek et al. 2017),Footnote 1 although a detailed analysis by Reinhold et al. (2020) of photometric variability observed with the Kepler spacecraft indicated that solar magnetic activity may be rather low compared with solar-like stars. Also, conditions in the solar interior are relatively benign, providing some hope that reasonably realistic modelling can be carried out. Thus it is an ideal case for investigations of stellar structure and evolution. Interestingly, there still remain very significant discrepancies between the observed properties of the Sun and solar models.

A good overview of the development of the study of stellar structure and evolution was provided by Tassoul and Tassoul (2004). Also, Shaviv (2009) gave an excellent wide-ranging and deep description of the evolution of the field, including an extensive discussion of the relevant observational basis, the underlying physics, and related aspects, such as the early tension between estimates of the age of the Earth and the Sun. The application of physics to the understanding of stellar interiors developed from the middle of the nineteenth century. The first derivation of stellar models based on mechanical equilibrium was carried out by Lane (1870).Footnote 2 Further development of the theory of such models, summarized in an extensive bibliographical note by Chandrasekhar (1939), was carried out by Ritter, Lord Kelvin and others, culminating in the monograph by Emden (1907). These models were based on the condition of hydrostatic equilibrium, combined with a simplified, so-called polytropic, equation of state. Major advances came with the application of the theory of radiative transfer, and quantum-mechanical calculations of atomic absorption coefficients, to the energy transport in stellar interiors. This allowed theoretical estimates to be made of the relation between the stellar mass and luminosity, even without detailed knowledge about the stellar energy sources (for a masterly discussion of these developments, see Eddington 1926). Further investigations of the properties of stellar opacity led to the conclusion that stellar matter was dominated by hydrogen (Strömgren 1932, 1933), in agreement with the detailed determination of the composition of the solar photosphere by Russell (1929), as well as with the analysis of a broad range of stars by Unsöld (1931). Although stellar modelling had proceeded up to this point without any definite information about the sources of stellar energy, this issue was evidently of very great interest. 
As early as 1920, Eddington (1920) and others noted that the fusion of hydrogen into helium might produce the required energy over the solar lifetime, but a mechanism making the fusion possible, given the strong Coulomb repulsion between the nuclei, was lacking. This mechanism was provided by Gamow’s development of the treatment of quantum-mechanical barrier penetration between reacting nuclei, resulting in the identification of the dominant reactions in hydrogen fusion through the PP chains and the CNO cycle (cf. Sect. 2.3.3) by von Weizsäcker (1937, 1938), Bethe and Critchfield (1938) and Bethe (1939). With this, the major ingredients required for the modelling of the solar interior and evolution had been established.

An important aspect of solar structure is the presence of an outer convection zone. Following the introduction by Schwarzschild (1906) of the criterion for convective instability in stellar atmospheres, Unsöld (1930) noted that such instability would be expected in the lower photosphere of the Sun. As a very important result, Biermann (1932) noted that the temperature gradient resulting from the consequent convective energy transport would in general be close to adiabatic; as a result, the structure of the convection zone depends little on the details of the convective energy transport. Also, he found that the resulting convective region in the Sun extended to very substantial depths, reaching a temperature of \(10^7\,\,\mathrm{K}\). Further calculations by, for example, Biermann (1942) and Rudkjøbing (1942), taking into account more detailed models of the solar atmosphere, generally confirmed these results. In an interesting short paper Strömgren (1950) summarized these early results. He noted that the presence of \({}^7\mathrm{Li}\) in the solar atmosphere clearly showed that convective mixing could extend at most to a temperature of \(3.5 \times 10^6 \,\mathrm{K}\),Footnote 3 beyond which lithium would be destroyed by nuclear reactions. He also pointed out that a revision of determinations of the composition of the solar atmosphere, relative to the one assumed by Biermann, had reduced the heavy-element abundance and that this would reduce the temperature at the base of the convection zone to the acceptable value of \(2.5 \times 10^6 \,\mathrm{K}\). Although these models are highly simplified, the use of the lithium abundance as a constraint on the extent of convective mixing, and the effect of a composition adjustment on the convection-zone depth, remain highly relevant, as discussed below.

Specific computations of solar models must satisfy the known observational constraints for the Sun, namely that solar radius and luminosity be reached at solar age, for a \(1 M_\odot \) model. As discussed by Schwarzschild et al. (1957) this can be achieved by adjusting the composition and the characteristics of the convection zone. They noted that no independent determination of the initial hydrogen and helium abundances \(X_0\) and \(Y_0\) was possible and consequently determined models for specified initial values of the hydrogen abundance. The convection zone was assumed to have an adiabatic stratification and to consist of fully ionized material, such that it was characterized by the adiabatic constant K in the relation \(p = K \rho ^{5/3}\) between pressure p and density \(\rho \). Given \(X_0\) the values of \(Y_0\) and K were then determined to obtain a model with the correct luminosity and radius. Although since substantially refined, this remains the basic principle for the calibration of solar models (see Sect. 2.6). A detailed discussion of the calibration of the properties of the convection zone was provided by Gough and Weiss (1976).

Given the calibration, the observed mass, radius and luminosity clearly provide no test of the validity of the solar model. An important potential for testing solar models became evident with the realization (Fowler 1958) that nuclear reactions in the solar core produce huge numbers of neutrinos which in principle may be measured, given a suitable detector (Davis 1964; Bahcall 1964). The first results of a large-scale experiment (Davis et al. 1968) surprisingly showed an upper limit to the neutrino flux substantially below the predictions of the then current solar models. Further experiments using a variety of techniques, and additional computations, did not eliminate this discrepancy, with the predictions exceeding the measurements by a factor of 2–3, until the beginning of the present millennium.

An independent way of testing solar models, with potentially much higher selectivity, became available with the detection of solar oscillations (see Christensen-Dalsgaard 2004, for further details on the history of the field). Oscillations with periods near 5 min were discovered by Leighton et al. (1962). Their character as standing acoustic waves was proposed independently by Ulrich (1970) and Leibacher and Stein (1971), leading also to the expectation that their frequencies could be used to probe the outer parts of the Sun. This identification was confirmed observationally by Deubner (1975), whose data clearly showed the modal character of the oscillations. The observed modes had short horizontal wavelength and extended only a few per cent into the Sun. Indications of global oscillations in the solar diameter were presented by Hill et al. (1976), immediately suggesting that detailed information about the whole solar interior could be obtained from analysis of their frequencies (e.g., Christensen-Dalsgaard and Gough 1976). Although Hill’s data have not been confirmed by later studies, they served as important inspirations for such studies, now known as helioseismology.Footnote 4

Early analyses of the short-wavelength five-minute oscillations (Gough 1977c; Ulrich and Rhodes 1977) showed that the solar convection zone was substantially deeper than in the models of the epoch. A major breakthrough was the detection of global five-minute oscillations by Claverie et al. (1979) and Grec et al. (1980) and the subsequent identification of modes in the five-minute band over a broad range of horizontal wavelengths (Duvall and Harvey 1983). Observations of these modes have formed the basis for the dramatic development of helioseismology over the last three decades. With the increasing precision and detail of the observed oscillation frequencies, increasing sophistication was applied to solar modelling, generally leading to improved agreement between models and observations. Important examples were the realization that the opacity of the solar interior should be increased to match the inferred sound-speed profile (Christensen-Dalsgaard et al. 1985), that sophisticated equations of state were required to match the observed frequencies (Christensen-Dalsgaard et al. 1988), and that the inclusion of diffusion and settling substantially improved the agreement between the models and the Sun (Christensen-Dalsgaard et al. 1993). Remarkably, these developments in the model physics, motivated by, but not directly fitted to, the steadily improving observations, led to models in good overall agreement with the inferred solar structure (e.g., Christensen-Dalsgaard et al. 1996; Gough et al. 1996; Bahcall et al. 1997; Brun et al. 1998). However, the remaining discrepancies were highly significant and clearly required changes to the physics of the solar interior. Interestingly, later revisions of the measured solar surface abundance now result in rather larger discrepancies between models and observations, indicating that more basic modifications to the modelling may be required.

In the present review I provide an overview of these issues, covering both the modelling and the sensitivity of solar models to the physical assumptions and the inferences drawn from various observations and their interpretation. Section 2 presents the tools required to model the Sun and its evolution, including some emphasis on the underlying physical properties of solar matter. In Sect. 3 I present a brief overview of the evolution of a solar-mass star. A detailed discussion of the sensitivity of solar models to changes in the model parameters or physics is provided in Sect. 4, using as reference case the widely used so-called Model S (Christensen-Dalsgaard et al. 1996). Section 5 discusses the observations available to test our understanding of solar structure and evolution, i.e., helioseismology, solar neutrinos and the details of the solar surface composition; in discussing the helioseismic results a brief presentation of results on solar internal rotation is also provided. In Sect. 6 the serious issues raised by the revised determinations of the solar composition after 2000 are discussed in detail, including the revisions to solar modelling which have attempted to obtain agreement with the helioseismically inferred structure under the constraints of these revised abundances. Finally, Sect. 7 gives a very brief presentation of studies of other stars, including the place of the Sun in relation to solar-like stars, and Sect. 8 provides a few concluding remarks. In support of the numerical results provided here, the Appendix briefly addresses the important issue of the numerical accuracy of the computed models.

Modelling the Sun

Basics of stellar modelling

Stellar models are generally calculated under a number of simplifying approximations, of varying justification. In most cases rotation and other effects causing departures from spherical symmetry are neglected and hence the star is regarded as spherically symmetric. Also, with the exception of convection, hydrodynamical instabilities are neglected, while convection is treated in a highly simplified manner. The mass of the star is assumed to be constant, so that no significant mass loss is included. In contrast to these simplifications of the ‘macrophysics’ the microphysics is included with considerable, although certainly inadequate, detail. In recent calculations effects of diffusion and settling are typically included, at least in computations of solar models. The result of these approximations is what is often called a ‘standard solar model’, although still obviously depending on the assumptions made in the details of the calculation.Footnote 5 Even so, such models computed independently, with recent formulations of the microphysics, give rather similar results. In this paper I generally restrict the discussion to standard models, although discussing the effects of some of the generalizations. It might be noted that the present Sun is in fact one case where the standard assumptions may have some validity: at least the Sun rotates sufficiently slowly that direct dynamical effects of rotation are likely to be negligible. On the other hand, rotation was probably faster in the past and the loss and redistribution of angular momentum may well have led to instabilities and hence mixing affecting the present composition profile.

With the assumption of spherical symmetry the model is characterized by the distance r to the centre. Hydrostatic equilibrium requires a balance between the pressure gradient and gravity which may then be written as

$$\begin{aligned} {\mathrm{d}p \over \mathrm{d}r} = - {G m \rho \over r^2}, \end{aligned}$$
(1)

where p is pressure, \(\rho \) is density, m is the mass of the sphere contained within r, and G is the gravitational constant. Also, obviously,

$$\begin{aligned} {\mathrm{d}m \over \mathrm{d}r} = 4 \pi r^2 \rho . \end{aligned}$$
(2)

The energy equation relates the energy generation to the energy flow and the change in the internal energy of the gas:

$$\begin{aligned} {\mathrm{d}L \over \mathrm{d}r} = 4 \pi r^2 \left[ \rho \epsilon - \rho {\mathrm{d}\over \mathrm{d}t }\left( {e \over \rho }\right) + {p \over \rho }{\mathrm{d}\rho \over \mathrm{d}t }\right] ; \end{aligned}$$
(3)

here L is the energy flow through the surface of the sphere of radius r, \(\epsilon \) is the rate of nuclear energy generationFootnote 6 per unit mass and unit time, e is the internal energy per unit volume and t is time.Footnote 7 The gradient of temperature T is determined by the requirements of energy transport, from the central regions where nuclear reactions take place to the surface where the energy is radiated. The temperature gradient is conventionally written in terms of \(\nabla = \mathrm{d}\ln T / \mathrm{d}\ln p\) as

$$\begin{aligned} {\mathrm{d}T \over \mathrm{d}r} = \nabla {T \over p} {\mathrm{d}p \over \mathrm{d}r}. \end{aligned}$$
(4)

The form of \(\nabla \) depends on the mode of energy transport; for radiative transport in the diffusion approximation

$$\begin{aligned} \nabla = \nabla _{\mathrm{rad}}\equiv {3 \over 16 \pi a {\tilde{c}} G} {\kappa p \over T^4}{L(r) \over m (r)}, \end{aligned}$$
(5)

where \(\kappa \) is the opacity, a is the radiation energy density constant and \({\tilde{c}}\) is the speed of light. Finally, we need to consider the rate of change of the composition, which controls stellar evolution. In a main-sequence star such as the Sun the dominant effect is the burning of hydrogen; however, we must also take into account the changes in composition resulting from diffusion and settling. The rate of change of the abundance \(X_i\) by mass of element i is therefore given by

$$\begin{aligned} {\partial X_i \over \partial t} = {{\mathcal {R}}}_i + {1 \over r^2 \rho } {\partial \over \partial r} \left[ r^2 \rho \left( D_i {\partial X_i \over \partial r} + V_i X_i \right) \right] , \end{aligned}$$
(6)

where \({{\mathcal {R}}}_i\) is the rate of change resulting from nuclear reactions, \(D_i\) is the diffusion coefficient and \(V_i\) is the settling velocity.

To these basic equations we must add the treatment of the microphysics. This is discussed in Sect. 2.3 below.

I have so far ignored the convective instability. This sets in if the density decreases more slowly with position than for an adiabatic change, i.e.,

$$\begin{aligned} {\mathrm{d}\ln \rho \over \mathrm{d}\ln p} < {1 \over \varGamma _1}, \end{aligned}$$
(7)

where \(\varGamma _1 = (\partial \ln p / \partial \ln \rho )_{\mathrm{ad}}\), the derivative being taken for an adiabatic change. In stellar modelling this condition is often replaced by

$$\begin{aligned} {\mathrm{d}\ln T \over \mathrm{d}\ln p} \equiv \nabla > \nabla _{\mathrm{ad}} \equiv \left( \mathrm{d}\ln T \over \mathrm{d}\ln p \right) _{\mathrm{ad}}, \end{aligned}$$
(8)

which is equivalent in the case of a uniform composition.Footnote 8 Thus a layer is convectively unstable if the radiative gradient \(\nabla _{\mathrm{rad}}\) (cf. Eq. 5) exceeds \(\nabla _{\mathrm{ad}}\). In this case convective motion sets in, with hotter gas rising and cooler gas sinking, both contributing to the energy transport towards the surface. The structure of the convective flow should clearly be such that the combined radiative and convective energy transport at any point in the convection zone match the luminosity. The conditions in stellar interiors are such that complex, possibly turbulent, flows are expected over a broad range of scales (e.g., Schumacher and Sreenivasan 2020). Also, the convective flux at a given location obviously represents conditions over a range of positions in the star, sampled by a moving convective eddy, so that convective transport is intrinsically non-local. As a related issue, motion is inevitably induced outside the immediate unstable region, also potentially affecting the energy transport and structure, although this is often ignored. However, in computations of stellar evolution these complexities are almost always reduced to a grossly simplified local description which allows the computation of the average temperature gradient in terms of local conditions, as

$$\begin{aligned} \nabla = \nabla _{\mathrm{conv}}(\rho , T, L, \ldots ), \end{aligned}$$
(9)

applied in regions of convective instability (see Sect. 2.5).
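As an illustration of how Eqs. (5) and (8) are used together, the following minimal Python sketch evaluates the radiative gradient and applies the instability criterion. It is a sketch under stated assumptions, not the treatment used in any particular evolution code: the function names are hypothetical, the constants a and \({\tilde{c}}\) are standard cgs values that are not quoted in the text, the default \(\nabla _{\mathrm{ad}} = 0.4\) assumes a fully ionized ideal gas, and the convective gradient of Eq. (9) is replaced by the crude stand-in \(\nabla = \nabla _{\mathrm{ad}}\).

```python
import math

# cgs constants. A_RAD and C_LIGHT are standard values assumed here;
# G_GRAV is the CODATA 2014 value quoted in this paper.
A_RAD = 7.5657e-15        # radiation energy density constant a [erg cm^-3 K^-4]
C_LIGHT = 2.99792458e10   # speed of light [cm s^-1]
G_GRAV = 6.67408e-8       # gravitational constant [cm^3 g^-1 s^-2]


def nabla_rad(kappa, p, T, L, m):
    """Radiative temperature gradient of Eq. (5), all inputs in cgs units."""
    prefactor = 3.0 / (16.0 * math.pi * A_RAD * C_LIGHT * G_GRAV)
    return prefactor * kappa * p / T**4 * L / m


def temperature_gradient(kappa, p, T, L, m, nabla_ad=0.4):
    """Return (nabla, is_convective) at one point in the model.

    Assumption: in unstable layers the gradient is set equal to nabla_ad,
    a placeholder for a proper local convection treatment (Eq. 9).
    """
    n_rad = nabla_rad(kappa, p, T, L, m)
    if n_rad > nabla_ad:          # criterion of Eq. (8) with nabla = nabla_rad
        return nabla_ad, True
    return n_rad, False
```

Because \(\nabla _{\mathrm{rad}}\) is proportional to \(\kappa \), a higher opacity pushes a layer towards instability; this is the sense in which composition changes can shift the depth of the convection zone.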

The equations are supplemented by boundary conditions. The centre, which is a regular singular point, can be treated through a series expansion in r. For example, it follows from Eq. (2) for the mass and Eq. (1) of hydrostatic support that

$$\begin{aligned} m = {4 \over 3} \pi \rho _{\mathrm{c}} r^3 + \cdots , \quad p = p_{\mathrm{c}} - {2 \over 3} \pi G \rho _{\mathrm{c}}^2 r^2 + \cdots , \end{aligned}$$
(10)

where \(\rho _{\mathrm{c}}\) and \(p_{\mathrm{c}}\) are the central density and pressure. A discussion of the expansions to second significant order in r, and techniques for incorporating them in the central boundary conditions, was given by Christensen-Dalsgaard (1982). At the surface, the model must include the stellar atmosphere. Since this requires a more complex description of radiative transfer than provided by the diffusion approximation (Eq. 5), separately calculated detailed atmospheric models are often matched to the interior solution, thus effectively providing the surface boundary condition. Simpler alternatives are discussed in Sect. 2.4.
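To make the roles of Eqs. (1) and (2) and of the central expansion concrete, the following Python sketch integrates the two mechanical equations outwards from the centre. It is only a toy under explicit assumptions: the energy and transport equations are replaced by a polytropic relation \(p = K \rho ^\gamma \), a fixed-step Runge-Kutta shooting integration stands in for the finite-difference schemes of real stellar-evolution codes, and the function name and arguments are hypothetical.

```python
import math


def integrate_polytrope(rho_c, K, gamma, dr, G=6.67408e-8, max_steps=1_000_000):
    """Integrate Eqs. (1)-(2) outwards, assuming p = K * rho**gamma.

    Returns (surface radius, total mass) in the units implied by the inputs.
    """
    def rhs(r, m, p):
        # Invert the assumed polytropic relation; p is clamped at zero so that
        # trial stages overshooting the surface do not give a complex density.
        rho = (max(p, 0.0) / K) ** (1.0 / gamma)
        dm_dr = 4.0 * math.pi * r**2 * rho       # Eq. (2)
        dp_dr = -G * m * rho / r**2              # Eq. (1)
        return dm_dr, dp_dr

    # Start one step off-centre with the leading-order expansion. The pressure
    # term follows from inserting m = (4/3) pi rho_c r^3 into Eq. (1).
    r = dr
    m = 4.0 / 3.0 * math.pi * rho_c * r**3
    p = K * rho_c**gamma - 2.0 / 3.0 * math.pi * G * rho_c**2 * r**2

    for _ in range(max_steps):
        # Classical fourth-order Runge-Kutta step.
        k1m, k1p = rhs(r, m, p)
        k2m, k2p = rhs(r + 0.5 * dr, m + 0.5 * dr * k1m, p + 0.5 * dr * k1p)
        k3m, k3p = rhs(r + 0.5 * dr, m + 0.5 * dr * k2m, p + 0.5 * dr * k2p)
        k4m, k4p = rhs(r + dr, m + dr * k3m, p + dr * k3p)
        m_new = m + dr * (k1m + 2.0 * k2m + 2.0 * k3m + k4m) / 6.0
        p_new = p + dr * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        if p_new <= 0.0:
            # Surface reached: estimate where p crosses zero by linear interpolation.
            return r + dr * p / (p - p_new), m_new
        r, m, p = r + dr, m_new, p_new
    raise RuntimeError("surface not reached; increase max_steps or dr")


# Usage in arbitrary units (G = K = rho_c = 1) for the gamma = 2 polytrope,
# whose radius and mass are known analytically: sqrt(pi/2) and sqrt(2*pi).
radius, mass = integrate_polytrope(1.0, 1.0, 2.0, dr=1.0e-3, G=1.0)
```

The surface is located only to within about one step dr, so the step should be chosen small compared with the expected radius.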

The equations and boundary conditions are most often solved using finite-difference methods, by what in the stellar-evolution community is known as the Henyey technique (e.g., Henyey et al. 1959, 1964).Footnote 9 This was discussed in some detail by Clayton (1968) and Kippenhahn et al. (2012). The presence of the time dependence, in the energy equation and the description of the composition evolution, is an additional complication. The detailed implementation in the Aarhus STellar Evolution Code (ASTEC), used in the following to compute examples of solar models, was discussed in some detail by Christensen-Dalsgaard (2008).

An important issue is the question of numerical accuracy, in the sense of providing an accurate solution to the problem, given the assumptions about micro- and macrophysics. It is evident that the accuracy must be substantially higher than the effects of, for example, those potential errors in the physics which are investigated through comparisons between the models and observations. Ab initio analyses of the computational errors are unlikely to be useful, given the complexity of the equations. As discussed in the Appendix, computations with differing spatial and temporal resolution provide estimates of the intrinsic precision of the calculation. Additional tests, which may also uncover errors in programming, are provided by comparisons between independently computed models, with carefully controlled identical physics (e.g., Gabriel 1991; Christensen-Dalsgaard and Reiter 1995; Lebreton et al. 2008; Monteiro 2008).

Basic properties of the Sun

The Sun is unique amongst stars in that its global parameters can be determined with high precision. From planetary motion the product \(G M_\odot \) of the gravitational constant and the solar mass is known with very high accuracy, as \(1.32712438 \times 10^{26} \,\mathrm{cm}^3 \,\mathrm{s}^{-2}\). Even though G is the least precisely determined of the fundamental constants this still allows the solar mass to be determined with a precision far exceeding the precision of the determination of other stellar masses. The 2014 recommendations of CODATAFootnote 10 (Mohr et al. 2016) give a value \(G = (6.67408 \pm 0.00031) \times 10^{-8}\,\,\mathrm{cm}^3\,\,\mathrm{g}^{-1}\,\,\mathrm{s}^{-2}\), corresponding to \(\,M_\odot = 1.98848 \times 10^{33} \,\mathrm{g}\). However, the solar mass has traditionally been taken to be \(\,M_\odot = 1.989 \times 10^{33} \,\mathrm{g}\), corresponding to \(G = 6.672320 \times 10^{-8}\,\,\mathrm{cm}^3\,\,\mathrm{g}^{-1}\,\,\mathrm{s}^{-2}\); in the calculations reported in the present paper I use the latter values of \(\,M_\odot \) and G, even though these are not entirely consistent with the CODATA 2014 recommendations. I note that Christensen-Dalsgaard et al. (2005) found that variations to G and \(\,M_\odot \), keeping their product fixed, had very small effects on the resulting solar models.
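The arithmetic linking \(G M_\odot \), G and \(\,M_\odot \) can be checked in a few lines of Python. This is a simple sketch using only the numbers quoted above; the variable names are arbitrary, and the uncertainty estimate assumes that the error in \(G M_\odot \) is negligible compared with that in G.

```python
GM_SUN = 1.32712438e26          # G * M_sun [cm^3 s^-2], from planetary motion
G_CODATA_2014 = 6.67408e-8      # [cm^3 g^-1 s^-2]
G_CODATA_2014_ERR = 0.00031e-8  # quoted uncertainty in G
M_SUN_TRADITIONAL = 1.989e33    # traditional solar mass [g]

# Solar mass implied by the CODATA 2014 value of G.
m_sun_codata = GM_SUN / G_CODATA_2014

# With G*M fixed, the relative error in M equals the relative error in G
# (assumption: the error in G*M itself is negligible).
m_sun_err = m_sun_codata * G_CODATA_2014_ERR / G_CODATA_2014

# Value of G implied by the traditional solar mass.
g_traditional = GM_SUN / M_SUN_TRADITIONAL

print(f"M_sun (CODATA 2014 G) = {m_sun_codata:.5e} g +/- {m_sun_err:.1e} g")
print(f"G (traditional M_sun) = {g_traditional:.6e} cm^3 g^-1 s^-2")
```

The relative uncertainty in G, and hence in the solar mass, is about \(5 \times 10^{-5}\), whereas the two adopted mass values differ by roughly \(3 \times 10^{-4}\).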

The angular diameter of the Sun can be determined with very substantial precision, although the level in the solar atmosphere to which the value refers obviously has to be carefully specified. From such measurements, and the known mean distance between the Earth and the Sun, the solar photospheric radius, referring to the point where the temperature equals the effective temperature, has been determined as \((6.95508 \pm 0.00026) \times 10^{10} \,\mathrm{cm}\) by Brown and Christensen-Dalsgaard (1998); this was adopted by Cox (2000). Haberreiter et al. (2008) obtained the value \((6.95658 \pm 0.00014) \times 10^{10} \,\mathrm{cm}\), which within errors is consistent with the value of Brown and Christensen-Dalsgaard (1998). However, most solar modelling has used the older value \(\,R_\odot = 6.9599 \times 10^{10} \,\mathrm{cm}\) (Auwers 1891), as quoted, for example, by Allen (1973); thus, for most of the models presented here I use this value.

From bolometric measurements of the solar ‘constant’ from space the total solar luminosity can be determined, given the Sun-Earth distance, if it is assumed that the solar flux is independent of latitude; although no evidence has been found to question this assumption, it is perhaps of some concern that measurements of the solar irradiance have only been made close to the ecliptic plane. An additional complication is provided by the variation in solar irradiance with phase in the solar cycle of around 0.1%, peak to peak (for a review, see Fröhlich and Lean 2004); since the cause of this variation is uncertain it is difficult to estimate the appropriate luminosity corresponding to equilibrium conditions. The value \(\,L_\odot = 3.846 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\) (obtained from the average irradiance quoted by Willson 1997) has often been used and will generally be applied here. However, Kopp et al. (2016) more recently obtained a revised irradiance, as an average over solar cycle 23, leading to \(\,L_\odot = 3.828 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\).

The solar radius and luminosity are often used as units in characterizing other stars, although with some uncertainty about the precise values that are used. In 2015 this led to Resolution B3 of the International Astronomical UnionFootnote 11 (see Mamajek et al. 2015; Prša et al. 2016), defining the nominal solar radius \({{\mathcal {R}}}_\odot ^N = 6.957 \times 10^8 \,\mathrm{m}\), suitably rounded from the value obtained by Haberreiter et al. (2008), and the nominal solar luminosity \({{\mathcal {L}}}_\odot ^N = 3.828 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\) from Kopp et al. (2016).
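The step from a measured irradiance to a luminosity, and from luminosity and radius to an effective temperature, can be sketched as follows. The inputs not quoted in the text are assumptions: the nominal total solar irradiance of 1361 W m\(^{-2}\) from the same IAU resolution, the IAU 2012 value of the astronomical unit, and the CODATA 2014 Stefan-Boltzmann constant. The function names are hypothetical, and the sketch assumes an isotropically emitting Sun.

```python
import math

AU_M = 1.495978707e11        # astronomical unit [m] (IAU 2012, assumed here)
TSI_NOMINAL = 1361.0         # nominal total solar irradiance [W m^-2] (assumed)
R_SUN_NOMINAL_M = 6.957e8    # nominal solar radius [m], as quoted in the text
SIGMA_SB = 5.670367e-8       # Stefan-Boltzmann constant [W m^-2 K^-4] (assumed)


def luminosity_from_irradiance(irradiance, distance_m):
    """Total luminosity [W], assuming the flux is independent of direction."""
    return 4.0 * math.pi * distance_m**2 * irradiance


def effective_temperature(luminosity_w, radius_m):
    """Effective temperature [K] defined by L = 4 pi R^2 sigma T_eff^4."""
    return (luminosity_w / (4.0 * math.pi * radius_m**2 * SIGMA_SB)) ** 0.25


lum_w = luminosity_from_irradiance(TSI_NOMINAL, AU_M)
lum_cgs = lum_w * 1.0e7      # 1 W = 1e7 erg s^-1
t_eff = effective_temperature(lum_w, R_SUN_NOMINAL_M)
print(f"L = {lum_cgs:.4e} erg/s, T_eff = {t_eff:.0f} K")
```

With these inputs the luminosity rounds to the nominal value of \(3.828 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\) and the effective temperature comes out close to 5772 K.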

The solar age \(t_\odot \) can be estimated from radioactive dating of meteorites combined with a model of the evolution of the solar system, relating the formation of the meteorites to the arrival of the Sun on the main sequence. Detailed discussions of meteoritic dating were provided by Wasserburg, in Bahcall and Pinsonneault (1995), and by Connelly et al. (2012). Wasserburg found \(t_\odot = (4.570 \pm 0.006) \times 10^9\) years, with very similar although more accurate values obtained by Connelly et al. Uncertainties in the modelling of the early solar system obviously affect how this relates to solar age. For simplicity, in the following I simply identify this age with the time since the arrival of the Sun on the main sequence.Footnote 12 Despite the remaining uncertainty this still provides an independent measure of a stellar age of far better accuracy than is available for any other star.

The solar surface abundance can be determined from spectroscopic analysis (for reviews, see Asplund 2005; Asplund et al. 2009). Additional information about the primordial composition of the solar system, and hence likely the Sun, is obtained from analysis of meteorites. A major difficulty is the lack of a reliable determination from spectroscopy of the solar helium abundance. Lines of helium, an element then not known from the laboratory, were first detected in the solar spectrum;Footnote 13 however, these lines are formed under rather uncertain, and very complex, conditions in the upper solar atmosphere, making an accurate abundance determination from the observed line strengths infeasible; the same is true of other noble gases, with neon being a particularly important example. For those elements with lines formed in deeper parts of the atmosphere the spectroscopic analysis yields reasonably precise abundance determinations (e.g., Allende Prieto 2016); however, given that the helium abundance is unknown these are only relative, typically specified as a fraction of the hydrogen abundance. Detailed analyses were provided by Anders and Grevesse (1989) and Grevesse and Noels (1993), the latter leading to a commonly used present ratio \(Z_{\mathrm{s}}/X_{\mathrm{s}}= 0.0245\) between the surface abundances \(X_{\mathrm{s}}\) and \(Z_{\mathrm{s}}\) by mass of hydrogen and elements heavier than helium, respectively. Also, for most refractory elements there is good agreement between the solar abundances and those inferred from primitive meteorites. A striking exception is the abundance of lithium which has been reduced in the solar photosphere by a factor of around 150, relative to the meteoritic abundance (Asplund et al. 2009). 
This is presumably the result of lithium destruction by nuclear reaction, which would take place to the observed extent over the solar lifetime at a temperature of around \(2.5 \times 10^6 \,\mathrm{K}\), indicating that matter currently at the solar surface has been mixed down to this temperature. On the other hand, the abundance of beryllium, which would be destroyed at temperatures above around \(3.5 \times 10^6 \,\mathrm{K}\), has apparently not been significantly reduced relative to the primordial value (Balachandran and Bell 1998; Asplund 2004), so that significant mixing has not reached this temperature. These abundance determinations obviously provide interesting constraints on mixing processes in the solar interior during solar evolution (see Sect. 5.3).

Since 2000 major revisions of solar abundance determinations have been carried out, through the use of three-dimensional (3D) hydrodynamical simulations of the solar atmosphere (Nordlund et al. 2009, see also Sect. 2.5). This resulted in a substantial decrease in the inferred abundances of, in particular, oxygen, carbon and nitrogen (for a summary, see Asplund et al. 2009), resulting in \(Z_{\mathrm{s}}/X_{\mathrm{s}}= 0.0181\). The resulting decrease in the opacity in the radiative interior has substantial consequences for solar models and their comparison with helioseismic results; I return to this in Sect. 6.

Observations of the solar surface show that the Sun is rotating differentially, with an angular velocity that is highest at the equator. This was evident already quite early from measurements of the apparent motion of sunspots across the solar disk (Carrington 1863), and has been observed also in the Doppler velocity of the solar atmosphere. In an analysis of an extended series of Doppler measurements, Ulrich et al. (1988) obtained the surface angular velocity \(\varOmega \) as

$$\begin{aligned} {\varOmega \over 2 \pi } = (451.5 - 65.3 \cos ^2 \theta - 66.7 \cos ^4 \theta ) \, \mathrm{nHz} \end{aligned}$$
(11)

as a function of co-latitude \(\theta \), corresponding to rotation periods of 25.6 d at the equator and 31.7 d at a latitude of \(60^\circ \).
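As a quick check of the quoted rotation periods, the following minimal sketch evaluates the fit with a leading coefficient of 451.5 nHz, which is the value that reproduces periods of 25.6 d at the equator and 31.7 d at latitude \(60^\circ \). The function names are illustrative choices, not part of any published code.

```python
import math

NHZ_TO_HZ = 1.0e-9
SECONDS_PER_DAY = 86400.0

def rotation_frequency_nhz(colatitude_deg, a=451.5, b=-65.3, c=-66.7):
    """Surface rotation frequency Omega/(2 pi) in nHz at a given co-latitude.

    The default coefficients are those of the fit quoted in the text.
    """
    cos2 = math.cos(math.radians(colatitude_deg)) ** 2
    return a + b * cos2 + c * cos2 ** 2

def rotation_period_days(latitude_deg):
    """Rotation period in days at a given latitude (co-latitude = 90 deg - latitude)."""
    freq_hz = rotation_frequency_nhz(90.0 - latitude_deg) * NHZ_TO_HZ
    return 1.0 / freq_hz / SECONDS_PER_DAY

for latitude in (0.0, 30.0, 60.0):
    print(f"latitude {latitude:4.0f} deg: period {rotation_period_days(latitude):.1f} d")
```

The equator rotates fastest, so the period increases monotonically towards the poles.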

As discussed in Sect. 5.1, helioseismology has provided very detailed information about the properties of the solar interior. Here I note that the depth of the solar convection zone has been determined as 0.287R, with errors as small as 0.001R (e.g., Christensen-Dalsgaard et al. 1991; Basu and Antia 1997). Also, the effect of helium ionization on the sound speed in the outer parts of the solar convection zone allows a determination of the solar envelope helium abundance \(Y_{\mathrm{s}}\), although with some sensitivity to the equation of state; the results are close to \(Y_{\mathrm{s}} = 0.25\) (e.g., Vorontsov et al. 1991; Basu 1998).

Microphysics

Within the framework of ‘standard solar models’ most of the complexity in the calculation lies in the determination of the microphysics, and hence very considerable effort has gone into calculations of the relevant physics. In comparing the resulting models with observations, particularly helioseismic inferences, to test the validity of these physical results one must, however, obviously keep in mind potential errors in the approximations defining the standard models.

In this section I provide a relatively brief discussion of the various formulations that have been used for the physics. To illustrate some of the effects, comparisons are made based on the structure of the present Sun, discussed in more detail in Sect. 4 below. A detailed discussion of the physics of stellar interiors was provided by Cox and Giuli (1968) and updated by Weiss et al. (2004); for a concise review of the treatment of the equation of state and opacity, see Däppen and Guzik (2000).

Equation of state

The thermodynamic properties of stellar matter, defined by the equation of state, play a crucial role in stellar modelling. This directly involves the relation between pressure, density, temperature and composition. In addition, the adiabatic compressibility \(\varGamma _1\) affects the adiabatic sound speed (cf. Eq. 55) and hence the oscillation frequencies of the star, whereas other thermodynamic derivatives are important in the treatment of convective energy transport.

The treatment of the equation of state involves the determination of all relevant thermodynamic quantities, for example defined as functions of \((\rho , T,\{X_i\})\), where \(X_i\) are the abundances of the relevant elements; the composition is often characterized by the abundances X, Y and Z by mass of hydrogen, helium and heavier elements with, obviously, \(X + Y + Z = 1\). This should take into account the interaction between the different constituents of the gas, including partial ionization. Also, pressure and internal energy from radiation must be included, although they play a comparatively minor role in the Sun. An important constraint on the treatment is that it be thermodynamically consistent such that all thermodynamic relations are satisfied between the computed quantities (e.g., Däppen 1993). Thus it would not, for example, be consistent to add the contribution of Coulomb effects to pressure and internal energy without making corresponding corrections to other quantities, including the thermodynamical potentials that control the ionization.

A particular problem concerns ionization in the solar core. As pointed out by, e.g., Christensen-Dalsgaard and Däppen (1992), straightforward application of the Saha equation would predict a substantial degree of recombination of hydrogen at the centre of the Sun, yet the volume available to each hydrogen nucleus does not allow this. In fact, ionization must be largely controlled by interactions between the constituents of the gas, not included in the Saha equation, and often somewhat misleadingly denoted pressure ionization. These effects are taken into account in formulations of the equation of state at various levels of detail, generally showing that ionization is almost complete in the solar core. The simplest approach, which is certainly not thermodynamically consistent, is to enforce full ionization above a certain density or pressure.

A simple approximation to the solar equation of state is that of a fully ionized ideal gas, according to which

$$\begin{aligned} p \simeq {k_{\mathrm{B}}\rho T \over \mu m_{\mathrm{u}}}, \quad \nabla _{\mathrm{ad}}\simeq 2/5, \quad \varGamma _1 \simeq 5/3; \end{aligned}$$
(12)

here \(k_{\mathrm{B}}\) is Boltzmann’s constant, \(m_{\mathrm{u}}\) is the atomic mass unit and \(\mu \) is the mean molecular weight which can be approximated by

$$\begin{aligned} \mu = {4 \over 3 + 5 X - Z}. \end{aligned}$$
(13)

However, departures from this simple relation must obviously be taken into account in solar modelling. The most important of these is partial ionization, particularly relatively near the surface where hydrogen and helium ionize. Figure 1 shows the fractional ionization in a model of the present Sun. As discussed in Sect. 5.1.2 the effect of the ionization of helium on \(\varGamma _1\) provides a strong diagnostic of the solar envelope helium abundance.
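To give a feeling for the numbers in the fully ionized ideal-gas approximation of Eqs. (12) and (13), the sketch below evaluates the mean molecular weight and the pressure. The composition and central conditions used (\(X = 0.70\), \(Z = 0.02\) for the envelope; \(X \simeq 0.34\), \(\rho \simeq 150 \,\mathrm{g\,cm^{-3}}\), \(T \simeq 1.57 \times 10^7 \,\mathrm{K}\) for the centre) are round-number assumptions typical of standard solar models, not values taken from the text.

```python
K_B = 1.380649e-16     # Boltzmann constant, erg/K
M_U = 1.66053907e-24   # atomic mass unit, g

def mean_molecular_weight(X, Z):
    """Mean molecular weight of a fully ionized gas, Eq. (13)."""
    return 4.0 / (3.0 + 5.0 * X - Z)

def ideal_gas_pressure(rho, T, X, Z):
    """Pressure (dyn/cm^2) of a fully ionized ideal gas, Eq. (12)."""
    return K_B * rho * T / (mean_molecular_weight(X, Z) * M_U)

# Assumed, illustrative values (not from the text).
mu_envelope = mean_molecular_weight(0.70, 0.02)
mu_centre = mean_molecular_weight(0.34, 0.02)
p_centre = ideal_gas_pressure(150.0, 1.57e7, 0.34, 0.02)

print(f"mu (envelope) = {mu_envelope:.3f}")
print(f"mu (centre)   = {mu_centre:.3f}")
print(f"p  (centre)   = {p_centre:.2e} dyn/cm^2")
```

The limiting cases provide a simple sanity check: pure ionized hydrogen gives \(\mu = 1/2\) and pure ionized helium gives \(\mu = 4/3\).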

Fig. 1

Fractional ionization in a model of the present Sun (Model S; see Sect. 4.1), as a function of the logarithm of the temperature (in K; bottom) and of fractional radius (top). The ionization was calculated with the CEFF equation of state (see below). The solid curve shows the fraction of ionized hydrogen, the dashed and dot-dashed curves the fraction of singly and fully ionized helium, respectively, and the dotted curve shows the average degree of ionization of the heavy elements

Other effects are smaller but highly significant, particularly given the high precision with which the solar interior can be probed with helioseismology. Radiation pressure, \(p_{\mathrm{rad}} = 1/3 a T^4\), and other effects of radiation are small but not entirely negligible. Coulomb interactions between particles in the gas need to be taken into account; a measure of their importance is given by

$$\begin{aligned} \varGamma _{\mathrm{e}} = {e^2 \over d_{\mathrm{e}} k_{\mathrm{B}}T }, \quad \hbox {with} \quad d_{\mathrm{e}} = \left( {3 \over 4 \pi n_{\mathrm{e}}} \right) ^{1/3}, \end{aligned}$$
(14)

which determines the ratio between the average Coulomb and thermal energy of an electron; here e is the charge of an electron, and \(d_{\mathrm{e}}\) is the average distance between the electrons, \(n_{\mathrm{e}}\) being the electron density per unit volume. Also, in the core effects of partial electron degeneracy must be included; the importance of degeneracy is measured by

$$\begin{aligned} \zeta _{\mathrm{e}} = \lambda _{\mathrm{e}}^3 n_{\mathrm{e}} = {4 \over \sqrt{\pi }} F_{1/2} (\psi ) \simeq 2 e^\psi , \end{aligned}$$
(15)

where

$$\begin{aligned} \lambda _{\mathrm{e}} = {h \over (2 \pi m_{\mathrm{e}} k_{\mathrm{B}}T)^{1/2}} \end{aligned}$$
(16)

is the de Broglie wavelength of an electron, h being Planck’s constant and \(m_{\mathrm{e}}\) the mass of an electron. In Eq. (15) \(\psi \) is the electron degeneracy parameter and \(F_\nu (\psi )\) is the Fermi integral,

$$\begin{aligned} F_\nu (y) = \int _0^\infty {x^\nu \over 1 + \exp (x-y)} \mathrm{d}x. \end{aligned}$$
(17)
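As a numerical illustration of Eqs. (15) and (17), the following sketch evaluates the Fermi integral with the occupation factor written as \(1/[1 + \exp (x - \psi )]\), the convention for which \(\zeta _{\mathrm{e}} \simeq 2 e^\psi \) holds at weak degeneracy, and compares \(\zeta _{\mathrm{e}}\) with that limit. The quadrature (Simpson's rule after the substitution \(x = t^2\)) and all function names are choices made here for illustration only.

```python
import math

def fermi_integral(nu, psi, t_max=8.0, n=2000):
    """F_nu(psi) = int_0^inf x^nu / (1 + exp(x - psi)) dx by Simpson's rule.

    The substitution x = t**2 removes the square-root behaviour at x = 0
    for half-integer nu; n must be even. The upper limit t_max = 8
    corresponds to x = 64, beyond which the integrand is negligible
    for the weakly degenerate cases considered here.
    """
    def integrand(t):
        return 2.0 * t ** (2.0 * nu + 1.0) / (1.0 + math.exp(t * t - psi))

    h = t_max / n
    total = integrand(0.0) + integrand(t_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0

def zeta_e(psi):
    """Degeneracy measure zeta_e = (4 / sqrt(pi)) F_{1/2}(psi), Eq. (15)."""
    return 4.0 / math.sqrt(math.pi) * fermi_integral(0.5, psi)

def ratio_to_weak_limit(psi):
    """zeta_e divided by its weak-degeneracy limit 2 exp(psi)."""
    return zeta_e(psi) / (2.0 * math.exp(psi))

for psi in (-6.0, -3.0, -1.0):
    print(f"psi = {psi:5.1f}: zeta_e / (2 e^psi) = {ratio_to_weak_limit(psi):.4f}")
```

The ratio approaches unity as \(\psi \) becomes more negative and falls below unity as degeneracy sets in.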

The last approximation in Eq. (15) is valid for small degeneracy, \(\psi \ll -1\); in this case the correction to the electron pressure \(p_{\mathrm{e}}\), relative to the value for an ideal non-degenerate electron gas, is

$$\begin{aligned} {p_{\mathrm{e}} \over n_{\mathrm{e}} k_{\mathrm{B}}T} -1 \simeq 2^{-5/2} e^\psi \simeq 2^{-7/2} \zeta _{\mathrm{e}} \end{aligned}$$
(18)

(see also Chandrasekhar 1939). Finally, the mean thermal energy of an electron is not negligible compared with the rest-mass energy of the electron near the solar centre, so relativistic effects should be taken into account; their importance is measured by

$$\begin{aligned} x_{\mathrm{e}} = {k_{\mathrm{B}}T \over m_{\mathrm{e}} {\tilde{c}}^2}; \end{aligned}$$
(19)

at the centre of the present Sun \(x_{\mathrm{e}} \simeq 0.0026\). As an important example, the relativistic effects cause a change

$$\begin{aligned} {\delta \varGamma _1 \over \varGamma _1} \simeq - {2 + 2X \over 3 +5 X} x_{\mathrm{e}} \end{aligned}$$
(20)

in \(\varGamma _1\), which is readily detectable from helioseismic analyses (Elliott and Kosovichev 1998).
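The measures of non-ideal effects in Eqs. (14), (15), (19) and (20) are easily evaluated for given conditions. The sketch below does so for assumed central values (\(\rho \simeq 150 \,\mathrm{g\,cm^{-3}}\), \(T \simeq 1.57 \times 10^7 \,\mathrm{K}\), \(X \simeq 0.34\)), which are round numbers typical of standard solar models rather than values from the text; full ionization is assumed in computing the electron density, and CGS units are used throughout.

```python
import math

# Physical constants in CGS units.
E_CHARGE = 4.80320471e-10   # electron charge, esu
K_B = 1.380649e-16          # Boltzmann constant, erg/K
H_PLANCK = 6.62607015e-27   # Planck constant, erg s
M_E = 9.1093837e-28         # electron mass, g
M_U = 1.66053907e-24        # atomic mass unit, g
C_LIGHT = 2.99792458e10     # speed of light, cm/s

def electron_density(rho, X):
    """Electron number density (cm^-3) of a fully ionized gas."""
    return rho * (1.0 + X) / (2.0 * M_U)

def coulomb_parameter(n_e, T):
    """Gamma_e of Eq. (14): Coulomb energy over thermal energy."""
    d_e = (3.0 / (4.0 * math.pi * n_e)) ** (1.0 / 3.0)
    return E_CHARGE ** 2 / (d_e * K_B * T)

def degeneracy_parameter(n_e, T):
    """zeta_e = lambda_e^3 n_e of Eqs. (15) and (16)."""
    lam = H_PLANCK / math.sqrt(2.0 * math.pi * M_E * K_B * T)
    return lam ** 3 * n_e

def relativistic_parameter(T):
    """x_e of Eq. (19): thermal energy over electron rest-mass energy."""
    return K_B * T / (M_E * C_LIGHT ** 2)

def gamma1_relativistic_shift(X, x_e):
    """Relative change in Gamma_1 from Eq. (20)."""
    return -(2.0 + 2.0 * X) / (3.0 + 5.0 * X) * x_e

# Assumed, illustrative central conditions (not from the text).
rho_c, T_c, X_c = 150.0, 1.57e7, 0.34
n_e = electron_density(rho_c, X_c)
gamma_e = coulomb_parameter(n_e, T_c)
zeta = degeneracy_parameter(n_e, T_c)
x_e = relativistic_parameter(T_c)

print(f"Gamma_e = {gamma_e:.3f}")
print(f"zeta_e  = {zeta:.2f}")
print(f"x_e     = {x_e:.4f}")
print(f"dGamma_1/Gamma_1 = {gamma1_relativistic_shift(X_c, x_e):.2e}")
```

With these inputs \(x_{\mathrm{e}}\) comes out close to the value of 0.0026 quoted above, and the relativistic shift in \(\varGamma _1\) is of order \(10^{-3}\).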

The magnitudes of these departures from a simple ideal gas are summarized in Fig. 2, for a standard solar model. Given the precision of helioseismic inferences, none of the effects can be ignored. Coulomb effects are relatively substantial throughout the model, although peaking near the surface. Inclusion of these effects, in the so-called MHD equation of state (see below), was shown by Christensen-Dalsgaard et al. (1988) to lead to a substantial improvement in the agreement between the observed and computed frequencies. Electron degeneracy has a significant effect in the core of the model while, as already noted, relativistic effects for the electrons have been detected in helioseismic inversion (Elliott and Kosovichev 1998).

Fig. 2

Measures of non-ideal effects in the equation of state in a model of the present Sun (Model S; see Sect. 4.1), as a function of fractional radius (top panel) and temperature (bottom panel). The solid line shows \(\varGamma _{\mathrm{e}}\) (cf. Eq. 14) which measures the importance of Coulomb effects. The short-dashed line shows \(\zeta _{\mathrm{e}}\) (cf. Eq. 15) which measures effects of electron degeneracy. (Note that in \(\varGamma _{\mathrm{e}}\) and \(\zeta _{\mathrm{e}}\) the electron number density was obtained with the CEFF equation of state; see below.) The long-dashed line shows \(x_{\mathrm{e}}\) (cf. Eq. 19), the ratio between the thermal energy and rest-mass energy of electrons. Finally, the double-dot-dashed line shows \(p_{\mathrm{rad}}/p\), the ratio between radiation and total pressure

The computation of the equation of state has been reviewed by Däppen (1993, 2004, 2007, 2010), Christensen-Dalsgaard and Däppen (1992), and Baturin et al. (2013). Extensive discussions of issues related to the equation of state in astrophysical systems were provided by Čelebonović et al. (2004). The procedures can be divided into what has been called the chemical picture and the physical picture. In the former, the gas is treated as a mixture of different components (molecules, atoms, ions, nuclei and electrons) each contributing to the thermodynamical quantities. Approximations to the contributions from these components are used to determine the free energy of the system, and the equilibrium state is determined by minimizing the free energy at given temperature and density, say, under the relevant stoichiometric constraints. The level of complexity and, one may hope, realism of the formulation depends on the treatment of the different contributions to the free energy. In the physical picture, the basic constituents are taken to be nuclei and electrons, and the state of the gas, including the formation of ions and atoms, derives from the interaction between these constituents. In practice, this is dealt with in terms of activity expansions (Rogers 1981), the level of complexity depending on the number of terms included.

A simple form of the chemical picture is the so-called EFF equation of state (Eggleton et al. 1973). This treats ionization with the basic Saha equation, although adding a contribution to the free energy which ensures full ionization at high electron densities. Partial degeneracy and relativistic effects are covered with an approximate expansion. Because of its simplicity it can be included directly in a stellar evolution code and hence it has found fairly widespread use; however, it is certainly not sufficiently accurate to be used for computation of realistic solar models. An extension of this treatment, the CEFF equation of state including in addition Coulomb effects treated in the Debye–Hückel approximation, was introduced by Christensen-Dalsgaard and Däppen (1992). A comprehensive equation of state based on the chemical treatment has been provided in the so-called MHDFootnote 14 equation of state (Mihalas et al. 1988, 1990; Däppen et al. 1988; Nayfonov et al. 1999). This includes a probabilistic treatment of the occupation of states in atoms and ions (Hummer and Mihalas 1988), based on the perturbations caused by surrounding neutral and charged constituents of the gas, and including excluded-volume effects. Also, Coulomb effects and effects of partial degeneracy are taken into account. The MHD treatment and other physically realistic equations of state are too complex (so far) to be included directly into stellar evolution codes. Instead, they are used to set up tables which are then interpolated to obtain the quantities required in the evolution calculation. Thus both the table properties and the interpolation procedures become important for the accuracy of the representation of the physics. Issues of interpolation were addressed by Baturin et al. (2019).

The physical treatment of the equation of state, for realistic stellar mixtures, has been developed by the OPAL group at the Lawrence Livermore National Laboratory, in what they call the ACTEX equation of state (for ACTivity EXpansion), in connection with the calculation of opacities. For this purpose it has obviously been necessary to extend the treatment to include also a determination of atomic energy levels and their perturbations from the surrounding medium. The result is often referred to as the OPAL tables. Extensive tables, in the following OPAL 1996, were initially provided by Rogers et al. (1996), with later updates presented by Rogers and Nayfonov (2002).

Interestingly, relativistic effects were ignored in the original formulations of both the MHD and the OPAL tables, while they were included, in approximate form, in the simple formulation of Eggleton et al. (1973). Following the realization by Elliott and Kosovichev (1998), based on helioseismology, that this was inadequate, updated tables taking these effects into account have been produced by Gong et al. (2001b) and Rogers and Nayfonov (2002). The latter tables, with additional updates, are known as the OPAL 2005 equation-of-state tablesFootnote 15 and are seeing widespread use.

To illustrate the effects of using the different formulations, Figs. 3 and 4Footnote 16 show relative differences in p and \(\varGamma _1\) for various equations of state at the conditions in a model of the present Sun, using the OPAL 1996 equation of state as reference. It is clear that the inclusion of Coulomb effects in CEFF captures a substantial part of the inadequacies of the simple EFF formulation, although the remaining differences are certainly very significant. In the bottom panel of Fig. 4 it should be noticed that the MHD and OPAL 1996 formulations share the lack of proper treatment of relativistic effects and hence have very similar behaviour of \(\varGamma _1\) at the highest temperatures. This is corrected in both CEFF and OPAL 2005 which therefore show very similar departures from OPAL 1996 at high temperature. A detailed comparison between the MHD and OPAL formulations was carried out by Trampedach et al. (2006).

Fig. 3

Comparison of equations of state at fixed \((\rho , T)\) and composition corresponding to the structure of the present Sun (specifically Model S of Christensen-Dalsgaard et al. 1996), in the sense (modified equation of state)–(model), plotted against the logarithm of the temperature in the model; the model used the original (OPAL 1996; Rogers et al. 1996) equation of state. The top panel shows the difference in pressure and the bottom panel the difference in \(\varGamma _1\). Solid lines show the EFF equation of state (Eggleton et al. 1973), and dashed lines the CEFF equation of state (Christensen-Dalsgaard and Däppen 1992). For the comparison the same relative composition of the heavy elements was chosen for the EFF and CEFF calculations as in the OPAL tables

Fig. 4

As Fig. 3, but showing CEFF (black solid lines), the MHD equation of state (Mihalas et al. 1990, red dashed lines), the OPAL 2005 equation of state (Rogers and Nayfonov 2002, green dot-dashed lines), and the SAHA-S equation of state (Gryaznov et al. 2004, blue long-dashed lines). Note that the relative composition of the heavy elements may differ between the different implementations

Further developments of the MHD equation of state have been undertaken to emulate aspects of the OPAL equation of state in a flexible manner, allowing the calculation of extensive consistent and physically more realistic tables (Liang 2004; Däppen and Mao 2009), or developing a similar emulation in the simpler CEFF equation of state, which might enable bypassing the table calculations (Lin and Däppen 2010). A comprehensive update of the MHD equation of state is being prepared by R. Trampedach. The implementation of these developments in solar and stellar model calculations will be very interesting.

An independent development of an equation of state in the chemical picture has been carried out in the so-called SAHA-S formulation (Gryaznov et al. 2004; Baturin et al. 2013, 2017).Footnote 17 Results for this equation of state are shown in Fig. 4 with the blue long-dashed curve. Apart from a rather stronger variation in \(\varGamma _1\) in the atmosphere due to the wide variety of molecular species included, the SAHA-S formulation is clearly quite similar to OPAL 2005. Also, Alan W. Irwin has developed the FreeEOS formulation,Footnote 18 based on free-energy minimization (see Cassisi et al. 2003a), which allows rapid calculation of an equation of state that closely matches the OPAL equation of state.

Opacity

In stellar interiors, the diffusion approximation for radiative transfer, implied by Eq. (5), is adequate, and the opacity is determined as the Rosseland mean opacity,

$$\begin{aligned} \kappa ^{-1} \equiv \kappa _{\mathrm{R}}^{-1} = {\pi \over a {\tilde{c}}T^3} \int _0^\infty \kappa _\nu ^{-1} {\mathrm{d}B_\nu \over \mathrm{d}T} \mathrm{d}\nu \end{aligned}$$
(21)

(Rosseland 1924), where \(\kappa _\nu \) is the monochromatic opacity at (radiation) frequency \(\nu \) and \(B_\nu \) is the Planck function. The computation of stellar opacities is generally so complicated that opacities have to be obtained in stellar modelling through interpolation in tables. The computation of the tables includes contributions of transitions between the different levels of the atoms and ions in the gas, including as far as possible the effects of level perturbations; an extensive review of opacity calculations was provided by Pain et al. (2017). The thermodynamic state of the gas, including the degrees of ionization and the distribution amongst the levels, is an important ingredient in the calculation; indeed, both the MHD and the OPAL equations of state were developed as bases for new opacity calculations. Within the convection zone, solar structure is essentially independent of opacity, since the temperature gradient is nearly adiabatic. Below the convection zone the opacity is dominated by heavy elements; hence it is sensitive not only to the total heavy-element abundance Z but also to the relative distribution of the individual elements. This is illustrated in Fig. 5 showing the sensitivity of the opacity to variations in the dominant contributions to the heavy elements. Evidently iron is an important contribution to the opacity, particularly in the solar core, but other elements such as oxygen, neon and silicon also play major roles. Modelling the solar atmosphere requires low-temperature opacities, including effects of molecules; in the calculation of the structure of calibrated solar models the resulting uncertainties are largely suppressed by changes in the treatment of convection (cf. Fig. 28).
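The character of the Rosseland mean in Eq. (21) as a harmonic mean, weighted by \(\mathrm{d}B_\nu /\mathrm{d}T\), can be illustrated with a toy calculation. In the sketch below the frequency is expressed through \(u = h \nu /(k_{\mathrm{B}}T)\), so that the weight is proportional to \(u^4 e^u/(e^u - 1)^2\) and the prefactor in Eq. (21) reduces to a normalization. The monochromatic opacities used (a grey opacity and a constant background with a single Gaussian line) are purely hypothetical and are not taken from any opacity table.

```python
import math

def rosseland_weight(u):
    """Shape of dB_nu/dT versus u = h nu / (k_B T); constant factors cancel."""
    if u <= 0.0:
        return 0.0
    return u ** 4 * math.exp(-u) / math.expm1(-u) ** 2

def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def rosseland_mean(kappa, u_max=40.0):
    """Harmonic mean of kappa(u) weighted by dB_nu/dT, as in Eq. (21)."""
    norm = simpson(rosseland_weight, 0.0, u_max)
    inverse = simpson(lambda u: rosseland_weight(u) / kappa(u), 0.0, u_max)
    return norm / inverse

def arithmetic_mean(kappa, u_max=40.0):
    """Arithmetic mean with the same weight, for comparison only."""
    norm = simpson(rosseland_weight, 0.0, u_max)
    return simpson(lambda u: rosseland_weight(u) * kappa(u), 0.0, u_max) / norm

def grey(u):
    """Hypothetical frequency-independent opacity (arbitrary units)."""
    return 2.5

def with_line(u):
    """Hypothetical background of 1 plus a strong Gaussian line at u = 4."""
    return 1.0 + 100.0 * math.exp(-((u - 4.0) / 0.3) ** 2)

print("grey:     ", rosseland_mean(grey))
print("with line:", rosseland_mean(with_line), arithmetic_mean(with_line))
```

For a grey opacity the mean returns the input value. For the line example the Rosseland mean is raised far less than the arithmetic mean, since radiation is transported preferentially in the frequency windows where the opacity is low.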

Fig. 5

Courtesy of H. M. Antia

Logarithmic derivatives of the opacity with respect to contributions to the total heavy-element abundance of the different elements indicated, evaluated for OPAL opacities (Iglesias and Rogers 1996) in the radiative part of a standard solar model. The vertical dotted line marks the temperature at the base of the convection zone in the present Sun.

Early models used for helioseismic analysis generally used the Cox and Stewart (1970) and Cox and Tabor (1976) tables. An early inference of the solar internal sound speed (Christensen-Dalsgaard et al. 1985) showed that the solar sound speed was higher below the convection zone than the sound speed of a model using the Cox and Tabor (1976) tables, prompting the suggestion that the opacity had to be increased by around 20% at temperatures higher than \(2 \times 10^6 \,\mathrm{K}\). This followed an earlier plea by Simon (1982) for a reexamination of the opacity calculations in connection with problems in the interpretation of double-mode Cepheids and in the understanding of the excitation of oscillations in \(\beta \) Cephei stars; it was subsequently demonstrated by Andreasen and Petersen (1988) that agreement between observed and computed period ratios for double-mode \(\delta \) Scuti stars and Cepheids could be obtained by a substantial opacity increase, by a factor of 2.7, in the range \(\log T = 5.2{-}5.9\).

These results motivated a reanalysis of the opacities by the Livermore group, who pointed out (Iglesias et al. 1987) that the contribution from line absorption in metals had been seriously underestimated in earlier opacity calculations. This work resulted in the OPAL tables (e.g., Iglesias and Rogers 1991; Iglesias et al. 1992; Rogers and Iglesias 1992, 1994, in the following OPAL92). Owing to the inclusion of numerous transitions in iron-group elements and a better treatment of the level perturbations and associated line broadening these new calculations did indeed show very substantial opacity increases, qualitatively matching the requirements from the helioseismic sound-speed inference; also, this led largely to agreement with evolution models of the period ratios for RR Lyrae and Cepheid double-mode pulsators (e.g., Cox 1991; Moskalik et al. 1992; Kanbur and Simon 1994) and to opacity-driven instability in the \(\beta \) Cephei models (e.g., Cox et al. 1992; Kiriakidis et al. 1992; Moskalik and Dziembowski 1992). These results are excellent examples of stellar pulsations, and in particular helioseismology, providing input to the understanding of basic physical processes.

The OPAL tables, with further developments (e.g., Iglesias and Rogers 1996, in the following OPAL96),Footnote 19 have seen widespread use in solar and stellar modelling. In parallel with the OPAL calculations, independent calculations were carried out within the Opacity Project (OP) (Seaton et al. 1994), with results in good agreement with those of OPAL96 at relatively low density and temperature, although larger discrepancies were found under conditions relevant to the solar radiative interior (Iglesias and Rogers 1995). More recent updates to the OP opacities, in the following OP05, have decreased these discrepancies substantially, to a level of 5–10% (Seaton and Badnell 2004; Badnell et al. 2005).Footnote 20 A recent effort is under way at the CEA, France, resulting in the so-called OPAS tablesFootnote 21 (Blancard et al. 2012; Mondet et al. 2015). Also, the Los Alamos group has updated their calculations, in the OPLIB tables (Colgan et al. 2016).Footnote 22 A review of these recent opacity results was provided by Turck-Chièze et al. (2016), while Fig. 6 shows a comparison of the opacity values in a model of the present Sun.

Fig. 6

Figure courtesy of Aldo Serenelli

Comparison of the OPAL, OPLIB and OPAS opacities (see text) relative to the OP opacities. The dashed curves are for the Grevesse and Sauval (1998) composition, while the solid curves are for the Asplund et al. (2009) composition (see also Sect. 6.1). From Villante, Serenelli and Vinyoles (in preparation).

The opacity tables discussed so far typically include few or no molecular lines. Thus the opacity at low temperature (often taken to be below \(10^4 \,\mathrm{K}\)) must be obtained from separate tables, suitably matched to the opacity at higher temperature. Tables provided by Kurucz (1991) and Alexander and Ferguson (1994) have often been used. A set of tables with a more complete equation of state and improved treatment of grains was provided by Ferguson et al. (2005).

I note that the potential uncertainties in the opacity calculations have gained renewed interest in connection with the apparent discrepancies between helioseismic inferences and solar models computed with revised inferences of solar surface composition. I return to this in Sect. 6.4.

Energy generation

The basic energy generation in the Sun takes place through hydrogen fusion to helium which may be schematically written as

$$\begin{aligned} 4 {{}^{1}\mathrm{H}}\rightarrow {{}^{4}\mathrm{He}}+ 2 \mathrm{e}^+ + 2 \nu _{\mathrm{e}}. \end{aligned}$$
(22)

Here the emission of the two positrons results from the required conversion of two protons to neutrons, as also implied by conservation of charge in the process, and the two electron neutrinos ensure conservation of lepton number. Evidently the positrons are immediately annihilated by two electrons, resulting in further release of energy. Thus the net reaction can formally be regarded as the fusion of four hydrogen atoms into a helium atom; this is convenient from the point of view of calculating the energy release based on tables of atomic masses. The result is that each reaction in Eq. (22) generates 26.73 MeV. However, the neutrinos have a negligible probability for interaction with matter in the Sun, and hence the energy contributed to the neutrinos must be subtracted to obtain the energy generation rate \(\epsilon \) actually available to the Sun. Thus \(\epsilon \) depends on the energy of the emitted neutrinos and hence on the details of the reactions resulting in the net reaction in Eq. (22). As discussed in Sect. 5.2 detection of the emitted neutrinos provides a crucial confirmation of the presence of nuclear reactions in the solar core and a probe of the properties of the neutrinos.

The detailed properties of nuclear reactions in stellar interiors have been discussed by, for example, Clayton (1968). Reactions require tunneling through the potential barrier resulting from the Coulomb repulsion between the two nuclei. Thus to a first approximation reactions between more highly charged nuclei are expected to have a lower probability. Also, the temperature dependence of the reactions depends strongly on the charges of the reacting nuclei. The dependence on temperature of the reaction rate \(r_{12}\) between two nuclei 1 and 2 is often approximated as \(r_{12} \propto T^n\), where

$$\begin{aligned} n = {\eta -2 \over 3}, \quad \eta = 42.487 ({{\mathcal {Z}}}_1^2 {{\mathcal {Z}}}_2^2 {{\mathcal {A}}})^{1/3} T_6^{-1/3}; \end{aligned}$$
(23)

here \({{\mathcal {Z}}}_1 e\) and \({{\mathcal {Z}}}_2 e\) are the charges of the two nuclei, \({{\mathcal {A}}} = {{\mathcal {A}}}_1 {{\mathcal {A}}}_2/( {{\mathcal {A}}}_1 + {{\mathcal {A}}}_2)\) is the reduced mass of the nuclei in atomic mass units, \({{\mathcal {A}}}_1\) and \({{\mathcal {A}}}_2\) being the masses of the nuclei, and \(T_6 = T/(10^6 \,\mathrm{K})\).Footnote 23 However, the specific properties of the interacting nuclei also play a major role for the reaction rate. Furthermore, the conversion of protons into neutrons and the production of neutrinos involve the weak interaction which takes place with comparatively low probability. This has a strong effect on the rates of reactions where this conversion takes place.
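The strong dependence of the temperature sensitivity on the nuclear charges can be made concrete with a short calculation. The sketch below uses the standard Gamow-peak form, in which \(\eta \) depends on the product of the squared charges, \(({{\mathcal {Z}}}_1^2 {{\mathcal {Z}}}_2^2 {{\mathcal {A}}})^{1/3}\). The temperature \(T_6 = 15\) is an assumed round value near that of the solar centre, and the nuclear masses are approximated by mass numbers; both are assumptions made here for illustration.

```python
def gamow_eta(z1, z2, a1, a2, t6):
    """eta = 42.487 (Z1^2 Z2^2 A)^(1/3) T6^(-1/3), with A the reduced mass."""
    a_reduced = a1 * a2 / (a1 + a2)
    return 42.487 * (z1 ** 2 * z2 ** 2 * a_reduced) ** (1.0 / 3.0) * t6 ** (-1.0 / 3.0)

def temperature_exponent(z1, z2, a1, a2, t6):
    """Exponent n in the approximation r_12 proportional to T^n."""
    return (gamow_eta(z1, z2, a1, a2, t6) - 2.0) / 3.0

T6 = 15.0  # assumed temperature in units of 10^6 K

# (Z1, Z2, A1, A2), with masses approximated by mass numbers.
reactions = {
    "1H + 1H": (1, 1, 1.0, 1.0),
    "3He + 4He": (2, 2, 3.0, 4.0),
    "14N + 1H": (7, 1, 14.0, 1.0),
}

exponents = {name: temperature_exponent(*args, T6) for name, args in reactions.items()}
for name, n in exponents.items():
    print(f"{name:10s}: n = {n:.1f}")
```

For the fusion of two protons the exponent is close to 4, whereas reactions between more highly charged nuclei are considerably more temperature sensitive.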

The net reaction in Eq. (22) obviously has to take place through a number of intermediate steps. The dominant series of reactions starts directly with the fusion of two hydrogen nuclei; the full sequence of reactions isFootnote 24

$$\begin{aligned} {{}^{1}\mathrm{H}}({{}^{1}\mathrm{H}}, \mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{2}\mathrm{D}}({{}^{1}\mathrm{H}}, \gamma )\,{{}^{3}\mathrm{He}}({{}^{3}\mathrm{He}},2 {{}^{1}\mathrm{H}})\, {{}^{4}\mathrm{He}}. \end{aligned}$$
(24)

This sequence of reactions is known as the PP-I chain and clearly corresponds to Eq. (22). The average energy of the neutrinos lost in the first reaction in the chain is 0.263 MeV. Since this reaction takes place twice for each \({{}^{4}\mathrm{He}}\) produced, the effective energy production for each resulting \({{}^{4}\mathrm{He}}\) is 26.21 MeV.

Two alternative chains, PP-II and PP-III, continue with the fusion of \({{}^{3}\mathrm{He}}\) and \({{}^{4}\mathrm{He}}\) after the production of \({{}^{3}\mathrm{He}}\):

$$\begin{aligned} \begin{aligned} {{}^{3}\mathrm{He}}({{}^{4}\mathrm{He}}, \gamma )\,&{{}^{7}\mathrm{Be}}(\mathrm{e^-}, \nu _{\mathrm{e}})\,{{}^{7}\mathrm{Li}}({{}^{1}\mathrm{H}}, {{}^{4}\mathrm{He}})\,{{}^{4}\mathrm{He}}\qquad \qquad \,\, (\hbox {PP-II}) \\&\,\,\Downarrow \\&{{}^{7}\mathrm{Be}}({{}^{1}\mathrm{H}}, \gamma )\, {{}^{8}\mathrm{B}}(,\mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{8}\mathrm{Be}}(, {{}^{4}\mathrm{He}})\,{{}^{4}\mathrm{He}}\qquad (\hbox {PP-III}) \end{aligned} \end{aligned}$$
(25)

Here the total average neutrino losses per produced \({{}^{4}\mathrm{He}}\) are 1.06 MeV and 7.46 MeV, respectively. At the centre of the present Sun the contributions of the PP-I, PP-II and PP-III reactions to the total energy generation by the PP chains, excluding neutrinos, are 23, 77 and 0.2%, respectively; owing to a much higher temperature sensitivity of the PP-II and PP-III chains the corresponding contributions to the solar luminosity are 77, 23 and 0.02%. However, even though insignificant for the energy generation, the PP-III chain is very important for the study of neutrino emission from the Sun due to the high energies of the neutrinos emitted in the decay of \({{}^{8}\mathrm{B}}\).
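The energy bookkeeping for the three chains can be summarized in a few lines. The sketch below subtracts the average neutrino losses quoted in the text from the 26.73 MeV released per \({{}^{4}\mathrm{He}}\); for PP-I the loss is twice 0.263 MeV, since the initial reaction occurs twice per \({{}^{4}\mathrm{He}}\) produced. The variable names are illustrative, and the results agree with the quoted 26.21 MeV only to within the rounding of the inputs.

```python
Q_TOTAL = 26.73  # MeV released per 4He produced, Eq. (22)

# Average neutrino energy loss per 4He produced (MeV), as quoted in the text.
neutrino_loss = {
    "PP-I": 2 * 0.263,   # two pp reactions per 4He
    "PP-II": 1.06,
    "PP-III": 7.46,
}

# Energy actually available to the Sun per 4He produced.
effective_energy = {chain: Q_TOTAL - loss for chain, loss in neutrino_loss.items()}

for chain, energy in effective_energy.items():
    fraction_lost = neutrino_loss[chain] / Q_TOTAL
    print(f"{chain:6s}: {energy:.2f} MeV available, "
          f"{100 * fraction_lost:.1f}% carried off by neutrinos")
```

The PP-III chain thus loses more than a quarter of the released energy to neutrinos, compared with about 2% for PP-I.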

Of the reactions in the PP chains the initial reaction, fusing two hydrogen nuclei, has by far the lowest rate per pair of reacting nuclei. This is a result of the effect of the weak interaction in the conversion of a proton into a neutron, coupled with the penetration of the Coulomb barrier.Footnote 25 Thus the overall rate of the chains is controlled by this reaction; since the charges of the interacting nuclei are relatively low, it has a modest temperature sensitivity, approximately as \(T^4\) [cf. Eq. (23)]. The distribution of the reactions between the different branches depends on the branching ratios at the reactions destroying \({{}^{3}\mathrm{He}}\) and \({{}^{7}\mathrm{Be}}\); as a result PP-II and in particular PP-III become more important with increasing temperature, with important consequences for the neutrino spectrum of the Sun.

In principle, the full reaction network should be considered as a function of time, to follow the changing abundances resulting from the nuclear reactions. In practice the relevant reaction timescales for the reactions involving \({{}^{2}\mathrm{D}}\), \({{}^{7}\mathrm{Be}}\) and \({{}^{7}\mathrm{Li}}\) are so short that the reactions can be assumed to be in equilibrium under solar conditions (e.g., Clayton 1968); the resulting equilibrium abundances are minute.Footnote 26 On the other hand, the timescales for the reactions involving \({{}^{3}\mathrm{He}}\) are comparable with the timescale of solar evolution, at least in the outer parts of the core; thus the calculation should follow the detailed evolution with time of the \({{}^{3}\mathrm{He}}\) abundance. The resulting abundance profile in a model of the present Sun is illustrated in Fig. 7; below the maximum \({{}^{3}\mathrm{He}}\) has reached nuclear equilibrium, with an abundance that increases with decreasing temperature. The location of this maximum moves further out with increasing age. It was found by Christensen-Dalsgaard et al. (1974) that the establishment of this \({{}^{3}\mathrm{He}}\) profile caused instability to a few low-degree g modes early in the evolution of the Sun.

Fig. 7

Evolution of the abundance of \({}^3 \mathrm{He}\). The solid curve shows the abundance in a model of the present Sun, while the dotted, dashed, dot-dashed, double-dot-dashed and long-dashed curves show the abundances at ages 0.5, 1.0, 2.0, 2.9 and 3.9 Gyr, respectively. The initial abundance was assumed to be zero

The primordial abundances of light elements, as inferred from solar-system abundances, are crucial constraints on models of the Big Bang (e.g. Geiss and Gloeckler 2007). This includes the abundances of \({{}^{2}\mathrm{D}}\) and \({{}^{3}\mathrm{He}}\), with \({{}^{2}\mathrm{D}}\) burning (cf. Eq. 24) taking place at sufficiently low temperature that the primordial \({{}^{2}\mathrm{D}}\) has largely been converted to \({{}^{3}\mathrm{He}}\). The \({{}^{3}\mathrm{He}}/{{}^{4}\mathrm{He}}\) ratio can be determined from the solar wind; the resulting value can probably be taken as representative for matter in the solar convection zone and hence provides a constraint on the extent to which the convection zone has been enriched by \({{}^{3}\mathrm{He}}\) resulting from hydrogen burning. This was used by, for example, Schatzman et al. (1981), Lebreton and Maeder (1987) and Vauclair and Richard (1998) to constrain the extent of turbulent mixing beneath the convection zone. Heber et al. (2003) investigated the time variation in the \({{}^{3}\mathrm{He}}/{{}^{4}\mathrm{He}}\) ratio from analysis of lunar regolith samples. After correction for secondary processes, using the presumed constant \({}^{20}\mathrm{Ne}/{}^{22}\mathrm{Ne}\) as reference, they deduced that the \({{}^{3}\mathrm{He}}/{{}^{4}\mathrm{He}}\) ratio has been approximately constant over the past around 4 Gyr, with an average value for the ratio of number densities of \((4.47 \pm 0.13) \times 10^{-4}\). This provides a further valuable constraint on the mixing history below the solar convection zone.Footnote 27

A second set of processes resulting in the net reaction in Eq. (22) involves successive reactions with isotopes of carbon, nitrogen and oxygen:

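$$\begin{aligned} {{}^{12}\mathrm{C}}({{}^{1}\mathrm{H}}, \gamma )\,{{}^{13}\mathrm{N}}(,\mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{13}\mathrm{C}}({{}^{1}\mathrm{H}}, \gamma )\,{{}^{14}\mathrm{N}}({{}^{1}\mathrm{H}}, \gamma )\,{{}^{15}\mathrm{O}}(,\mathrm{e^+}\nu _{\mathrm{e}})\,{{}^{15}\mathrm{N}}({{}^{1}\mathrm{H}}, {{}^{4}\mathrm{He}})\,{{}^{12}\mathrm{C}} \end{aligned}$$
(26)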

This CNO cycle is obviously a catalytic process, with the net result of converting hydrogen into helium. The reaction with the lowest rate in this cycle is proton capture on \({{}^{14}\mathrm{N}}\) which therefore controls the overall rate of the cycle; this leads to a temperature dependence of roughly \(T^{20}\) under solar conditions, owing to the high nuclear charge of nitrogen [cf. Eq. (23)]. As a result, the CNO cycle is significant mainly very near the solar centre, and its importance increases rapidly with increasing age of the model, due to the increase in core temperature (cf. Fig. 8a). Owing to the strong temperature dependence it is strongly concentrated near the centre, as illustrated in Fig. 8b. Thus, although in the present Sun the central contribution to the energy-generation rate is 11%, the CNO cycle only contributes 1.3% to the luminosity. As a consequence of the \({}^{14}\mathrm{N}\) bottleneck in the CN cycles almost all the initial carbon is converted into nitrogen by the reactions. An additional side branch mainly serves to convert oxygen into nitrogen; under the conditions leading up to the present Sun this is relatively unimportant, causing an increase in the central abundance of \({{}^{14}\mathrm{N}}\) by around 12% in the present Sun, relative to the initial abundance.
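To make the contrast between the two temperature sensitivities concrete, the sketch below evaluates how a rate scaling as a power of temperature responds to a small temperature increase. It uses only the approximate exponents quoted in the text (\(T^4\) for the PP chains, \(T^{20}\) for the CNO cycle); treating these local power laws as exact over a 5 to 10% temperature change is an assumption made purely for illustration.

```python
PP_EXPONENT = 4    # approximate temperature exponent of the PP chains (from the text)
CNO_EXPONENT = 20  # approximate temperature exponent of the CNO cycle (from the text)


def rate_ratio(temperature_ratio: float, exponent: float) -> float:
    """Factor by which a rate proportional to T**exponent changes when the
    temperature is multiplied by temperature_ratio."""
    return temperature_ratio ** exponent


if __name__ == "__main__":
    for rise in (1.05, 1.10):
        pp = rate_ratio(rise, PP_EXPONENT)
        cno = rate_ratio(rise, CNO_EXPONENT)
        print(f"T x {rise:.2f}: PP rate x {pp:.2f}, CNO rate x {cno:.2f}")
```

With these exponents a 10% temperature increase raises the PP rate by roughly a factor 1.5 but the CNO rate by nearly a factor 7, which is why the CNO contribution is so strongly concentrated towards the centre.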

Fig. 8

Contributions of the CNO cycle to the energy generation in a solar model. Top panel: the ratio of \(\epsilon _{\mathrm{CNO}}\) to the total \(\epsilon \) at the centre of the model, as a function of age. Bottom panel: the fractional contribution \(\epsilon _{\mathrm{CNO}}/\epsilon \) as a function of position in a model of the present Sun

The computation of nuclear reaction rates requires nuclear parameters, determined from experiments or, in the case of the \({{}^{1}\mathrm{H}}+ {{}^{1}\mathrm{H}}\) reaction, from theoretical considerations. In addition to affecting the energy-generation rate the details of the reactions have a substantial effect on the branching ratios in the PP chains and hence on the production rate of the high-energy \({{}^{8}\mathrm{B}}\) neutrinos. The reaction rate, averaged over the thermal energy distribution of the nuclei, is typically expressed as a function of temperature in terms of a factor describing the penetration of the Coulomb barrierFootnote 28 and a correction factor provided as an expansion in temperature. A substantial number of compilations of data for nuclear reactions have been made, starting with the classical, and much used, sets by Fowler et al. (1967, 1975). Bahcall and Pinsonneault (1995) provided an updated set of parameters specifically for the computation of solar models. Two extensive and commonly used compilations of parameters have been provided by Adelberger et al. (1998) and Angulo et al. (1999). Revised parameters for the important reaction \({{}^{14}\mathrm{N}}({{}^{1}\mathrm{H}}, \gamma )\,{{}^{15}\mathrm{O}}\), which controls the overall rate of the CNO cycle, have been obtained (Formicola et al. 2004; Angulo et al. 2005), reducing the rate by a factor of almost 2. An updated set of nuclear parameters specifically for solar modelling was provided by Adelberger et al. (2011), including also the revised rates for \({{}^{14}\mathrm{N}}({{}^{1}\mathrm{H}}, \gamma )\,{{}^{15}\mathrm{O}}\).

The nuclear reactions take place in a plasma, with charged particles that modify the interaction between the nuclei. A classical and widely used treatment of this effect was developed by Salpeter (1954), with a mean-field treatment of the plasma in the Debye–Hückel approximation; this shows that the nuclei are surrounded by clouds of electrons which partly screen the Coulomb repulsion between the nuclei and hence increase the reaction rate. Following criticism of Salpeter’s result by Shaviv and Shaviv (1996), Brüggen and Gough (1997, 2000) made a more careful analysis of the thermodynamical assumptions underlying the derivation, confirming Salpeter’s result and in the second paper extending it to take into account quantum-mechanical exclusion and polarization of the screening cloud; in the solar case, however, such effects are largely insignificant. On the other hand, the mean-field approximation may be questionable in cases, such as the solar core, where the average number of electrons within the radius of the screening cloud is very small. This has given rise to extensive discussions of dynamic effects in the screening (e.g. Shaviv and Shaviv 2001). Bahcall et al. (2002) argued that such effects, and other claims of problems with the Salpeter formulation, were irrelevant. However, molecular-dynamics simulations of stellar plasma strongly suggest that dynamical effects may in fact substantially influence the screening (Shaviv 2004a, b). Further investigations along these lines are clearly needed. Thus it is encouraging that Mussack et al. (2007) started independent molecular-dynamics simulations. Initial results by the group (Mao et al. 2009) confirmed the earlier conclusions by Shaviv; a more detailed analysis by Mussack and Däppen (2011) found evidence for a slight reduction in the reaction rate as a result of plasma effects. Interestingly, Weiss et al. (2001) noted that the solar structure as inferred from helioseismology (cf. Sect. 5.1.2) can be used to constrain the departures from the simple Salpeter formulation; in particular, they found that a model computed assuming no screening was inconsistent with the helioseismically inferred sound speed. These issues clearly need further investigation.

Diffusion and settling

As indicated in Eq. (6) the temporal evolution of stellar internal abundances must take into account effects of diffusion and settling. Crudely speaking, settling due to gravity and thermal effects tends to establish composition gradients; diffusion, described by the diffusion coefficient \(D_i\), tends to smooth out such gradients, including those that are established through nuclear reactions. A brief review of these processes was provided by Michaud and Proffitt (1993). They were discussed in some detail already by Eddington (1926); he concluded that they might lead to unacceptable changes in surface composition unless suppressed by processes that redistributed the composition, such as circulation.

A brief review of diffusion was provided by Thoul and Montalbán (2007). The basic equations describing the microscopic motion of matter in a star are the Boltzmann equations for the velocity distribution of each type of particle. The treatment of diffusion and settling in stars has generally been based on approximate solutions of the Boltzmann equations presented by Burgers (1969). This results in a set of equations for momentum, energy and mass conservation for each species which can be solved numerically to obtain the relevant quantities such as \(D_i\) and \(V_i\) in Eq. (6). The equations depend on the collisions between particles in the gas, greatly complicated by the long-range nature of the Coulomb force between charged particles (electrons and ions); these are typically described in terms of coefficients based on the screened Debye–Hückel potential, mentioned above in connection with Coulomb effects in the equation of state and electron screening in nuclear reactions, and depending on the ionization state of the ions. As emphasized initially by Michaud (1970) the gravitational force on the particles may be modified by radiative effects, depending on the detailed ionization and excitation state of the individual species and hence varying strongly between different elements or with position in the star.Footnote 29 It should be noted that the typical diffusion and settling timescales, although possibly short on a stellar evolution timescale, are generally much longer than the timescales associated with large-scale hydrodynamical motions. Thus regions affected by such motion, particularly convection zones, can generally be assumed to be fully mixed; in the solar case microscopic diffusion and settling is only relevant beneath the convective envelope. Formally, hydrodynamical mixing can be incorporated by maintaining Eq. (6) but with a very large value of \(D_i\) (e.g., Eggleton 1971).

Michaud and Proffitt (1993) presented relatively simple approximations to the diffusion and settling coefficients for hydrogen as well as for heavy elements regarded as trace elements (see also Christensen-Dalsgaard 2008). These were based on solutions of Burgers’ equations, adjusting coefficients to obtain a reasonable fit to the numerical results. These approximations were also compared with the results of the numerical solutions by Thoul et al. (1994) who in addition presented simpler, and rather less accurate, approximate expressions for the coefficients.

Although diffusion and settling have been considered since the early seventies (e.g., Michaud 1970) to explain peculiar abundances in some stars, it seems that Noerdlinger (1977) was the first to include these effects in solar modelling; indeed, the early estimates by Eddington (1926) suggested that the effects would be fairly small. In fact, including helium diffusion and settling Noerdlinger found a reduction of about 0.023 in the surface helium abundance \(Y_{\mathrm{s}}\), from the initial value. Roughly similar results were obtained by Gabriel et al. (1984) and Cox et al. (1989), the latter authors considering a broad range of elements, while Wambsganss (1988) found a much smaller reduction. Proffitt and Michaud (1991) provided a detailed comparison of these early results, although without explaining the discrepant value found by Wambsganss. Bahcall and Pinsonneault (1992a, b) made careful calculations of models with helium diffusion and settling, using the then up-to-date physics, and emphasizing the importance of calibrating the models to yield the observed present surface ratio \(Z_{\mathrm{s}}/X_{\mathrm{s}}\) between the abundances of heavy elements and hydrogen; they found that the inclusion of diffusion and settling increased the neutrino capture rates from the models by up to around 10%. A careful analysis of the effects of heavy-element diffusion and settling on solar models and their neutrino fluxes was presented by Proffitt (1994).

Gabriel et al. (1984) concluded that the inclusion of helium diffusion and settling had little effect on the oscillation frequencies of the model, while Cox et al. (1989), in their more detailed treatment, actually found that the model with diffusion and settling showed a larger difference between observed and model frequencies than did the model that did not include these effects. However, Christensen-Dalsgaard et al. (1993) showed that the inclusion of helium diffusion and settling substantially decreased the difference in sound speed between the Sun and the model, as inferred from a helioseismic differential asymptotic inversion. Further inverse analyses of observed solar oscillation frequencies have confirmed this result, thus strongly supporting the reality of these effects in the Sun and contributing to making diffusion and settling a part of ‘the standard solar model’ (e.g., Christensen-Dalsgaard and Di Mauro 2007). Further evidence is the difference between the initial helium abundance required to calibrate solar models and the helioseismically inferred envelope helium abundance (see Sect. 5.1.2), which is largely accounted for by the effects of helium settling.

Detailed calculations of atomic data for the OPAL and OP opacity projects (cf. Sect. 2.3.2) have allowed precise calculations of the radiative effects on settling (Richer et al. 1998). As mentioned above such effects are highly selective, affecting different elements differently. As a result, not only does the heavy-element abundance change as a result of settling, but the relative mixture of the heavy elements varies as a function of stellar age and position in the star. As is evident from Fig. 5 this has a substantial effect on the opacities. To take such effects consistently into account the opacities must therefore be calculated from the appropriate mixture at each point in the model, requiring appropriately mixing monochromatic contributions from individual elements and calculating the Rosseland mean (cf. Eq. 21). Such calculations are feasible (Turcotte et al. 1998) although obviously very demanding on computing resources in terms of time and storage. Turcotte et al. (1998) carried out detailed calculations of this nature for the Sun. Here the relatively high temperatures and resulting ionization beneath the convective envelope, where diffusion and settling are relevant, result in modest effects of radiative acceleration and little variation in the relative heavy-element abundances. In fact, Fig. 14 of Turcotte et al. shows that neglecting radiative effects and assuming all heavy elements to settle at the same rate, corresponding to fully ionized oxygen, yield results somewhat closer to the full detailed treatment than does neglecting radiative effects and taking partial ionization fully into account. The rather reassuring conclusion is that, as far as solar modelling is concerned, the simple procedure of treating all heavy elements as one is adequate (see also Turcotte and Christensen-Dalsgaard 1998). This simpler approach, neglecting radiative effects, is in fact what is used for the models presented here.

The timescale of diffusion and settling, defined by Eq. (6), increases with increasing density and hence with depth beneath the stellar surface, as illustrated in Fig. 9. Since the convective envelope is fully mixed, the relevant timescale controlling the efficiency of diffusion is the value just below the convective envelope. In the solar case this is of order \(10^{11}\) years, resulting in a modest effect of diffusion over the solar lifetime. In somewhat more massive main-sequence stars, however, with thinner outer convection zones, the time scale is short compared with the evolution timescale; in the case illustrated for a \(2 \,M_\odot \) star, for example, it is around \(5 \times 10^6\) years. Thus settling has a dramatic effect on the surface abundance unless counteracted by other effects (Vauclair et al. 1974). This leads to a strong reduction in the helium abundance, likely eliminating instability due to helium driving in stars that might otherwise be expected to be pulsationally unstable (Turcotte et al. 2000). Also, differential radiative acceleration leads to a surface mixture of the heavy elements very different from the solar mixture, which is indeed observed in ‘chemically peculiar stars’, as already noted by Michaud (1970). Richer et al. (2000) pointed out that to match the observed abundances even in these cases compensating effects had to be included to reduce the effects of settling; they suggested either sub-surface turbulence, increasing the reservoir from which settling takes place, or mass loss bringing fresh material less affected by settling to the surface. An interesting analysis of these processes in controlling the observed abundances of Sirius was presented by Michaud et al. (2011). 
To obtain ‘normal’ composition in such stars, processes of this nature reducing the effects of settling are a fortiori required;Footnote 30 since most main-sequence stars somewhat more massive than the Sun rotate relatively rapidly, circulation or hydrodynamical instabilities induced by rotation are likely candidates (e.g., Zahn 1992, see also Sect. 7). Deal et al. (2020) investigated the combined effects of rotation and radiatively affected diffusion in main-sequence stars and found that this could account for the observed surface abundances for stars with masses below \(1.3\,M_\odot \). For more massive stars additional mixing processes appeared to be required. It should also be noted that such hydrodynamical models of the evolution of rotation are unable to account for the rotation observed in the solar interior (see Sect. 5.1.4). A complete model of the transport of composition and angular momentum in stellar interiors remains to be found.

Fig. 9

Diffusion timescales for helium, defined by the term in \(V_i X_i\) in Eq. (6), for a model of the present Sun (dashed) and a zero-age main-sequence \(2 \,M_\odot \) model (continuous). The thinner red parts of the curves mark the fully mixed convection zones. Image reproduced with permission from Aerts et al. (2010), copyright by Springer

The near-surface layer

The treatment of the outermost layers of the model is complicated and affected by substantial physical uncertainties. In the atmosphere the diffusion approximation for radiative transport, implicit in Eq. (5), is no longer valid; here the full radiative-transfer equations need to be considered, including the details of the frequency dependence of absorption and emission. Such detailed stellar atmosphere models are available and can in principle be incorporated in the full solar model (e.g., Kurucz 1991, 1996; Gustafsson et al. 2008). However, additional complications arise from the effects of convection which induce motion in the atmosphere as well as strong lateral inhomogeneities in the thermal structure. Also, observations of the solar atmosphere strongly indicate the importance of non-radiative heating processes in the upper parts of the atmosphere, likely caused by acoustic or magnetic waves, or other forms of magnetic energy dissipation, for which no reliable models are available. The thermal structure just beneath the photosphere is strongly affected by the transition to convective energy transport, which determines the temperature gradient \(\nabla = \nabla _{\mathrm{conv}}\). Also, in this region convective velocities are a substantial fraction of the speed of sound, leading to significant momentum transport by convection described as a ‘turbulent pressure’, but most often ignored in the model calculations.

From the point of view of the global structure of the Sun, these near-surface problems are of lesser importance. In most of the convection zone the temperature gradient is very nearly adiabatic, \(\nabla \simeq \nabla _{\mathrm{ad}}\) (see also Fig. 12). Thus the structure is essentially determined by the (constant) value of the specific entropy \(s_{\mathrm{conv}}\); in other words, the variations of the thermodynamical quantities within this part of the convection zone lie on an adiabat. In fact, if the further approximation of a fully ionized ideal gas is made, which is roughly valid except in the outer few per cent of the solar radius, then \(\nabla _{\mathrm{ad}}\simeq 2/5\), \(\mathrm{d}\ln p / \mathrm{d}\ln \rho \simeq 5/3\), and the relation between pressure and density can be approximated by

$$\begin{aligned} p = K \rho ^\gamma , \end{aligned}$$
(27)

with \(\gamma = 5/3\). In this case, therefore, the properties of the convection zone are characterized by the adiabatic constant K. Such an approximation was generally used in early calculations of solar models (e.g., Schwarzschild et al. 1957). The structure of the convection zone determines its radial extent and hence affects the radius of the model. In the solar case the radius is known observationally with high precision; thus the adiabat of the adiabatic part of the convection zone [i.e., the value of K in the approximation in Eq. (27)] must be chosen such that the model has the observed radius. This is part of the calibration of solar models (see Sect. 2.6).
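The adiabatic approximation in Eq. (27) can be illustrated with a minimal sketch, assuming an ideal gas with constant \(\gamma \). It fixes K from a single reference point, evaluates the pressure elsewhere on the same adiabat, and recovers \(\nabla _{\mathrm{ad}}= (\gamma - 1)/\gamma \). The reference pressure and density are hypothetical round numbers in cgs units, not values taken from a solar model, and the function names are illustrative.

```python
import math

GAMMA = 5.0 / 3.0  # fully ionized ideal gas, as assumed for Eq. (27)


def adiabatic_gradient(gamma: float = GAMMA) -> float:
    """nabla_ad = (gamma - 1) / gamma for an ideal gas with constant gamma."""
    return (gamma - 1.0) / gamma


def adiabatic_constant(p_ref: float, rho_ref: float, gamma: float = GAMMA) -> float:
    """K in p = K * rho**gamma, fixed by one (p, rho) point on the adiabat."""
    return p_ref / rho_ref ** gamma


def pressure(rho: float, k: float, gamma: float = GAMMA) -> float:
    """Pressure on the adiabat labelled by K, Eq. (27)."""
    return k * rho ** gamma


if __name__ == "__main__":
    # Hypothetical reference point (cgs units), for illustration only.
    p_ref, rho_ref = 1.0e13, 0.1
    k = adiabatic_constant(p_ref, rho_ref)
    rho = 0.05
    slope = math.log(pressure(rho, k) / p_ref) / math.log(rho / rho_ref)
    print(f"nabla_ad = {adiabatic_gradient():.3f}")
    print(f"d ln p / d ln rho = {slope:.4f}")
```

In this picture, calibrating the model radius amounts to choosing the single number K; all points in the adiabatic part of the convection zone then follow from it.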

From this point of view the details of the treatment of the near-surface layers serve to determine \(s_{\mathrm{conv}}\) (or K). This is obtained from the specific entropy at the bottom of the atmosphere through the change in entropy resulting from integrating \(\nabla - \nabla _{\mathrm{ad}}\) over the significantly superadiabatic part of the convection zone. The treatment of convection typically involves parameters that can be adjusted to control the adiabat and hence the radius of the model; given such calibration to solar radius, the structure of the deeper parts of the model is largely insensitive to the details of the treatment of the atmosphere and the convective gradient (for an example, see Fig. 31 below).

I note that although the detailed modelling of the near-surface layers has only a modest effect on the internal properties of calibrated solar models, it has a substantial effect on the computed oscillation frequencies, which may affect the analysis of observed frequencies (see Sect. 5.1.1). Also, in computations of other stars no similar calibration based on the observed properties is generally possible. It is customary to apply solar-calibrated convection properties in these cases; although this is clearly not a priori justified, at least some support for only modest variations relative to the Sun over a substantial range of stellar parameters has been found from hydrodynamical simulations of near-surface convection (cf. Fig. 11).

Although the atmospheric structure can be implemented in terms of reasonably realistic models of the solar atmosphere, the usual procedure in modelling solar evolution is to base the atmospheric properties on a simple relation between temperature and optical depth \(\tau \), \(T = T(\tau )\); here \(\tau \) is defined by

$$\begin{aligned} {\mathrm{d}\tau \over \mathrm{d}r} = - \kappa \rho , \end{aligned}$$
(28)

with \(\tau = 0\) at the top of the atmosphere. This \(T(\tau )\) relation is often expressed in the form

$$\begin{aligned} T^4 = {3 \over 4} T_{\mathrm{eff}}^4 [\tau + q(\tau )], \end{aligned}$$
(29)

defining the (generalized) Hopf function q.Footnote 31 Given \(T(\tau )\), and the equation of state and opacity as functions of density and temperature, the atmospheric structure can be obtained by integrating the equation of hydrostatic support, which may be written as

$$\begin{aligned} {\mathrm{d}p \over \mathrm{d}\tau } = {g \over \kappa }, \end{aligned}$$
(30)

where the gravitational acceleration g can be taken to be constant, at least for main-sequence stars such as the Sun. This defines the photospheric pressure, e.g. at the point where \(T = T_{\mathrm{eff}}\), the effective temperature, and hence the outer boundary condition for the integration of the full equations of stellar structure.Footnote 32 The \(T(\tau )\) relation can be obtained from fitting to more detailed theoretical atmospheric models, as done, for example, by Morel et al. (1994), who used the Kurucz (1991) models. Alternatively, a fit to a semi-empirical model of the solar atmosphere can be used, such as the Krishna Swamy fit (Krishna Swamy 1966) or the Harvard-Smithsonian Reference Atmosphere (Gingerich et al. 1971). As an example, the Vernazza et al. (1981) Model C \(T(\tau )\) relation is shown in Fig. 10; here is also shown the result of using the following approximation for the Hopf function in Eq. (29):

$$\begin{aligned} q(\tau ) = 1.036 -0.3134 \exp (-2.448 \tau ) -0.2959 \exp (-30 \tau ). \end{aligned}$$
(31)

The approximation provides a reasonable fit to the observationally inferred temperature structure in that part of the atmosphere which dominates the determination of the photospheric pressure.
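A minimal sketch of how Eqs. (29) and (31) might be evaluated is given below. It computes the Hopf function, the resulting temperature, and the optical depth at which \(T = T_{\mathrm{eff}}\), i.e. where \(\tau + q(\tau ) = 4/3\). The effective temperature of 5772 K is an assumed nominal solar value that is not given in the text, and the function names and bisection bracket are illustrative choices.

```python
import math

T_EFF_SUN = 5772.0  # K; assumed nominal solar effective temperature


def hopf_q(tau: float) -> float:
    """Approximate Hopf function of Eq. (31)."""
    return 1.036 - 0.3134 * math.exp(-2.448 * tau) - 0.2959 * math.exp(-30.0 * tau)


def temperature(tau: float, t_eff: float = T_EFF_SUN) -> float:
    """Temperature from the T(tau) relation of Eq. (29)."""
    return t_eff * (0.75 * (tau + hopf_q(tau))) ** 0.25


def photospheric_tau(lo: float = 0.0, hi: float = 2.0, tol: float = 1e-10) -> float:
    """Optical depth where T = T_eff, found by bisection on
    tau + q(tau) - 4/3, which increases monotonically with tau."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid + hopf_q(mid) - 4.0 / 3.0 < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(f"q(0) = {hopf_q(0.0):.4f}")
    print(f"T(tau = 0) = {temperature(0.0):.0f} K")
    tau_ph = photospheric_tau()
    print(f"T = T_eff at tau = {tau_ph:.3f}")
```

With this fit the temperature at \(\tau = 0\) is about three quarters of \(T_{\mathrm{eff}}\), and the photospheric point lies at \(\tau \) slightly above 0.4, rather than at the value 2/3 that follows from the Eddington approximation \(q = 2/3\).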

Fig. 10

Comparison of the temperature structure in Model C of Vernazza et al. (1981) (dashed curve), against monochromatic optical depth \(\tau \) at \(500 \,\mathrm{nm}\), and the fit given in Eq. (31) (solid curve). The red dot-dashed curve shows the \(T(\tau )\) relation, against Rosseland mean opacity, obtained from matching a 3D hydrodynamical simulation (Trampedach et al. 2014b, see also Sect. 2.5)

\(T(\tau )\) relations based on a solar \(q(\tau )\) are often used for general modelling of stars, even though the atmospheric structure may have substantial variations with stellar properties. An interesting alternative is to determine \(q(\tau )\), as a function of stellar parameters, from averaged hydrodynamical simulations of the stellar near-surface layers (e.g. Trampedach et al. 2014b). An example based on a simulation for the present Sun is also shown in Fig. 10.

Treatment of convection

A detailed review of observational and theoretical aspects of solar convection was provided by Nordlund et al. (2009), while Rincon and Rieutord (2018) focused on the largest clearly observed scale of convection on the solar surface, the supergranulation. Further details, including the treatment of convection in a time-dependent environment such as a pulsating star, were reviewed by Houdek and Dupret (2015). As discussed below, extensive hydrodynamical simulations have been carried out of the near-surface convection in the Sun and other stars. However, direct inclusion of these simulations in stellar evolution calculations is impractical, owing to the computational expense; thus we must rely on simpler procedures. It is obviously preferable to have a physically motivated description of convection; as discussed above (see also Sect. 2.6), solar modelling requires one or more parameters which can be used to adjust the specific entropy in the adiabatic part of the convection zone and hence the radius of the model. In stellar modelling convection is typically treated by means of some variant of mixing-length model (e.g. Biermann 1932; Vitense 1953; Böhm-Vitense 1958); a more physically-based derivation of the description was provided by Gough (1977a, b), in terms of the linear growth and subsequent dissolution of unstable modes of convection. In the commonly used physical description of this prescriptionFootnote 33 (for further details, see Kippenhahn et al. 2012) convection is described by the motion of blobs over a distance \(\ell \), after which the blob is dissolved in the surroundings, giving up its excess heat. If the temperature difference between the blob and the surroundings is \(\varDelta T\) and the typical speed of the blob is \(v\), the convective flux is of order \(F_{\mathrm{con}}\sim v c_p \rho \varDelta T\), where \(c_p\) is the specific heat at constant pressure. 
Assuming, for simplicity, that the motion of the blob takes place adiabatically, \(\varDelta T \sim \ell T (\nabla - \nabla _{\mathrm{ad}}) /H_p\), where \(H_p = - (\mathrm{d}\ln p/\mathrm{d}r)^{-1}\) is the pressure scale height. Also, the speed of the element is determined by the work of the buoyancy force \(- \varDelta \rho g\) on the element, where \(\varDelta \rho \sim - \rho \varDelta T/T\) is the density difference between the blob and the surroundings, assuming the ideal gas law and pressure equilibrium between the blob and the surroundings. This gives \(\rho v^2 \sim - \ell g \varDelta \rho \sim \rho \ell ^2 g (\nabla - \nabla _{\mathrm{ad}})/H_p\). Thus we finally obtainFootnote 34

$$\begin{aligned} F_{\mathrm{con}}\sim \rho c_p T {\ell ^2 g^{1/2} \over H_p^{3/2}} (\nabla - \nabla _{\mathrm{ad}})^{3/2}. \end{aligned}$$
(32)

To this must be added the radiative flux

$$\begin{aligned} F_{\mathrm{rad}}= {4 a {\tilde{c}}T^4 \over 3 \kappa \rho } {\nabla \over H_p} \end{aligned}$$
(33)

(cf. Eq. 5); the total flux \(F = F_{\mathrm{con}}+ F_{\mathrm{rad}}\) must obviously match \(L/(4 \pi r^2)\), for equilibrium. This condition determines the temperature gradient in this description.
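The flux balance just described can be sketched numerically. The example below assumes that all factors of order unity in Eq. (32) are set to one and divides the condition \(F_{\mathrm{con}}+ F_{\mathrm{rad}}= F\) by the coefficient of \(\nabla \) in Eq. (33). This gives the dimensionless equation \(\nabla + \varLambda (\nabla - \nabla _{\mathrm{ad}})^{3/2} = \nabla _{\mathrm{rad}}\), where \(\nabla _{\mathrm{rad}}\) is the gradient that would be required if radiation carried the entire flux and \(\varLambda \) is the ratio of the coefficients in Eqs. (32) and (33). The symbols \(\varLambda \) and \(\nabla _{\mathrm{rad}}\), the function name and the input values are introduced here for illustration and are not taken from a particular stellar-evolution code.

```python
def solve_gradient(nabla_rad: float, nabla_ad: float, efficiency: float,
                   tol: float = 1e-12) -> float:
    """Return nabla satisfying
        nabla + efficiency * (nabla - nabla_ad)**1.5 = nabla_rad,
    the dimensionless form of F_rad + F_con = F with order-unity factors
    set to one. 'efficiency' is the ratio of the convective to the
    radiative flux coefficient (called Lambda in the accompanying text).
    """
    if nabla_rad <= nabla_ad:
        # Convectively stable layer: radiation carries the whole flux.
        return nabla_rad
    lo, hi = nabla_ad, nabla_rad
    # The left-hand side increases monotonically with nabla, so bisect.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        residual = mid + efficiency * (mid - nabla_ad) ** 1.5 - nabla_rad
        if residual < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    nabla_ad, nabla_rad = 0.4, 1.0  # illustrative values only
    for efficiency in (0.1, 10.0, 1.0e6):
        nabla = solve_gradient(nabla_rad, nabla_ad, efficiency)
        print(f"efficiency = {efficiency:9.1e}: "
              f"nabla - nabla_ad = {nabla - nabla_ad:.3e}")
```

For large values of the efficiency the solution approaches \(\nabla _{\mathrm{ad}}\), as in the bulk of the solar convection zone; for small values it approaches \(\nabla _{\mathrm{rad}}\), corresponding to inefficient convection.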

This description obviously depends on the choice of \(\ell \); this is typically also regarded as a measure of the size of the convective elements. An almost universal, if not particularly strongly physically motivated, choice of \(\ell \) is to take it as a multiple of the pressure scale height,

$$\begin{aligned} \ell = \alpha _{\mathrm{ML}}H_p. \end{aligned}$$
(34)

From Eq. (32) it is obvious that \(F_{\mathrm{con}}\) then scales as \(\alpha _{\mathrm{ML}}^2\). Adjusting \(\alpha _{\mathrm{ML}}\) therefore modifies the convective efficacy and hence the superadiabatic gradient \(\nabla - \nabla _{\mathrm{ad}}\) required to transport the energy, thus fixing the specific entropy in the deeper parts of the convection zone. This in turn affects the structure of the convection zone, including its radial extent, and hence the radius of the star. As discussed in Sect. 2.6 the requirement that models of the present Sun have the correct radius is typically used to determine a value of \(\alpha _{\mathrm{ML}}\), which is then often used for the modelling of other stars.
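As a rough worked consequence of this scaling, consider the nearly adiabatic part of the convection zone, where \(F_{\mathrm{con}}\) is essentially fixed by the part of the total flux that radiation cannot carry at \(\nabla \simeq \nabla _{\mathrm{ad}}\). Assuming that all other local quantities in Eq. (32) are held fixed, which is an idealization made here only for illustration, inserting Eq. (34) gives

$$\begin{aligned} F_{\mathrm{con}}\propto \alpha _{\mathrm{ML}}^2 (\nabla - \nabla _{\mathrm{ad}})^{3/2} = \hbox {const} \quad \Rightarrow \quad \nabla - \nabla _{\mathrm{ad}}\propto \alpha _{\mathrm{ML}}^{-4/3} . \end{aligned}$$

Under these assumptions, doubling \(\alpha _{\mathrm{ML}}\) reduces the superadiabatic gradient by a factor \(2^{-4/3} \simeq 0.40\).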

In practice, further details are added. These involve a more complete thermodynamical description, the inclusion of factors of order unity in the relation for the average velocity and energy flux and expressions for the heat loss from the convective element. Although not of particular physical significance, the choice made for these aspects obviously affects the final expressions and must be taken into account in comparisons between different calculations, particularly when it comes to the value of \(\alpha _{\mathrm{ML}}\) required to calibrate the model. A detailed description of a commonly used formulation was provided by Böhm-Vitense (1958). It was pointed out by Gough and Weiss (1976) (see also Sect. 2.4) that solar models, with the appropriate calibration of the relevant convection parameters to obtain the proper radius, are largely insensitive to the details of the treatment of convection, although the specific values of \(\alpha _{\mathrm{ML}}\) may obviously differ. It is important to keep this in mind when comparing independent solar and stellar models. As an additional point I note that the preceding description is entirely local: it is assumed that \(F_{\mathrm{con}}\) is determined by conditions at a given point in the model, leading effectively to a relation of the form (9).

The motion of the convective elements also leads to transport of momentum which, when averaged, appears as a contribution to hydrostatic support in the form of a turbulent pressure of order

$$\begin{aligned} p_{\mathrm{t}}\sim \rho v^2 \sim {\rho \ell ^2 g \over H_p} (\nabla - \nabla _{\mathrm{ad}}) \; . \end{aligned}$$
(35)

Correspondingly, hydrostatic equilibrium, Eq. (1), is expressed in terms of \(p = p_{\mathrm{g}} + p_{\mathrm{t}}\), where \(p_{\mathrm{g}}\) is the thermodynamic pressure. On the other hand, the superadiabatic gradient \(\nabla - \nabla _{\mathrm{ad}}\) in Eqs. (32) and (35) is essentially a thermodynamic property and hence is determined by the gradient in \(p_{\mathrm{g}}\) or, if expressed in terms of p and \(p_{\mathrm{t}}\), the gradient of \(p_{\mathrm{t}}\). Consequently, including \(p_{\mathrm{t}}\) consistently in Eq. (1) increases the order of the system of differential equations within the convection zone, leading to severe numerical difficulties at the boundaries of the convection zone where the order changes (e.g., Stellingwerf 1976; Gough 1977b). A detailed analysis of the resulting singular points at the convection-zone boundaries was carried out by Gough (1977a). As a result, although the effect of the turbulent pressure on the hydrostatic structure has been included in some calculations based on a local treatment of convection (e.g., Henyey et al. 1965; Kosovichev 1995) \(\nabla - \nabla _{\mathrm{ad}}\) has generally been determined from the total pressure, thus avoiding the difficulties at the boundaries of the convection zone, but introducing some inconsistency (e.g. Baker and Gough 1979).

It is obvious that the local treatment of convection is an approximation, even in the simple physical picture employed here: a convective element senses conditions over a range of depths in the Sun during its motion; similarly, the convective flux at a given location must arise from an ensemble of convective elements originating at different depths. This indicates the need for a non-local description of convection, involving some averaging over the travel of a convective element and the elements contributing to the flux. Noting the similarity to the non-local nature of radiative transfer Spiegel (1963) proposed an approximation to this averaging akin to the Eddington approximation, leading to a set of local differential equations, albeit of higher order, to describe the convective properties (see also Gough 1977a). This was implemented by Balmforth and Gough (1991) and Balmforth (1992) (see footnote 35). An advantage of the non-local formulation is that it bypasses the singularities caused by a consistent treatment of turbulent pressure in a local convection formulation; interestingly, Balmforth (1992) showed that the common inconsistent local treatment has a non-negligible effect on the properties of the model, compared with the local limit of the non-local treatment.

Alternative formulations for the convective properties have been developed on the basis of statistical descriptions of turbulence, thus including the full spectrum of convective eddies (e.g., Xiong 1977, 1989; Canuto and Mazzitelli 1991; Canuto et al. 1996) (for a more detailed discussion of such Reynolds stress models, see Houdek and Dupret 2015). Even so, the descriptions typically contain an adjustable parameter, most commonly related to a length scale, allowing the calibration of the surface radius of solar models.

A more physical description of convection is possible through numerical simulation (see Nordlund et al. 2009; Freytag et al. 2012). In practice this is restricted to fairly limited regions near the stellar surface, and even then requires simplified descriptions of the behaviour on scales smaller than the numerical grid (see footnote 36). Detailed modelling, including radiative effects in the stellar atmosphere, has been carried out by, for example, Stein and Nordlund (1989, 1998) and Wedemeyer et al. (2004). This also includes treatments of the equation of state and opacity which are consistent with global stellar models and hence immediately allow comparison with such models. Magic et al. (2013) and Trampedach et al. (2013) presented extensive grids of simulations for a range of stellar parameters, covering the main sequence and the lower part of the red-giant branch.

The simulations provide an alternative to the usual simplified stellar atmosphere models, which are assumed to be time independent and homogeneous in the horizontal direction. A very interesting aspect is that spectral line profiles calculated from the simulations and suitably averaged are in excellent agreement with observations, without the conventional ad hoc inclusion of additional line broadening through ‘microturbulence’ (e.g., Asplund et al. 2000). Also, the simulations provide a very good fit to the observed solar limb darkening, i.e., the variation across the solar disk of the intensity (Pereira et al. 2013).

The simulations of solar near-surface convection typically extend sufficiently deeply to cover that part of the convection zone where the temperature gradient is substantially superadiabatic (see Fig. 12). Thus they essentially define the specific entropy of the adiabatic part of the convection zone and hence fix the depth of the convection zone. Rosenthal et al. (1999) utilized this by extending an averaged simulation by means of a mixing-length envelope. Interestingly, they found that the resulting convection-zone depth was essentially consistent with the depth inferred from helioseismology (cf. Sect. 5.1.2), thus indicating that the simulation had successfully matched the actual solar adiabat.

As a generalization of these investigations, the simulations can be included in stellar modelling through grids of atmosphere models or suitable parameterization of simple formulations. A convenient procedure is to determine an effective mixing-length parameter \(\alpha _{\mathrm{ML}}(T_{\mathrm{eff}}, g)\) as a function of effective temperature and surface gravity, such as to reproduce the entropy of the adiabatic part of the convection zone (e.g., Ludwig et al. 1999, 2008; Trampedach et al. 1999, 2014a; Magic et al. 2015). It should be noted that since \(\alpha _{\mathrm{ML}}\) determines the entropy jump from the atmosphere to the interior of the convection zone, this calibration is intimately tied to the assumed atmospheric structure, e.g., specified by a \(T(\tau )\) relation also obtained from the simulations (Trampedach et al. 2014b). As an example, Fig. 11 shows the calibrated \(\alpha _{\mathrm{ML}}\) obtained by Trampedach et al. (2014a), as a function of \(T_{\mathrm{eff}}\) and \(\log g\). Interestingly, the variation of \(\alpha _{\mathrm{ML}}\) is modest in the central part of the diagram, along the evolution tracks of stars close to solar. Preliminary evolution calculations using these calibrations were carried out by Salaris and Cassisi (2015) and Mosumgaard et al. (2017, 2018). A similar analysis based on the calibration of the mixing-length parameter was carried out by Spada et al. (2018). As an alternative to using the fitted mixing length, Jørgensen et al. (2017) developed a method to include in stellar modelling the averaged structure of the near-surface layers obtained by interpolating in a grid of simulations. This was used by Jørgensen et al. (2018) to calculate a solar-evolution model incorporating such averaged structure in all models along the evolution track; similarly, Mosumgaard et al. (2020) calculated stellar evolution tracks for a range of masses, including the interpolated simulations along the evolution.

Fig. 11

Image reproduced with permission from Trampedach et al. (2014a), copyright by the authors

Mixing-length parameter \(\alpha _{\mathrm{ML}}\) obtained by fitting averaged 3D radiation-hydrodynamical simulations to stellar envelope models based on the Böhm-Vitense (1958) mixing-length treatment, shown using the colour scale, against effective temperature \(T_{\mathrm{eff}}\) (on a logarithmic scale) and \(\log g\). This is based on a fit to the simulations indicated by asterisks and the solar simulation shown with \(\odot \). Stellar evolution tracks, computed with the MESA code (Paxton et al. 2011), are shown for masses between 0.65 and \(4.5 \,M_\odot \), as indicated; the dashed segments mark pre-main-sequence evolution.

Apart from the calibration to match the solar radius (cf. Sect. 2.6) tests of the mixing-length parameter and its possible dependence on stellar properties can be carried out by comparing observations and models of red giants, whose effective temperature depends on the assumed \(\alpha _{\mathrm{ML}}\) (Salaris et al. 2002). A recent analysis was carried out by Tayar et al. (2017) based on APOGEE and Kepler observations, comparing with models computed with the YREC code (van Saders and Pinsonneault 2012). The model fits indicated a significant dependence on stellar metallicity, with \(\alpha _{\mathrm{ML}}\) increasing with increasing metallicity. Interestingly, calibrations based on 3D simulations (Magic et al. 2015) did not show this trend, nor did the results obtained by Tayar et al. match the values obtained by Trampedach et al. (2014a), shown in Fig. 11. However, it should be recalled that the effect of \(\alpha _{\mathrm{ML}}\) on stellar structure depends on other parameters in the mixing-length treatment, as well as on the assumed atmospheric structure and physics of the near-surface layers. Thus comparison of numerical values of \(\alpha _{\mathrm{ML}}\) or trends with, e.g., metallicity requires some care; the discrepancies may be caused by differences in other aspects of the modelling. In fact, in a detailed analysis Salaris et al. (2018), carefully taking into account the other uncertainties in the modelling of the near-surface layers, were unable to reproduce the results of Tayar et al. (2017); on the other hand, they did find some issues when \(\alpha \)-enhanced stars were included in the sample.

A comparison between different formulations of near-surface convection is provided in Fig. 12, in a format introduced by Gough and Weiss (1976). The complete solar models, corresponding to Model S of Christensen-Dalsgaard et al. (1996), have been calibrated to the same solar radius (cf. Sect. 2.6) through the adjustment of suitable parameters; this yields a depth of the convection zone which is essentially consistent with the helioseismically determined value. Evidently, regardless of the convection treatment the region of substantial superadiabatic gradient \(\nabla - \nabla _{\mathrm{ad}}\) is confined to the near-surface layers, as would also be predicted from the simple analysis given above (cf. Eq. 32). Using the Canuto and Mazzitelli (1991) formulation leads to a rather higher and sharper peak in the superadiabatic gradient than for the Böhm-Vitense (1958) mixing-length formulation. On the other hand, it is striking that the detailed behaviour of the averaged superadiabatic gradient resulting from the Trampedach et al. (2013) simulation is in reasonable agreement with the results of the calibrated mixing-length treatment. As already noted, it also appears to lead to the correct adiabat in the deeper parts of the convection zone.

Fig. 12

(Adapted from Gough and Weiss 1976)

Properties of the solar convection zone. The lower abscissa is depth below the location where the temperature equals the effective temperature, whereas the upper abscissa is pressure p. The solid curve shows the superadiabatic gradient \(\nabla - \nabla _{\mathrm{ad}}\) in Model S of Christensen-Dalsgaard et al. (1996), using the Böhm-Vitense (1958) mixing-length treatment of convection, and the horizontal arrows indicate the extents of the hydrogen and helium ionization zones in this model. Also, the short-dashed curve shows \(\nabla - \nabla _{\mathrm{ad}}\) in a model corresponding to Model S, including calibration to the same surface radius, but using the Canuto and Mazzitelli (1991) treatment of convection, and the heavy long-dashed curve shows \(\nabla - \nabla _{\mathrm{ad}}\) in an average model resulting from hydrodynamical simulations of near-surface convection (Trampedach et al. 2013). The heavy dot-dashed line shows the mean superadiabatic gradient in a hydrodynamical simulation (Featherstone and Hindman 2016), excluding the outer parts of the convection zone; the initial increase in the most shallow part of the simulation is an artifact of the imposed boundary condition.

Physically realistic simulations of near-surface convection have been carried out extending over 96 Mm in the horizontal direction, thus for the first time also including the scale of supergranules, and to a depth of 20 Mm, around 10% of the convection zone (Stein et al. 2006, 2009; Nordlund and Stein 2009; see footnote 37). Simulations have also been carried out which cover the bulk of the convection zone, but excluding the near-surface region: it is very difficult to include the very disparate range of temporal and spatial scales needed to cover the entire convection zone. Also, the microphysics of such simulations are typically somewhat simplified. On the other hand, the simulations take rotation into account, in an attempt to model the transport of angular momentum and hence the source of the surface differential rotation (cf. Eq. 11) and the variation of rotation within the convection zone (see also Sect. 5.1.4). A detailed review of these simulations was provided by Miesch (2005). As an example of their relation to global solar structure, Fig. 12 includes the average superadiabatic gradient from such a simulation, appropriately located relative to the global models. Apart from boundary effects the simulation is clearly in relatively good agreement with the simplified treatment, in particular confirming that this part of the model is very nearly isentropic.

An interesting issue was raised by Hanasoge et al. (2012) concerning the validity of the deeper convection simulations: based on local helioseismology (see Gizon and Birch 2005) using the time-distance technique they obtained estimates of the convective velocity one or two orders of magnitude lower than obtained in the simulations, or indeed predicted from the simple estimate in Eq. (32). This was questioned by Greer et al. (2015), who used the ring-diagram technique and obtained results similar to those of the simulations. However, Hanasoge et al. (2020) showed, using a helioseismic technique based on coupling of mode eigenfunctions, that large-scale turbulence in the Sun is strongly suppressed compared with the results of global numerical simulations. Thus there is increasing observational evidence for possible limitations in our understanding of the dynamics of convection in the Sun, in particular at larger scales, where there is essentially no observational evidence for structured flows, unlike what is seen in global simulations of the solar convection zone (for a review, see Miesch 2005). A review of the helioseismic inferences of solar convection was provided by Hanasoge et al. (2016). Simulations by Cossette and Rast (2016) indicated that supergranules might be the largest coherent scales of convection, with energy transport in the deeper, essentially adiabatically stratified, parts of the convection zone being dominated by colder compact downflowing plumes. For a recent short review on solar convection, see Rast (2020).

Calibration of solar models

The Sun is unique amongst stars in that we have accurate determinations of its mass, radius and luminosity and an independent and relatively precise measure of its age from age determinations of meteorites (see Sect. 2.2). It is obvious that solar models should satisfy these constraints, as well as other observed properties of the Sun, particularly the present ratio between the abundances of heavy elements and hydrogen. Ideally, the constraints would provide tests of the models; in practice, the modelling includes a priori three unknown parameters which must be adjusted to match the observed properties: the initial hydrogen and heavy-element abundances \(X_0\) and \(Z_0\) and a parameter characterizing the efficacy of convection (see Sect. 2.5). This adjustment constitutes the calibration of solar models.

Some useful understanding of the sensitivity of the models to the parameters can be obtained from simple homology arguments (e.g., Kippenhahn et al. 2012). According to these, the luminosity approximately scales with mass and composition as

$$\begin{aligned} L \propto Z^{-1} (1 + X)^{-1} M^{5.5} \mu ^{7.5}, \end{aligned}$$
(36)

assuming Kramers opacity, with \(\kappa \propto Z (1 + X) \rho T^{-3.5}\), and with \(\mu \) given by Eq. (13). Obviously, the strong sensitivity to the average mean molecular weight means that relatively modest changes in the helium abundance can lead to the correct luminosity.
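As a rough numerical illustration of this sensitivity, the following Python sketch evaluates the scaling in Eq. (36) for two hydrogen abundances at fixed mass and heavy-element abundance. It assumes that Eq. (13) has the usual fully ionized form \(\mu = 4/(3 + 5X - Z)\); the function names and the trial values \(X = 0.70\), 0.71 and \(Z = 0.02\) are chosen here purely for illustration and do not correspond to a calibrated model.

```python
# Minimal sketch of the homology scaling in Eq. (36); only ratios are meaningful.


def mean_molecular_weight(X, Z):
    """Mean molecular weight of a fully ionized gas (assumed form of Eq. 13)."""
    return 4.0 / (3.0 + 5.0 * X - Z)


def luminosity_scale(X, Z, M=1.0):
    """Quantity proportional to L in Eq. (36), with M in arbitrary units."""
    mu = mean_molecular_weight(X, Z)
    return M**5.5 * mu**7.5 / (Z * (1.0 + X))


def luminosity_ratio(X_new, X_ref, Z=0.02, M=1.0):
    """Ratio L(X_new) / L(X_ref) at fixed Z and M."""
    return luminosity_scale(X_new, Z, M) / luminosity_scale(X_ref, Z, M)


if __name__ == "__main__":
    # Illustrative values only: raise X from 0.70 to 0.71 at Z = 0.02,
    # i.e. lower the helium abundance Y = 1 - X - Z by 0.01.
    ratio = luminosity_ratio(0.71, 0.70)
    print(f"L(X=0.71) / L(X=0.70) = {ratio:.3f}")
```

Under these assumptions a reduction of the helium abundance by 0.01 lowers the luminosity by roughly 6 per cent, almost entirely through the \(\mu ^{7.5}\) factor.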

As discussed above, the efficacy of convection in the near-surface layers determines the specific entropy in the adiabatic part of the convection zone and hence the structure of the convection zone, thus controlling its extent and hence the radius of the model. (When the composition is fixed by obtaining the correct luminosity the extent of the radiative interior is largely determined.) With increasing efficacy the superadiabatic temperature gradient \(\nabla - \nabla _{\mathrm{ad}}\) required to transport the flux is decreased; hence the temperature in the convection zone is generally lower, the density (at given pressure) therefore higher, and the mass of the convection zone occupies a smaller volume, and hence a smaller extent in radius. Thus the radius of the model decreases with increasing efficacy. The actual reaction of the model is substantially more complex but leads to the same qualitative result.

As discussed in Sect. 2.5, the treatment of convection and hence the properties of the superadiabatic temperature gradient are typically obtained from the mixing-length treatment. According to Eqs. (32) and (34), assuming that \(F_{\mathrm{con}}\) carries most of the flux and is therefore essentially fixed, an increase in \(\alpha _{\mathrm{ML}}\) causes an increase in the convective efficacy and hence a decrease in \(\nabla - \nabla _{\mathrm{ad}}\), corresponding, according to the above argument, to a decrease in the model radius. Thus by adjusting \(\alpha _{\mathrm{ML}}\) a model with the correct radius can be obtained. In other simplified convection treatments, such as that of Canuto and Mazzitelli (1991), a similar efficiency parameter is typically introduced to allow radius calibration. When \(\alpha _{\mathrm{ML}}\) is obtained through fitting to 3D simulations (cf. Fig. 11) there is no a priori guarantee that this yields the value required to obtain the correct solar radius. In this case a correction factor can be applied to achieve the proper solar calibration (Mosumgaard et al. 2017). Of course, if the simulations provide a good representation of the outermost layers of the Sun, as already found to be the case by Rosenthal et al. (1999), this factor would be close to one, as has indeed been found in practice. The same correction factor is then applied when the fit to the 3D simulations is used for more general stellar modelling.

The details of the calibration depend on whether or not diffusion and settling are included. If these effects are ignored the surface composition of the model hardly changes between the zero-age main sequence and the present age of the Sun. Although the present surface abundance \(X_{\mathrm{s}}\) of hydrogen is affected by the calibration of \(X_0\) the range of variation is typically so small that it can be ignored, and the (constant, in space and time) value of the heavy-element abundance, and hence \(Z_0\), is fixed from \(Z_{\mathrm{s}}/X_{\mathrm{s}}\) and some suitable characteristic value of X. On the other hand, if diffusion and settling are included the change in the convection-zone composition must be taken into account and the value of \(Z_0\) must be adjusted to match properly \(Z_{\mathrm{s}}/X_{\mathrm{s}}\).

The formal calibration problem is then, when including diffusion and settling, to determine the set of parameters \(\{p_i\} = \{X_0, Z_0, \alpha _{\mathrm{ML}}\}\) to match the observables \(\{o_k\} = \{L_{\mathrm{s}}, Z_{\mathrm{s}}/X_{\mathrm{s}}, R\}\) to the solar values \(\{o_k^\odot \} = \{L_{\mathrm{s, \odot }}, (Z_{\mathrm{s}}/X_{\mathrm{s}})_\odot , R_{\odot }\}\). (Specifically, R is here taken to be the photospheric radius, defined at the point in the model where \(T = T_{\mathrm{eff}}\), the effective temperature.) This is greatly simplified by the fact that variations in the parameters generally are fairly limited. Thus in practice the corrections \(\{\delta p_i\}\) to the parameters can be found from the errors in the observables, using a fixed set of derivatives, as

$$\begin{aligned} \delta p_i = \sum _k (o_k^\odot -o_k) {\partial p_i \over \partial o_k}, \end{aligned}$$
(37)

where the derivatives \(\{\partial p_i / \partial o_k\}\) are obtained by varying the parameters in turn and inverting the resulting derivative matrix \(\{\partial o_k / \partial p_i\}\). I have found that the following values secure relatively rapid convergence of the iteration:

$$\begin{aligned} \begin{array}{ccc} \displaystyle {\partial \ln \alpha _{\mathrm{ML}}\over \partial \ln L_{\mathrm{s}}} = 1.15 &{} \displaystyle {\partial \ln \alpha _{\mathrm{ML}}\over \partial \ln R} = -4.70 &{} \displaystyle {\partial \ln \alpha _{\mathrm{ML}}\over \partial \ln (Z_{\mathrm{s}}/X_{\mathrm{s}})} = 0.148 \\ \displaystyle {\partial \ln X_0 \over \partial \ln L_{\mathrm{s}}} = -0.137 &{} \displaystyle {\partial \ln X_0 \over \partial \ln R} = -0.087 &{} \displaystyle {\partial \ln X_0 \over \partial \ln (Z_{\mathrm{s}}/X_{\mathrm{s}})} = -0.132 \\ \displaystyle {\partial \ln Z_0 \over \partial \ln L_{\mathrm{s}}} = -0.111 &{} \displaystyle {\partial \ln Z_0 \over \partial \ln R} = 0.275 &{} \displaystyle {\partial \ln Z_0 \over \partial \ln (Z_{\mathrm{s}}/X_{\mathrm{s}})} = 0.864. \end{array} \end{aligned}$$
(38)

These derivatives are incorporated in the ASTEC code (Christensen-Dalsgaard 2008) and allow efficient and automatic calculation of calibrated solar models. In the case where no iteration for \(Z_0\) is carried out the following values have been used:

$$\begin{aligned} \begin{array}{cc} \displaystyle {\partial \ln \alpha _{\mathrm{ML}}\over \partial \ln L_{\mathrm{s}}} = 1.17 &{} \displaystyle {\partial \ln \alpha _{\mathrm{ML}}\over \partial \ln R} = -4.75 \\ \displaystyle {\partial \ln X_0 \over \partial \ln L_{\mathrm{s}}} = -0.154 &{} \displaystyle {\partial \ln X_0 \over \partial \ln R} = -0.045. \end{array} \end{aligned}$$
(39)

Convergence to a relative precision of \(10^{-7}\) is typically obtained in 5–7 iterations.
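A single correction step of this calibration can be sketched in Python as follows. The sketch assumes that Eq. (37) is applied in logarithmic form, \(\delta \ln p_i = \sum _k (\ln o_k^\odot - \ln o_k) \, \partial \ln p_i / \partial \ln o_k\), which is the natural reading given that Eq. (38) quotes logarithmic derivatives; the dictionary keys, function names and trial values are hypothetical, and the code is not an excerpt from ASTEC. In an actual calibration each step would be followed by a full evolution calculation with the corrected parameters.

```python
import math

# Logarithmic derivatives d ln p_i / d ln o_k quoted in Eq. (38).
# Keys "L", "R" and "ZX" stand for L_s, R and Z_s/X_s (labels chosen here).
DERIVATIVES = {
    "alpha_ML": {"L": 1.15, "R": -4.70, "ZX": 0.148},
    "X0": {"L": -0.137, "R": -0.087, "ZX": -0.132},
    "Z0": {"L": -0.111, "R": 0.275, "ZX": 0.864},
}


def correct_parameters(params, model_obs, solar_obs):
    """Apply one correction step, Eq. (37), in logarithmic form (assumed)."""
    corrected = {}
    for name, value in params.items():
        dlnp = sum(
            DERIVATIVES[name][key] * math.log(solar_obs[key] / model_obs[key])
            for key in solar_obs
        )
        corrected[name] = value * math.exp(dlnp)
    return corrected


def converged(model_obs, solar_obs, tol=1e-7):
    """True when all observables match the targets to relative precision tol."""
    return all(abs(model_obs[k] / solar_obs[k] - 1.0) < tol for k in solar_obs)


if __name__ == "__main__":
    # Hypothetical trial model: radius 1 per cent too large, rest on target.
    # L and R are in solar units; the Z_s/X_s value is illustrative only.
    solar = {"L": 1.0, "R": 1.0, "ZX": 0.0245}
    model = {"L": 1.0, "R": 1.01, "ZX": 0.0245}
    trial = {"alpha_ML": 1.8, "X0": 0.71, "Z0": 0.02}
    print(correct_parameters(trial, model, solar))
```

For a trial model whose radius is too large the step increases \(\alpha _{\mathrm{ML}}\), in line with the qualitative argument that a higher convective efficacy reduces the model radius.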

The evolution of the Sun

To set the scene for this brief overview of solar evolution it is useful to recall the characteristic timescales of stars. Departure from hydrostatic equilibrium causes motion on a dynamical timescale, of order

$$\begin{aligned} t_{\mathrm{dyn}} = \left( {R^3 \over G M} \right) ^{1/2} \simeq 30 \, \mathrm{min} \left( {R \over \,R_\odot }\right) ^{3/2} \left( {M \over \,M_\odot }\right) ^{-1/2}. \end{aligned}$$
(40)

Evolution in phases where the energy is provided by release of gravitational energy happens on the Kelvin–Helmholtz timescale, of order

$$\begin{aligned} t_{\mathrm{KH}} = {G M^2 \over L R} \simeq 3 \times 10^7 \,\mathrm{year}\left( {M \over \,M_\odot } \right) ^2 \left( {R \over \,R_\odot }\right) ^{-1} \left( {L_{\mathrm{s}}\over \,L_\odot }\right) ^{-1}. \end{aligned}$$
(41)

As a result of the virial theorem (e.g., Kippenhahn et al. 2012) this is also the timescale for the cooling of the star as a result of loss of thermal energy. Finally, the timescale for nuclear burning on the main sequence can be estimated as

$$\begin{aligned} t_{\mathrm{nuc}} = {Q_{\mathrm{H}} q_{\mathrm{c}} X_0 M \over L} \simeq 10^{10} \,\mathrm{year}{M \over \,M_\odot } \left( {L_{\mathrm{s}}\over \,L_\odot }\right) ^{-1}, \end{aligned}$$
(42)

where \(Q_{\mathrm{H}}\) is the energy released per unit mass of consumed hydrogen and \(q_{\mathrm{c}} \simeq 0.1\) is the fraction of stellar mass that is involved in nuclear burning on the main sequence. Later stages of hydrogen burning typically involve smaller fractions of the mass and take place at higher luminosity and consequently have shorter duration; also, the burning of elements heavier than hydrogen releases far less energy per unit mass and the corresponding phases are therefore also relatively short.
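The three estimates in Eqs. (40)–(42) are easily checked numerically. The Python sketch below evaluates them for solar values in cgs units. The constants are rounded standard values, while \(Q_{\mathrm{H}} \approx 6.3 \times 10^{18} \, \mathrm{erg \, g^{-1}}\) and \(X_0 \approx 0.7\) are assumptions made here for illustration; only \(q_{\mathrm{c}} \simeq 0.1\) is taken from the text.

```python
import math

# Physical constants and solar values in cgs units (rounded standard values).
G = 6.674e-8      # gravitational constant [cm^3 g^-1 s^-2]
M_SUN = 1.989e33  # solar mass [g]
R_SUN = 6.957e10  # solar radius [cm]
L_SUN = 3.828e33  # solar luminosity [erg s^-1]
YEAR = 3.156e7    # seconds per year

# Inputs for Eq. (42); Q_H and X0 are assumed illustrative values.
Q_H = 6.3e18      # energy release per gram of hydrogen consumed [erg g^-1]
X0 = 0.7          # initial hydrogen abundance by mass
Q_C = 0.1         # fraction of the mass involved in burning, from the text


def t_dyn(M, R):
    """Dynamical timescale, Eq. (40), in seconds."""
    return math.sqrt(R**3 / (G * M))


def t_kh(M, R, L):
    """Kelvin-Helmholtz timescale, Eq. (41), in seconds."""
    return G * M**2 / (L * R)


def t_nuc(M, L):
    """Main-sequence nuclear timescale, Eq. (42), in seconds."""
    return Q_H * Q_C * X0 * M / L


if __name__ == "__main__":
    print(f"t_dyn = {t_dyn(M_SUN, R_SUN) / 60.0:.1f} min")
    print(f"t_KH  = {t_kh(M_SUN, R_SUN, L_SUN) / YEAR:.2e} yr")
    print(f"t_nuc = {t_nuc(M_SUN, L_SUN) / YEAR:.2e} yr")
```

With these inputs the dynamical and Kelvin–Helmholtz timescales come out close to the quoted 30 min and \(3 \times 10^7\) years, and the nuclear timescale is of order \(10^{10}\) years; the precise value of the latter depends on the assumed \(Q_{\mathrm{H}}\), \(q_{\mathrm{c}}\) and \(X_0\).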

Pre-main-sequence evolution

Stars, including the Sun, are born from the collapse of gas and dust in dense and cold molecular clouds. Brief reviews of star formation were provided by, for example, Lada and Shu (1990) and Stahler (1994); for an extensive review, see McKee and Ostriker (2007). The collapse is triggered by gravitational instabilities, likely through turbulence which may have been induced by supernova explosions (Padoan et al. 2016). Detailed simulations by Li et al. (2018) of star formation in externally driven turbulence successfully reproduced the common filamentary structure of interstellar clouds and the statistical properties of newly formed stellar systems. Evidence for the presence at the birth of the solar system of a nearby supernova, which may have contributed to the dynamics leading to the formation of the Sun, is provided by decay products of short-lived radioactive nuclides found in meteorites (e.g., Goswami and Vanhala 2000; Goodson et al. 2016), allowing a remarkably precise dating of different components of the early solar system (Connelly et al. 2012). Further diagnostics of the early history of the solar system are provided by the ratios of oxygen isotopes (Gounelle and Meibom 2007); in situ measurements of the solar wind by the Genesis spacecraft appear to have further complicated the picture (Gaidos et al. 2009). A detailed review of the environment of solar-system formation was given by Adams (2010).

The collapse of the cloud results in the formation of a core which subsequently accretes matter from the surrounding cloud; detailed simulations of these early phases of stellar evolution have been carried out by, for example, Baraffe et al. (2009). The angular momentum of the infalling material probably leads to the formation of a disk around the star while processes likely involving magnetic fields often result in outflow from the proto-star in highly collimated jets along the rotation axis (Shu et al. 2000), giving rise to the so-called Herbig–Haro objects (e.g., Reipurth and Bally 2001). The gravitational energy released in the contraction of the protostar partly goes to heating it up and is partly released as radiation from the star; the radiation finally stops the accretion and blows away the surrounding material, such that the star becomes directly observable: the star has reached the ‘birth line’.

In these early phases matter in the protostar is relatively cool, leading to a high opacity, and the luminosity is rather large. Consequently, models of the star in this phase are generally fully convective, evolving down the so-called Hayashi line (Hayashi and Hoshi 1961) with contraction at roughly constant effective temperature, and material in the star is fully mixed. In this phase the temperature in the core reaches a point where deuterium burning can take place, but since the initial deuterium content is tiny (around \(1.6 \times 10^{-5}\) of the hydrogen abundance), the energy release has little effect on the evolution. With further contraction the temperature in the central parts of the star becomes so high that convection ceases and the star develops a gradually growing central radiative region. In this initial contraction, where energy for the luminosity and the heating of stellar material is provided by release of gravitational energy, evolution takes place on the Kelvin–Helmholtz timescale (cf. Eq. 41), along the so-called Henyey line (Henyey et al. 1955) at increasing effective temperature and luminosity. With the onset of substantial nuclear energy release, readjustments of the structure of the star lead to a reduction in luminosity, and the star settles on the zero-age main sequence (ZAMS). These early evolutionary phases are illustrated in Fig. 13. An extensive description of star formation, although possibly not completely up to date, was given by Stahler and Palla (2004).

Fig. 13

(Adapted from Aerts et al. (2010); data courtesy of A. Miglio.)

Pre-main-sequence evolution of stars with masses between 0.8 and \(2 \,M_\odot \), as indicated, computed with the Liège stellar evolution code CLÉS (Scuflaire et al. 2008). The composition is \(X = 0.7\), \(Z = 0.02\). The crosses mark the age along the tracks, in steps of 1 Myr; the ages at the end of the tracks range from 87 Myr at \(0.8 \,M_\odot \) to 32 Myr at \(1.4 \,M_\odot \). The heavy dotted line is a sketch of the so-called birth line, as shown by Palla and Stahler (1993), where the star emerges in visible light from the material left over from its formation.

Interestingly, this somewhat simplistic picture has been questioned by more detailed modelling of the contraction phase, starting from the initial collapsing cloud. Wuchterl and Klessen (2001) and Wuchterl and Tscharnuter (2003) solved the spherically symmetric equations of radiation hydrodynamics, starting from a suitable isothermal model of the original cloud and following the formation of an optically thick protostellar core and the accretion of further matter on this core. They found that deuterium burning takes place during the accretion phase and that the model retains a substantial radiative core throughout the evolution; the later phases of the contraction are parallel to the fully convective Hayashi track, but at somewhat higher effective temperature. These calculations were criticized by Baraffe and Chabrier (2010) on the grounds of the assumed spherical symmetry of the infall. However, by considering episodic infall Baraffe et al. also found models with an early radiative core. Detailed 3D modelling of collapsing molecular clouds, coupled with spherically symmetric modelling of the resulting proto-stellar and pre-main-sequence evolution (Kuffmeier et al. 2018; Jensen and Haugbølle 2018) has confirmed the episodic nature of the accretion. Also, interestingly, the results provide a plausible explanation for the observed properties of young stellar clusters.

As discussed in Sect. 7.1, the detailed pre-main-sequence evolution could have important consequences for the interpretation of the present solar surface composition. Given the importance of rotation and disk formation, departures from spherical symmetry in the evolution of the star should clearly be taken into account in the modelling.

At the end of pre-main-sequence evolution, the temperature reaches a level where the full set of reactions in the PP chains (see Eqs. 24 and 25) sets in, supplying the energy lost from the stellar surface. At this point the contraction stops and the star enters its main-sequence evolution, with a balance between the nuclear energy generation and the energy loss from the surface, and hence taking place on a nuclear timescale.

It is likely that the early contraction, and the accretion of matter in the disk, leads to an initial rapid rotation of the star. In fact, it is observed that young stars generally rotate much more rapidly than the present Sun. However, in young open clusters where the stars may be assumed to share the same age substantial scatter in the rotation rates is found (e.g., Soderblom et al. 2001). This is a strong indication of the complex processes controlling the evolution of angular momentum in the initial phases of proto-stellar evolution, involving interactions between the star, the accreting disk and the outflows, likely of magnetic origin (Shu et al. 1994; Bodenheimer 1995), including magnetic locking between the outer layers of the star and the inner parts of a truncated accretion disk.

Disks are commonly observed around protostars, confirming also this part of the description (e.g., Greaves 2005; Williams and Cieza 2011). The ubiquitous presence of planetary systems around other stars (Batalha 2014; Winn and Fabrycky 2015) strongly suggests that the formation of planets in such protoplanetary disks is a common phenomenon. This likely takes place through the formation and subsequent coalescence of dust grains into objects of increasing size, and finally the formation of a planetary system (Lissauer 1993; Alibert et al. 2005; Montmerle et al. 2006; Johansen and Lambrechts 2017). Detailed discussions of the properties of such disks and the formation of planets were provided by Armitage (2011, 2017). Dramatic illustrations of these planet-forming processes have been obtained with the Atacama Large Millimeter/submillimeter Array (ALMA) high-resolution observations (e.g., ALMA Partnership et al. 2015; Isella et al. 2016; Harsono et al. 2018). An example is illustrated in Fig. 14; modelling by Dipierro et al. (2015) showed that the observed gaps are indeed consistent with the presence of newly formed planets. The planet-forming processes probably happen on a timescale comparable with, or shorter than, the gravitational contraction of the star. Thus the ages of meteorites as determined from radioactive dating likely provide good measures of the age of the Sun since it arrived on the main sequence.

Fig. 14

Image reproduced with permission from ALMA Partnership et al. (2015), copyright by AAS

ALMA observations, at a wavelength of 1 mm, of the planet-forming disk around the young star HL Tau. The lower-left inset shows the resolution.

Main-sequence evolution

The evolution after the arrival on the main sequence, past the present age of the Sun, is illustrated in Fig. 15. This is based on a model corresponding to Model S of Christensen-Dalsgaard et al. (1996), discussed in more detail in Sect. 4.1. Additional information about the variation with time of key quantities, normalized to values for the present Sun, is provided in Fig. 16. The evolution is obviously driven by the gradual conversion of hydrogen into helium in the core, leading to an increase in the mean molecular weight of matter in the core. This leads to a contraction of the core, an increase in the central density and temperature and, in accordance with Eq. (36), to an increase in the luminosity. This evolution may be understood in simple terms by noting, from Eq. (12), that the increase in \(\mu \) would cause a decrease in pressure inconsistent with hydrostatic balance, unless compensated for by an increase in \(\rho \) and T resulting from the contraction of the core. The increase in temperature, although partly counteracted by the decrease in X, leads to an increase in the energy-generation rate and, more importantly, to an increase in the radiative conductivity, and hence to the increase in the luminosity. Thus this effect is basic to the main-sequence evolution of stars; unless non-standard effects (such as mass loss; see Sect. 6.5) are relevant there is hardly any doubt that the solar luminosity has undergone a fairly substantial increase since the formation of the solar system. A detailed analysis of this behaviour, in terms of homology scaling, was provided by Gough (1990b).

Fig. 15

Evolution track in the Hertzsprung–Russell diagram of a model sequence passing through Model S of the present Sun (Christensen-Dalsgaard et al. 1996, see also Sect. 4). Diamonds mark models separated by \(1 \,\mathrm{Gyr}\) in age, and after an age of \(10 \,\mathrm{Gyr}\) plus symbols are at intervals of \(0.1 \,\mathrm{Gyr}\). The Sun symbol (\(\odot \)) indicates the location of the present Sun and the star shows the point where hydrogen has been exhausted at the centre

Such a change in the solar energy reaching the Earth might be expected to have climatic effects; in fact, a naive estimate based on black-body radiative balance indicates that the change of 30% in solar luminosity shown in Fig. 16 would cause a change of around 7% in the surface temperature of the Earth, i.e., around 20 K. Thus one might expect that the Earth was very substantially colder early in its history. In fact, already Schwarzschild et al. (1957) noted that, since in their calculations the solar luminosity two billion years ago was about 20% less than now, “[t]he average temperature on the earth’s surface must then have been just about at the freezing point of water, if we assume that it changes proportionally to the fourth root of the solar luminosity. Would such a low average temperature have been too cool for the algae known to have lived at that time?” In contrast to these models, the terrestrial surface temperature shows no indication of dramatic changes over the past 4 Gyr, with evidence for liquid water in even very old geological material (Mojzsis et al. 2001; Wilde et al. 2001; Rosing and Frei 2004). This problem has been dubbed ‘the faint early Sun problem’ (see also Güdel 2007), and led to speculations about errors in our understanding of stellar evolution. It seems more likely, however, that the problem lies in the simplistic climate models used for these estimates of the temperature of the early Earth (e.g., Sagan and Mullen 1972). With a substantially stronger early greenhouse effect, perhaps caused by a higher content of \(\mathrm{CO_2}\), the present temperature could have been reached with a lower energy input. Modelling of the early terrestrial atmosphere by von Paris et al. (2008) suggested that the required abundances of greenhouse gases may be consistent with geological evidence. This was questioned by Rosing et al. (2010) who suggested that the dominant effect was a reduced cloud cover and hence lower terrestrial mean albedo than at present, resulting in a fainter Sun providing sufficient heating to achieve the required surface temperature on Earth. Shaviv (2003) and Svensmark (2006) noted that modulation of galactic cosmic rays by an initially stronger solar wind could have contributed to the warming of the early Earth, by similarly reducing the cloud cover. Variations with time of solar activity and their possible effects on planetary atmospheres were also discussed by Güdel (2007). There remains the problem of explaining the apparent stability of Earth’s temperature despite the variation in solar luminosity. Various feedback mechanisms of a geological nature have been proposed that may account for this (e.g., Walker et al. 1981), involving climate-dependent weathering of rocks and \(\mathrm{CO_2}\) outgassing from volcanoes; a detailed review of these processes was provided by Kump et al. (2000).Footnote 38 A comprehensive review of the ‘faint early Sun problem’ was provided by Feulner (2012).
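The naive black-body estimate quoted above can be reproduced in a few lines. The sketch below assumes only that the equilibrium surface temperature scales as the fourth root of the solar luminosity; the present mean surface temperature of 288 K is an assumed round value that is not taken from the text, and any change in greenhouse effect or albedo is ignored.

```python
# Naive black-body scaling: surface temperature proportional to L**(1/4).
# T_NOW_K is an assumed round value for the present mean surface temperature.
T_NOW_K = 288.0


def fractional_temperature_change(luminosity_ratio):
    """Fractional change in T for a luminosity ratio, assuming T ~ L**0.25."""
    return luminosity_ratio ** 0.25 - 1.0


def scaled_temperature(luminosity_ratio, t_now=T_NOW_K):
    """Surface temperature for a luminosity L/L_now under the same scaling."""
    return t_now * luminosity_ratio ** 0.25


if __name__ == "__main__":
    # A 30% change in luminosity, as in the estimate based on Fig. 16
    change = fractional_temperature_change(1.3)
    print(f"30% luminosity change: {100 * change:.1f}% in T, "
          f"{T_NOW_K * change:.1f} K")
    # A luminosity 20% below the present value (Schwarzschild et al. 1957)
    print(f"L = 0.8 L_now: T = {scaled_temperature(0.8):.1f} K")
```

With these assumptions a 30% change in luminosity gives a temperature change just below 7%, or roughly 20 K, and a luminosity 20% below the present value gives a temperature close to the freezing point of water, in line with the quoted remark.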

Fig. 16

Variation with age of quantities, normalized to the value at the present age of the Sun, in a \(1 \,M_\odot \) evolution sequence, including Model S of the present Sun (see Sect. 4.1). The top panel shows the evolution up to just after the present age, whereas the bottom panel continues the evolution beyond the exhaustion of hydrogen at the centre. Line styles and colours are indicated in the figure. R and \(L_{\mathrm{s}}\) are photospheric radius and surface luminosity, \(d_{\mathrm{cz}}\) is the depth of the convective envelope, in units of the surface radius, and \(T_{\mathrm{c}}\), \(X_{\mathrm{c}}\), \(\rho _{\mathrm{c}}\), \(\epsilon _{\mathrm{c}}\) and \(\kappa _{\mathrm{c}}\) are central temperature, hydrogen abundance, density, energy-generation rate and opacity. Values in the present Sun for most of the quantities are given in Table 2; in addition, \(\epsilon _{\mathrm{c}} = 17.06\,\,\mathrm{erg}\,\,\mathrm{g}^{-1}\,\,\mathrm{s}^{-1}\) and \(\kappa _{\mathrm{c}} = 1.242\,\,\mathrm{cm}^2\,\,\mathrm{g}^{-1}\). At the end of the illustrated part of the evolution, the ratio \(\rho _{\mathrm{c}}/\rho _{\mathrm{c,\odot }}\) is around 340, corresponding to a central density of \(5.3 \times 10^4\,\,\mathrm{g}\,\,\mathrm{cm}^{-3}\)

Beyond the present Sun the increase in luminosity continues, as is evident from Figs. 15 and 16. Also, the radius increases monotonically during the central hydrogen burning. The evolution of the hydrogen-abundance profile is illustrated in Fig. 17. The nuclear reactions cause a gradual reduction of the hydrogen in the core, whereas helium settling, although fairly weak in the Sun, gives rise to an increase in the hydrogen abundance in the convection zone and the formation of a fairly sharp composition gradient at its base. When hydrogen is exhausted at the centre there is a gradual transition to hydrogen burning in a shell around a core consisting predominantly of helium; the core gradually grows in mass and contracts, leading to high central densities and a substantial degree of degeneracy, while the hydrogen-burning shell becomes quite thin. This enhances the increase in the stellar radius: for reasons that are not entirely understood (see, however, Faulkner 2004) the contraction of the core inside a burning shell leads to expansion of the region outside the shell. The resulting strong expansion of the stellar surface radius leads to a decrease of the effective temperature and strong increase in the depth of the convective envelope. The evolution initially takes place at nearly constant luminosity, on the so-called subgiant branch. Eventually, the star reaches a structure that, in terms of distance to the centre, is nearly fully convective, apart from a radiative core of very small radial extent; as a result, the star evolves towards higher luminosity with the increase in radius, parallel and close to the Hayashi track. At the final point illustrated in Fig. 16 the convective envelope extends over 68% of the mass, and 79% of the radius, of the model. As shown in Fig. 17 the resulting mixing with layers previously enriched in helium by settling leads to a reduction in the surface hydrogen abundance.

Fig. 17

Hydrogen abundance X against fractional mass m/M for a zero-age main-sequence model (dotted line), a model of age \(4.6 \,\mathrm{Gyr}\) (present Sun; solid line), a model of age \(9.5 \,\mathrm{Gyr}\), where hydrogen has just been exhausted at the centre (dashed line) and the model of age \(11.5 \,\mathrm{Gyr}\), the final model included in Fig. 15 (dot-dashed line). In the latter model the radiative core containing 32% of the stellar mass occupies only 21% of the stellar radius. The evolution sequence corresponds to Model S of Christensen-Dalsgaard et al. (1996, see Sect. 4.1)

For stars from slightly above solar mass and below there is a systematic decrease in the rotation rate with increasing age as the stars evolve on the main sequence (for stars of solar mass, see Skumanich 1972; Barnes 2003); this is assumed to result from the loss of angular momentum in a magnetized stellar wind (e.g., Kawaler 1998; Matt et al. 2015), presumably related to the generation of magnetic activity through dynamo action, as inferred in the Sun (for a review, see Charbonneau 2010). Regardless of the substantial spread in early rotation rates, these processes tend to lead to a well-defined rotation rate as a function of age and mass, after an initial converging phase (e.g., Gallet and Bouvier 2013). This forms the basis for gyrochronology, i.e., the determination of ages of stars based on their rotation periods (e.g., Barnes 2010; Epstein and Pinsonneault 2014). The details of these processes, and of the subsequent redistribution of angular momentum in the stellar interior, are highly uncertain, however (Charbonneau and MacGregor 1993; Gough and McIntyre 1998; Talon and Charbonnel 2003; Charbonnel and Talon 2005; Eggenberger et al. 2005). In the solar case the result of the angular-momentum loss and redistribution, as determined from helioseismology, is a nearly spatially unvarying rotation in the radiative interior, at a rate slightly below the equatorial surface rotation rate. These results, and their theoretical interpretation, are discussed in Sect. 5.1.4 in the light of helioseismic inferences of solar internal rotation. Interestingly, by combining asteroseismic determinations of stellar ages (cf. Sect. 7.2) with determinations of stellar rotation rates, van Saders et al. (2016) indicated that the steady decrease of rotation rate with increasing age slows down for stars older than a few Gyr, indicating a weakening of the magnetic braking. This would complicate the use of gyrochronology for age determination of stars older than the Sun. However, I note that Barnes et al. (2016) questioned the analysis by van Saders et al. (2016).Footnote 39 Also, Lorenzo-Oliveira et al. (2020) inferred a rotation rate matching the expectations for normal spin-down for the solar twin HIP 102152, with an age of 8 Gyr inferred from isochrone fitting; however, there may be some question about the precision of the age and the modelling of the spin-down (van Saders, private communication). Thus further work is clearly required to define the limits of applicability of gyrochronology.
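The principle of gyrochronology can be illustrated with a deliberately simple sketch. It assumes the classical square-root spin-down law, with rotation period proportional to the square root of age, which is commonly associated with Skumanich (1972) but is not spelled out in the text above. The solar rotation period of 25.4 d is an assumed round calibration value, and the mass dependence, the initial converging phase and any weakening of the braking at late ages are all ignored.

```python
import math

SUN_AGE_GYR = 4.6        # present solar age adopted for Model S in the text
SUN_PERIOD_DAYS = 25.4   # assumed round value for the solar rotation period


def period_at_age(age_gyr):
    """Rotation period (days) at a given age, assuming P ~ sqrt(age)."""
    return SUN_PERIOD_DAYS * math.sqrt(age_gyr / SUN_AGE_GYR)


def age_from_period(period_days):
    """Invert the assumed spin-down law to get an age (Gyr) from a period."""
    return SUN_AGE_GYR * (period_days / SUN_PERIOD_DAYS) ** 2


if __name__ == "__main__":
    for age in (1.0, 4.6, 8.0):
        print(f"age {age:4.1f} Gyr -> period {period_at_age(age):5.1f} d")
```

Because the inferred age scales as the square of the period, a given fractional error in the period translates into twice that fractional error in the age under this assumed law.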

Late evolutionary stages

The later evolution of stars of solar mass is discussed in detail by Kippenhahn et al. (2012). The specific case of the Sun was considered by, for example, Jørgensen (1991) and Sackmann et al. (1993). With continuing core contraction and expansion of the envelope the star moves up along the Hayashi track as a red giant, reaching a luminosity of more than \(2000 \,L_\odot \) (for a review of red-giant evolution, see Salaris et al. 2002); needless to say, this is incompatible with life on Earth. The helium core heats up, partly as a result of the contraction and partly through heating from the hydrogen-burning shell whose temperature is forced to increase to match the energy required by the increasing luminosity. When the core reaches a temperature of around \(80 \times 10^6 \,\mathrm{K}\) helium burning starts, in the triple-alpha reaction producing \({}^{12}\mathrm{C}\). Since the core is strongly degenerate the pressure is essentially independent of temperature; thus the heating associated with helium ignition initially has no effect on the pressure and the burning takes place in a run-away process, a helium flash, where the core luminosity exceeds \(10^{10} \,L_\odot \) for several hours. However, the energy released is absorbed as gravitational energy in expanding the inner parts of the star; together with a decrease in the energy production from the hydrogen shell-burning, this results in a drop of the surface luminosity. Detailed calculations of the complex evolution through this phase have been carried out by, for example, Schlattl et al. (2001) and Cassisi et al. (2003b), and are also possible in the general-purpose MESA stellar evolution code (Paxton et al. 2011). Hydrodynamical simulations in two and three dimensions of the evolution during the flash were made by Mocák et al. (2008, 2009), confirming the importance of core convection in carrying away the energy generated during the flash. 
Only when degeneracy is lifted by the increase in temperature and decrease in density does the core expand and nuclear burning stabilize in a phase of quiet core helium burning; in addition to the triple-alpha reaction, \({}^{16}\mathrm{O}\) is produced from \({}^4\mathrm{He}+{}^{12}\mathrm{C}\). When helium is exhausted in the core the star again ascends along the Hayashi track, on the asymptotic giant branch. Here the star enters the so-called thermally pulsing phase where helium repeatedly ignites in helium flashes in a shell around the degenerate carbon-oxygen core, after which evolution settles down again over a timescale of a few thousand years (e.g., Herwig 2005). Finally, the star sheds its envelope through rapid mass loss (e.g., Willson 2000; Miller Bertolami 2016), leaving behind a hot and compact core consisting predominantly of carbon and oxygen. The Sun is expected to reach this point in its evolution at an age of around 12.4 Gyr, 7.8 Gyr from now. The ejected material may shine due to the excitation from the ultraviolet light emitted by the core, as a planetary nebula which quickly disperses, with a typical lifetime of order 10,000 years (e.g., Gesicki et al. 2018). The core contracts and cools over a very extended period as a white dwarf, from its initial surface temperature of more than \(10^5 \,\mathrm{K}\), reaching a surface temperature of \(4000 \,\mathrm{K}\) only after a further 10 Gyr.

The details of this evolution are still somewhat uncertain, depending in particular on the extent of mass loss in the red-giant phases, and on exotic processes that may cool the core and delay helium ignition. An uncertain issue of some practical importance is whether the solar radius at any point reaches a size such as to engulf the Earth, taking into account also the possible increase in the size of the Earth’s orbit resulting from mass loss from the Sun; this depends in part on the variation of the radius during the final thermal pulses. In a detailed analysis of the evolutionary scenarios, Rybicki and Denis (2001) concluded that ‘it seems probable that the Earth will be evaporated inside the Sun’. This was confirmed by more recent calculations by Schröder and Smith (2008), taking into account tidal interactions between the planet and the expanding Sun and dynamical drag in the solar atmosphere, as well as the compensating effects of solar mass loss and their influence on the orbit of the planet. According to their results, planets with a present distance from the Sun of less than around 1.15 AU would be engulfed when the Sun reaches the tip of the red-giant branch.

It is obvious that the continued increase of solar luminosity, even on the main sequence, will have had catastrophic climatic consequences long before this point is reached. Already Lovelock and Whitfield (1982) noted that the increase over only 150 million years would be larger than could be compensated for by a decreasing greenhouse effect caused by a decrease in the atmospheric \(\mathrm{CO_2}\) content, to the minimum level required for photosynthesis. In an interesting, if somewhat speculative, analysis Korycansky et al. (2001) pointed out the possibility of compensating for the increase in solar luminosity by increasing the size of the Earth’s orbit through engineering repeated, although infrequent, carefully controlled encounters with a substantial asteroid. It seems unlikely, however, that such a change could be rapid enough to negate the effect of the increase of the solar luminosity on the red-giant branch. Furthermore, it is hardly necessary to point out that the Earth may face more imminent threats to the climate as a result of the anthropogenic effects on the composition of the atmosphere (e.g., Crowley 2000; Solomon et al. 2009; Cubasch et al. 2013).

‘Standard’ solar models

As discussed in Sect. 2.1, the concept of ‘standard solar model’ has evolved greatly over the years; the term goes back at least to Bahcall et al. (1969) who introduced it in connection with calculations of the solar neutrino flux. It may now be taken to be a spherically symmetric model, including a relatively simple treatment of diffusion and gravitational settling, up-to-date equation of state, opacity and nuclear reactions, and a simple treatment of near-surface convection. Other potential hydrodynamical effects, including mixing processes in the radiative interior and the effects of rotation and its evolution, are ignored. The evolution of the concept can be followed in several sets of solar evolution calculations, often motivated by the solar neutrino problem (see Sect. 5.2) and, more recently, by the availability of detailed helioseismic constraints (see Sect. 5.1). An impressive example is provided by the efforts of John Bahcall over an extended period. As reviewed by Bahcall (1989), early models did not include diffusion (e.g., Bahcall and Shaviv 1968; Bahcall and Ulrich 1988). Bahcall and Pinsonneault (1992b) included diffusion of helium, whereas later models (e.g., Bahcall and Pinsonneault 1995; Bahcall et al. 2006) included diffusion of both helium and heavier elements. Other examples of standard model computations are Turck-Chièze et al. (1988), Cox et al. (1989), Guenther et al. (1992), Berthomieu et al. (1993), Turck-Chièze and Lopes (1993), Gabriel (1994, 1997), Chaboyer et al. (1995), Guenther et al. (1996), Richard et al. (1996), Schlattl et al. (1997), Brun et al. (1998), Elliott (1998), Morel et al. (1999), Neuforge-Verheecke et al. (2001a) and Serenelli et al. (2011). A recent comprehensive recomputation of solar models was carried out by Vinyoles et al. (2017), discussed in more detail in Sect. 6.4. A brief review of standard solar modelling was provided by Serenelli (2016).

As representative of standard models I here consider the so-called Model S of Christensen-Dalsgaard et al. (1996); details on the model calculation were provided by Christensen-Dalsgaard (2008). Although more than two decades old, and to some extent based on out-dated physics, it is still seeing substantial use for a variety of applications, including as reference for helioseismic inversions. Thus it provides a useful reference for discussing the effects of various updates to the model physics. Remarkably, as discussed in Sects. 5.1.2 and 5.2, such simple models are in reasonable agreement with observations of solar oscillations and neutrinos.

Model S

Model S was computed with the OPAL equation of state (Rogers et al. 1996) and the 1992 version of the OPAL opacities (Rogers and Iglesias 1992), with low-temperature opacities from Kurucz (1991).Footnote 40 Nuclear reaction parameters were generally obtained from Bahcall and Pinsonneault (1995), and electron screening was treated in the weak-screening approximation of Salpeter (1954). The computation was started from a static and chemically homogeneous zero-age main-sequence model, and the age of the present Sun, from that state, was assumed to be \(4.6 \,\mathrm{Gyr}\). The time evolution of the \({}^3\mathrm{He}\) abundance was followed, while the other reactions in the PP chains were assumed to be in nuclear equilibrium; to represent the pre-main-sequence evolution the initial \({}^3\mathrm{He}\) abundance was assumed to correspond to the evolution of the abundance at constant conditions for a period of \(5 \times 10^7 \,\mathrm{year}\), starting at zero abundance (see Christensen-Dalsgaard et al. 1974). Similarly, the CN part of the CNO cycle (cf. Eq. 26) was assumed to have reached nuclear equilibrium in the pre-main-sequence phase while the conversion of \({}^{16}\mathrm{O}\) into \({}^{14}\mathrm{N}\) was followed. The diffusion and settling of helium and heavy elements were computed in the approximation of Michaud and Proffitt (1993); the evolution of Z was computed neglecting the effect of nuclear reactions and representing \(D_i\) and \(V_i\) by the behaviour of fully ionized \({}^{16}\mathrm{O}\). Convection was treated in the Böhm-Vitense (1958) formalism. The atmospheric structure was computed using the VAL \(T(\tau )\) relation given by Eq. (31) and illustrated in Fig. 10. 
The initial composition was calibrated to obtain a present \(Z_{\mathrm{s}}/X_{\mathrm{s}}= 0.0245\) (Grevesse and Noels 1993), while the surface luminosity and radius were set to \(3.846 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\) and \(6.9599 \times 10^{10} \,\mathrm{cm}\), respectively, to an accuracy of better than \(10^{-6}\) (see Sect. 2.2).

Some basic quantities of the model of the present Sun are given in Table 2 below, together with properties of other similar models, discussed in detail in Sect. 4.2. Also, Fig. 18 shows the variation of X and Z through the model. It is striking that the settling of helium and heavy elements causes sharp gradients in X and Z just below the convection zone. Details of the model structure are provided at https://github.com/jcd11/LRSP_models.

Fig. 18

Hydrogen abundance X (top panel) and heavy-element abundance Z (lower panel) against fractional radius, in a model (Model S of Christensen-Dalsgaard et al. 1996) of the present Sun. The inset in the upper panel shows the hydrogen-abundance profile in the vicinity of the base of the convective envelope. The horizontal dotted lines show the initial values \(X_0\) and \(Z_0\)

It is perhaps of some interest to compare the structure of this model with an early calibrated model of solar structure. In Fig. 19 Model S is compared with a \(1 \,M_\odot \) model computed by Weymann (1957), as quoted by Schwarzschild (1958); the model has solar radius and approximately solar luminosity at an age of \(4.5 \,\mathrm{Gyr}\). It is evident that the hydrogen profile differs substantially between the two models, in part owing to the inclusion of settling in Model S, but more importantly because the Weymann model is less evolved. On the other hand, on this scale temperature and pressure look quite similar between the two models. In fact, the central temperature and pressure differ by less than 10%, although there are differences of up to nearly 30% in temperature elsewhere in the model and even larger differences in pressure. Another significant difference is in the depth of the convective envelope, which is around \(0.15 \,R_\odot \) in the Weymann model and \(0.29 \,R_\odot \) in Model S. Even so, given that Model S provides a reasonable representation of solar structure (see Sect. 5.1.2), it is evident that the early model succeeded in capturing important aspects of the structure of the Sun.

Fig. 19

Comparison of Model S of Christensen-Dalsgaard et al. (1996) (dashed curves) with a \(1 \,M_\odot \) model computed by Weymann (1957) (solid curves). The quantities illustrated are temperature T, in \(\,\mathrm{K}\) (top panel), pressure p, in \(\,\mathrm{dyn}\,\mathrm{cm}^{-2}\) (central panel) and hydrogen abundance X (bottom panel)

Sensitivity of the model to changes in physics or parameters

It is evident that the uncertainty in the input parameters, and physics, of the calculation introduces uncertainties in the model structure. A number of investigations have addressed aspects of these uncertainties. An early example is provided by Christensen-Dalsgaard (1988b) who considered several different changes to the model physics, analysing the effects on the model structure and the resulting oscillation frequencies. Remarkably, he found that the change to the structure was essentially linear in the change in opacity as represented by \(\log \kappa \), even for quite substantial changes. Such linearity in changes to \(\varGamma _1\) was also found by Christensen-Dalsgaard and Thompson (1991). Boothroyd and Sackmann (2003) considered a broad range of changes in the model parameters and physics, emphasizing comparisons with the helioseismically inferred sound speed obtained by Basu et al. (2000). A very ambitious investigation was carried out by Bahcall et al. (2006) who made a Monte Carlo simulation based on 10,000 models with random selections of 21 parameters characterizing the models, in this way assigning statistical properties to the computed model quantities, including detailed neutrino fluxes. It was demonstrated by Jørgensen and Christensen-Dalsgaard (2017) that, owing to the near linearity of the model response to changes in parameters (see also Bahcall and Serenelli 2005), this result could to a large extent be recovered much more economically by computing the relevant partial derivatives with respect to the model parameters; this opens the possibility for more extensive statistical analysis of this nature. 
A more systematic exploration of the linearity of the response of solar models was carried out by Villante and Ricci (2010) who linearized the equations of stellar structure in terms of various perturbations and, consistent with the numerical experiments discussed above, demonstrated that the resulting changes to the model closely matched the differences between models computed with the assumed perturbations.
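The economy of the derivative-based approach can be illustrated with a toy calculation; this is only a sketch and does not reproduce any of the cited analyses. The "model" below is a hypothetical power-law response of a single output to three input parameters, with invented exponents and uncertainties, and the function names are introduced here for illustration. The spread of the output is estimated once from a Monte Carlo sample and once from partial derivatives obtained with two model evaluations per parameter.

```python
import math
import random

# Hypothetical logarithmic sensitivities d ln(y) / d ln(p_i) of a toy output
EXPONENTS = [2.0, -1.5, 0.5]
# Assumed fractional 1-sigma uncertainties of the three input parameters
SIGMAS = [0.01, 0.02, 0.015]
# Reference parameter values, in arbitrary units
REFERENCE = [1.0, 1.0, 1.0]


def toy_model(params):
    """Toy power-law response standing in for a full model computation."""
    y = 1.0
    for p, a in zip(params, EXPONENTS):
        y *= p ** a
    return y


def linearized_sigma(reference, sigmas, step=1e-4):
    """Fractional spread of y from centred logarithmic derivatives."""
    variance = 0.0
    for i, s in enumerate(sigmas):
        up = list(reference)
        down = list(reference)
        up[i] *= 1.0 + step
        down[i] *= 1.0 - step
        dlny = math.log(toy_model(up)) - math.log(toy_model(down))
        dlnp = math.log(1.0 + step) - math.log(1.0 - step)
        variance += (dlny / dlnp * s) ** 2
    return math.sqrt(variance)


def monte_carlo_sigma(reference, sigmas, n=10000, seed=1):
    """Fractional spread of y from n random draws of the parameters."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        params = [p * (1.0 + rng.gauss(0.0, s))
                  for p, s in zip(reference, sigmas)]
        samples.append(toy_model(params))
    mean = sum(samples) / n
    variance = sum((y - mean) ** 2 for y in samples) / (n - 1)
    return math.sqrt(variance) / mean


if __name__ == "__main__":
    print("linearized :", linearized_sigma(REFERENCE, SIGMAS))
    print("Monte Carlo:", monte_carlo_sigma(REFERENCE, SIGMAS))
```

In this toy case the linearized estimate requires six model evaluations against 10,000 for the Monte Carlo sample; the two agree closely only because the assumed perturbations are small enough for the response to be nearly linear.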

Here I consider some examples of changes to the model parameters and physics, emphasizing the updates that have taken place since the original computation of Model S. When not specifically mentioned, the physical properties and parameters of the models are the same as for Model S (see also Table 1), which is also in most cases used as reference. An overview of the models considered is provided by Table 1, while Table 2 gives basic properties of the models, and Table 3 presents the differences between the modified models and Model S. To put the results in context, Fig. 20 shows the helioseismically inferred difference (Footnote 41) in squared sound speed between the Sun and Model S. Note that the statistical errors in the inferences are barely visible, compared with the size of the symbols. The helioseismic results on solar structure are discussed in detail in Sect. 5.1.2.

Table 1 Parameters of solar models
Table 2 Characteristics of the models in Table 1
Table 3 Differences between the model quantities in Table 2 and the corresponding properties of Model [S]
Fig. 20

(Adapted from Basu et al. 1997)

Results of helioseismic inversions. Inferred relative differences in squared sound speed between the Sun and Model S in the sense (Sun)–(model). The vertical bars show \(1\,\sigma \) errors in the inferred values, based on the errors in the observed frequencies. The horizontal bars provide a measure of the resolution of the inversion.

To interpret the results of such model comparisons, it is useful to note some simple properties of the solar convection zone (see also Gough 1984b; Christensen-Dalsgaard 1997; Christensen-Dalsgaard et al. 1992, 2005). Apart from the relatively thin ionization zones of hydrogen and helium, pressure and density in the convection zone are approximately related by Eq. (27), with \(\gamma = 5/3\); also, since the mass of the convection zone is only around \(0.025 \,M_\odot \) we can, as a first approximation, assume that \(m \simeq M\) in the convection zone. In this case it is easy to show that (Footnote 42)

$$\begin{aligned} c^2 = {\gamma p \over \rho } \simeq (\gamma - 1) G M \left( {1 \over r} - {1 \over R} \right) . \end{aligned}$$
(43)

It follows that c is unchanged at fixed r between models with the same mass and surface radius. Also,

$$\begin{aligned} {\delta _r p \over p} = {\delta _r \rho \over \rho } \simeq - {1 \over \gamma - 1} {\delta K \over K}, \end{aligned}$$
(44)

where \(\delta _r\) denotes the difference between two models at fixed r, and \(\delta K\) is the difference in K between the models. Finally, assuming the ideal gas law, Eq. (12),

$$\begin{aligned} {\delta _r T \over T} \simeq {\delta _r \mu \over \mu }, \end{aligned}$$
(45)

which is obviously constant.
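As a rough numerical illustration of Eqs. (43) and (44), the sketch below evaluates the approximate sound speed near the base of the convection zone and the density response to a small change in K. The values of G and M are assumed standard cgs values that are not quoted in this section; the radius is the Model S value, and the fractional radius 0.71 is taken to follow from the convection-zone depth of about 0.29 of the radius quoted for Model S. The function names are introduced here for illustration.

```python
import math

G_CGS = 6.674e-8          # assumed gravitational constant, cm^3 g^-1 s^-2
M_SUN_G = 1.989e33        # assumed solar mass, g
R_MODEL_S_CM = 6.9599e10  # photospheric radius of Model S (from the text)
GAMMA = 5.0 / 3.0         # adiabatic exponent used with Eq. (27)


def sound_speed_squared(r_cm, mass_g=M_SUN_G, radius_cm=R_MODEL_S_CM,
                        gamma=GAMMA):
    """Eq. (43): c^2 ~ (gamma - 1) G M (1/r - 1/R), in cm^2 s^-2."""
    return (gamma - 1.0) * G_CGS * mass_g * (1.0 / r_cm - 1.0 / radius_cm)


def relative_density_change(delta_k_over_k, gamma=GAMMA):
    """Eq. (44): fixed-r change in ln(rho), equal to the change in ln(p)."""
    return -delta_k_over_k / (gamma - 1.0)


if __name__ == "__main__":
    r_base = 0.71 * R_MODEL_S_CM
    c_base = math.sqrt(sound_speed_squared(r_base))
    print(f"c at r = 0.71 R: {c_base / 1e5:.0f} km/s")
    print(f"delta ln(rho) for delta ln(K) = 1e-3: "
          f"{relative_density_change(1e-3):.2e}")
```

Since K does not appear in the expression for the sound speed, two models with the same M and R give the same c at fixed r in this approximation, while a change in K shifts pressure and density by the same relative amount.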

Since the effects of the changes are subtle, some care is required in specifying and computing the differences.Footnote 43 Here I consider differences (also denoted \(\delta _r\)) at fixed fractional radius r/R, where R is the photospheric radius. It should be noted, however, that Christensen-Dalsgaard and Thompson (1997) found differences \(\delta _m\) at fixed mass fraction m/M more illuminating for studies of the effects on oscillation frequencies of near-surface modifications to the model. Such differences are also more appropriate for studying evolutionary effects on stellar models. They showed that the two differences are related by

$$\begin{aligned} \delta _m f= & {} \delta _r f + \delta _m r {\mathrm{d}f \over \mathrm{d}r} \nonumber \\ \delta _r f= & {} \delta _m f + \delta _r m {\mathrm{d}f \over \mathrm{d}m}, \end{aligned}$$
(46)

for any model quantity f.
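The first of these relations can be made plausible by a first-order expansion; what follows is a sketch under stated assumptions, not the derivation given in the cited paper. Label the two models 1 and 2 (labels introduced here for illustration) and let \(r_1(m)\) and \(r_2(m)\) be the radii enclosing a given mass m, so that \(\delta _m r = r_2(m) - r_1(m)\). Then

$$\begin{aligned} \delta _m f = f_2(r_2) - f_1(r_1) \simeq f_2(r_1) - f_1(r_1) + \delta _m r {\mathrm{d}f \over \mathrm{d}r} = \delta _r f + \delta _m r {\mathrm{d}f \over \mathrm{d}r}, \end{aligned}$$

where \(\delta _r f\) is evaluated at \(r = r_1\) and terms of second order in the differences have been neglected, so that it is immaterial in which of the two models the derivative is taken. The second relation follows in the same way with the roles of r and m interchanged.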

A prerequisite for sensible studies of solar models and their dependence on the physics is that adequate numerical precision is reached. I discuss this in the Appendix.

I first consider changes in the global parameters characterizing the model. Figure 21 shows the effect of decreasing the model age to the now generally accepted value of 4.57 Gyr (see Sect. 2.2), compared with the reference value of 4.6 Gyr in Model S. To match the solar luminosity at this lower age, a slightly smaller initial hydrogen abundance is required, increasing \(\mu \) (cf. Eq. 36); on the other hand, the increased central hydrogen abundance reflects the shorter time spent in hydrogen burning. As predicted above, the sound-speed difference is virtually zero in the convection zone, except in the ionization zones near the surface where the change results from the change in composition and the resulting change in \(\varGamma _1\). Also, \(\delta _r \ln p\) and \(\delta _r \ln \rho \) are nearly constant and nearly identical in the bulk of the convection zone (cf. Eq. 44) and the change in temperature reflects the change in the mean molecular weight.

Fig. 21

Model changes at fixed fractional radius resulting from a change in age, from the reference value of 4.6 Gyr used in Model S to Model [Age] with an age of 4.57 Gyr (see Sect. 2.2), in the sense (Model [Age])–(Model S). The line styles are defined in the figure. The thin dotted line marks zero change

A related issue concerns the neglect of pre-main-sequence evolution in Model S, where evolution starts from an essentially homogeneous zero-age main-sequence model. This was investigated by Morel et al. (2000) who found that, with a shift in the evolution by 25 Myr, the resulting calibrated solar models differed by only a few parts in \(10^4\). Thus the assumption of an initial ZAMS model is adequate.

The effect of changing the radius, from the reference value of \(6.9599 \times 10^{10} \,\mathrm{cm}\) to the value of \(6.95508 \times 10^{10} \,\mathrm{cm}\) found by Brown and Christensen-Dalsgaard (1998), is illustrated in Fig. 22a. Here there is obviously a change in the sound speed in the convection zone, and consequently \(\delta _r \ln p\) and \(\delta _r \ln \rho \), while still approximately constant in the convection zone, differ. Considering the changes in the radiative interior, the use of differences at fixed r/R is in fact somewhat misleading in this case. Much of the change shown in Fig. 22a is essentially a geometrical effect, corresponding to the gradient term in the second of Eqs. (46); the corresponding differences at fixed m (see Fig. 22b) become very small in the deep interior. As a result, the value of \(X_0\) required to calibrate the model is virtually unchanged.

Fig. 22

Model changes at fixed fractional radius (a) and fixed mass (b), resulting from a change in photospheric radius, from the reference value of \(6.9599 \times 10^{10} \,\mathrm{cm}\) used in Model S to the value of \(6.95508 \times 10^{10} \,\mathrm{cm}\) in the sense (Model [\(R_{\mathrm{s}}\)])–(Model S). Line styles are as defined in Fig. 21

As illustrated in Fig. 23 the change in luminosity from the reference value of \(3.846 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\) to the value \(3.828 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\) inferred from Kopp et al. (2016) has modest effects on the structure. According to Eq. (36) the calibration to lower luminosity requires a decrease in \(\mu \) and hence an increase in X, accompanied by a decrease in temperature, which is evident in the figure. In the central regions the lower luminosity also corresponds to a smaller nuclear burning of hydrogen and hence a larger abundance. The difference in sound speed is minute.

Fig. 23

Model changes at fixed fractional radius resulting from a change in surface luminosity, from the reference value of \(3.846 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\) used in Model S to the value of \(3.828 \times 10^{33} \,\mathrm{erg}\,\mathrm{s}^{-1}\) adopted by Mamajek et al. (2015) as the nominal solar luminosity, in the sense (Model [\(L_{\mathrm{s}}\)])–(Model S). Line styles are as defined in Fig. 21

As discussed in Sect. 2.3.1 the OPAL equation of state has been substantially updated since the computation of Model S. Figure 24 compares a model computed with the up-to-date OPAL 2005 version with Model S. The effects in the bulk of the model are rather modest, with somewhat larger changes in the near-surface layers. A significant failing in the earlier tables was the neglect of relativistic effects on the electrons in the central regions, which have a significant effect on \(\varGamma _1\) (see also Eq. 20). This in fact dominates the sound-speed difference in the deeper parts of the model in Fig. 24.

Fig. 24

Model changes at fixed fractional radius, between Model [Liv05] using the OPAL 2005 equation of state and Model S, in the sense (Model [Liv05])–(Model S). Line styles are as defined in Fig. 21, with the addition of the dotted green line showing \(\delta _r \varGamma _1/\varGamma _1\)

Perhaps the most uncertain aspect of the stellar internal microphysics is the opacity (see also Sects. 2.3.2, 6.4). Tripathy and Christensen-Dalsgaard (1998) made a detailed investigation of the effect on calibrated solar models of localized modifications to the opacity. They replaced \(\log \kappa \), \(\log \) being logarithm to base 10, by \(\log \kappa + \delta \log \kappa \), where

$$\begin{aligned} \delta \log \kappa = A_\kappa \exp [ -(\log T - \log T_\kappa )^2/\varDelta _\kappa ^2], \end{aligned}$$
(47)

for a range of \(\log T_\kappa \). They also demonstrated a nearly linear response for even fairly large modifications, by changing \(A_\kappa \) from 0.1 to 0.2. The response of solar models to opacity changes was also investigated by Villante and Ricci (2010). As examples, Fig. 25 shows the changes to the model resulting from opacity changes of the form given in Eq. (47) at \(\log T_\kappa = 7\) and 6.5. It is striking that the changes in temperature and hence sound speed are largely localized in the vicinity of the opacity change, with a somewhat broader response of pressure and density. For the deeper opacity change a modest change in the hydrogen abundance is required to calibrate the model to the correct luminosity: the increase in opacity would tend to reduce the luminosity and this is compensated by a decrease in X and hence an increase in \(\mu \), in accordance with the homology scaling in Eq. (36).
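
The modification in Eq. (47) is simple to evaluate. The sketch below is illustrative only: the function names are my own, and the default amplitude and width are the values quoted in the caption of Fig. 25. Since \(\log \) is to base 10, the second helper converts the logarithmic change into a relative change in \(\kappa \).

```python
import numpy as np

def delta_log_kappa(log_T, log_T_kappa, A_kappa=0.02, Delta_kappa=0.02):
    """Localized opacity modification of Eq. (47); logarithms are to base 10."""
    return A_kappa * np.exp(-((log_T - log_T_kappa) / Delta_kappa) ** 2)

def relative_opacity_change(log_T, log_T_kappa, A_kappa=0.02, Delta_kappa=0.02):
    """Corresponding relative change, delta kappa / kappa = 10**(delta log kappa) - 1."""
    return 10.0 ** delta_log_kappa(log_T, log_T_kappa, A_kappa, Delta_kappa) - 1.0

# Example: the change centred at log T_kappa = 7 (top panel of Fig. 25)
log_T = np.linspace(6.8, 7.2, 401)
peak = relative_opacity_change(log_T, 7.0).max()
print(f"peak relative opacity increase: {peak:.4f}")
```

With \(A_\kappa = 0.02\) the peak increase in \(\kappa \) is a little under 5 per cent, and the modification has dropped by a factor e at a distance \(\varDelta _\kappa \) in \(\log T\) from the centre.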

Fig. 25

Model changes at fixed fractional radius, resulting from localized changes to the opacity described by Eq. (47) with \(A_\kappa = 0.02, \varDelta _\kappa = 0.02\), in the sense (modified model)–(Model S). The top panel shows results for Model [Opc. 7.0], with \(\log T_\kappa = 7\), and the bottom panel results for Model [Opc. 6.5], with \(\log T_\kappa = 6.5\). Results are shown as a function of fractional radius (bottom abscissa) and \(\log T\) (top abscissa), and the line styles are defined in the figure

The behaviour of \(\delta _r \ln T\) can be understood from the equation for the temperature gradient (Eqs. 45) which we write as

$$\begin{aligned} {\mathrm{d}\ln T \over \mathrm{d}r} = - {3 \over 4 a {\tilde{c}}} {\kappa \rho \over T^4} {L(r) \over 4 \pi r^2}, \end{aligned}$$
(48)

or

$$\begin{aligned} {\mathrm{d}\over \mathrm{d}r} \left( {\delta _r T \over T} \right) = - {3 \over 4 a {\tilde{c}}} {\kappa \rho \over T^4} {L(r) \over 4 \pi r^2} \left( {\delta _r \kappa \over \kappa } + {\delta _r \rho \over \rho } - 4 {\delta _r T \over T} \right) , \end{aligned}$$
(49)

where I neglected the perturbation to L. We write \(\delta _r \kappa / \kappa = (\delta \kappa /\kappa )_{\mathrm{int}} + \kappa _T \delta _r T/T\), where \( (\delta \kappa /\kappa )_{\mathrm{int}}\) is the intrinsic opacity change given by Eq. (47), \(\kappa _T = (\partial \ln \kappa / \partial \ln T)_{\rho , X_i}\) and I neglected the dependence of \(\kappa \) on \(\rho \) and composition. Then Eq. (49) can be written as

$$\begin{aligned} {\mathrm{d}\over \mathrm{d}\ln T} \left( {\delta _r T \over T} \right) + (4 - \kappa _T) {\delta _r T \over T} = \left( {\delta \kappa \over \kappa } \right) _{\mathrm{int}}, \end{aligned}$$
(50)

neglecting again \(\delta _r \rho / \rho \). In the outer parts of the Sun the temperature is largely fixed, for small changes in X, by Eq. (45). Assuming that \(\delta _r T/T \approx 0\) well outside the location \(T = T_\kappa \) of the change in the opacity, and taking \(\kappa _T\) as constant, Eq. (50) has the solution

$$\begin{aligned} {\delta _r T \over T} \approx T^{-(4 - \kappa _T)} \int _{\ln T_{\mathrm{s}}}^{\ln T} T'^{4 - \kappa _T} \left( {\delta \kappa \over \kappa } \right) _{\mathrm{int}} \mathrm{d}\ln T', \end{aligned}$$
(51)

where \(T_{\mathrm{s}}\) is the surface temperature. This explains the steep rise in the outer parts of the peak in \(\delta _r T/T\) (and hence \(\delta _r c^2/c^2\)) and, with \(\kappa _T\) typically around \(-2\) to \(-3\), the relatively rapid decay on the inner side.
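
The shape implied by Eq. (51) can be illustrated numerically. This is a rough sketch under stated assumptions, not a solar-model calculation: the intrinsic change is taken from Eq. (47) and linearized as \((\delta \kappa /\kappa )_{\mathrm{int}} \approx \ln 10 \, \delta \log \kappa \), \(\kappa _T = -2.5\) is held constant, and the surface value \(\log T_{\mathrm{s}} = 3.76\) and the function name are my own choices.

```python
import numpy as np

LN10 = np.log(10.0)

def temperature_response(log_T_kappa=6.5, A_kappa=0.02, Delta_kappa=0.02,
                         kappa_T=-2.5, log_T_s=3.76, log_T_max=7.2, n=20001):
    """Evaluate Eq. (51) on a mesh in log T (base 10); parameters are assumptions."""
    log_T = np.linspace(log_T_s, log_T_max, n)
    x = LN10 * log_T                              # ln T
    # Intrinsic opacity change, linearized from Eq. (47)
    source = LN10 * A_kappa * np.exp(-((log_T - log_T_kappa) / Delta_kappa) ** 2)
    a = 4.0 - kappa_T
    x0 = LN10 * log_T_kappa                       # reference point; cancels in Eq. (51)
    integrand = np.exp(a * (x - x0)) * source
    dx = x[1] - x[0]
    # Cumulative trapezoidal integral from the surface inwards
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * dx
    integral = np.concatenate(([0.0], np.cumsum(steps)))
    delta_T = np.exp(-a * (x - x0)) * integral    # delta_r T / T
    return log_T, delta_T, source

log_T, delta_T, source = temperature_response()
print("peak of delta_r T / T at log T =", log_T[np.argmax(delta_T)])
```

In this sketch the response rises across the outer half of the opacity bump, peaks slightly deeper than \(T_\kappa \), and then decays as \(T^{-(4 - \kappa _T)}\) on the inner side.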

To analyse the properties of \(\delta _r p\) and \(\delta _r \rho \) I assume the ideal gas law, Eq. (12), and neglect the change in the mean molecular weight, such that \(\delta _r \ln \rho \approx \delta _r \ln p - \delta _r \ln T\). From Eq. (1) of hydrostatic equilibrium, neglecting the change in m, it then follows that

$$\begin{aligned} {\mathrm{d}\delta _r \ln p \over \mathrm{d}\ln p} \approx - \delta _r \ln T. \end{aligned}$$
(52)
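
The step leading to Eq. (52) can be made explicit. As a sketch, assuming that Eq. (1) has the standard form \(\mathrm{d}p/\mathrm{d}r = - G m \rho / r^2\) (it is not reproduced in this section), perturbing its logarithmic form at fixed r with \(\delta _r m\) neglected gives

$$\begin{aligned} {\mathrm{d}\ln p \over \mathrm{d}r} = - {G m \rho \over r^2 p} \quad \Rightarrow \quad {\mathrm{d}\delta _r \ln p \over \mathrm{d}r} \approx {\mathrm{d}\ln p \over \mathrm{d}r} \left( \delta _r \ln \rho - \delta _r \ln p \right) \approx - {\mathrm{d}\ln p \over \mathrm{d}r} \, \delta _r \ln T , \end{aligned}$$

where the last step uses \(\delta _r \ln \rho \approx \delta _r \ln p - \delta _r \ln T\); dividing by \(\mathrm{d}\ln p / \mathrm{d}r\) yields Eq. (52).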

Below the location of the opacity and temperature change, pressure and density are relatively unaffected. Thus the local change in pressure is dominated by the increase with increasing r in the peak of \(\delta _r \ln T\), while \(\delta _r \ln \rho \) has a negative dip in this region, but follows the increase in \(\delta _r \ln p\) outside it. The global behaviour of \(\delta _r \ln p\) and \(\delta _r \ln \rho \) is constrained by the conservation of total mass, such that

$$\begin{aligned} \int _0^R \delta _r \rho r^2 \mathrm{d}r = 0. \end{aligned}$$
(53)

For \(\log T_\kappa = 7\) (top panel in Fig. 25) the region of positive \(\delta _r \ln \rho \) just outside the peak in \(\delta _r \ln T\) therefore forces a region of negative \(\delta _r \ln \rho \) in the outer parts of the model, including the convection zone where \(\delta _r \ln p\) and \(\delta _r \ln \rho \), according to Eq. (44), are approximately constant. For \(\log T_\kappa = 6.5\) (bottom panel) the region of negative \(\delta _r \ln \rho \) in the peak of \(\delta _r \ln T\) results in positive \(\delta _r \ln \rho \) and \(\delta _r \ln p\) in the convection zone. The effect on the hydrogen abundance is less clear in simple terms, although it must be related to the calibration to keep the luminosity fixed. Given that the changes in the deep interior are minute for \(\log T_\kappa = 6.5\), it is understandable that \(\delta _r X\) is very small in this case, except in the region just below the convection zone that is directly affected by changes in diffusion and settling.

The OPAL opacity tables were updated by Iglesias and Rogers (1996), relative to the Rogers and Iglesias (1992) tables used for Model S. As shown in Fig. 26, comparing models that both assumed the Grevesse and Noels (1993) solar composition but using respectively the OPAL96 and OPAL92 tables, the revision of the opacity calculation has some effect on the structure, including a relatively substantial change in the sound speed.

Fig. 26

Model changes at fixed fractional radius, between Model [OPAL96] which uses the OPAL96 opacities and Model S, where the OPAL92 tables were used, in the sense (Model [OPAL96])–(Model S). Line styles are as defined in Fig. 25

As noted by, for example, Tripathy and Christensen-Dalsgaard (1998) and Vinyoles et al. (2017), responses to localized opacity changes such as shown in Fig. 25 define ‘opacity kernels’ that can be used to reconstruct the effects of more general opacity changes. An example is illustrated in Fig. 27. Here the top panel shows a fit to the difference between the OPAL96 and OPAL92 tables in the radiative region, based on localized opacity changes of the form in Eq. (47) on a dense grid in \(\log T_\kappa \). Applying the resulting amplitudes to the corresponding model differences yields the red curves in the bottom panel, which are in excellent agreement with the direct differences between the OPAL96 and OPAL92 models, as illustrated in Fig. 26. The changes in \(c^2\) and \(\rho \) are dominated by the substantial negative opacity difference at relatively low temperature, yielding a negative \(\delta _r \ln c^2\) just below, and a negative \(\delta _r \ln \rho \) within, the convection zone. As noted above the change in X, on the other hand, is insensitive to the opacity change in the outer parts of the radiative region, and hence the positive \(\delta \ln \kappa \) in the deeper regions results in a negative \(\delta _r X\).
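
The logic of the kernel reconstruction can be sketched in a few lines. This is a toy illustration under loud assumptions, not the procedure actually used for Fig. 27: the 'table difference' is synthetic, the linear model response is a hypothetical smoothing operator standing in for a full calibrated-model calculation, and all function names are mine. Only the basis functions follow Eq. (47).

```python
import numpy as np

def gaussian_basis(log_T, centres, Delta_kappa=0.02):
    """Columns are Eq. (47) profiles with unit amplitude, one per log T_kappa."""
    return np.exp(-((log_T[:, None] - centres[None, :]) / Delta_kappa) ** 2)

def fit_amplitudes(basis, target):
    """Least-squares amplitudes A_kappa reproducing a given delta log kappa."""
    amplitudes, *_ = np.linalg.lstsq(basis, target, rcond=None)
    return amplitudes

def toy_kernel_demo():
    log_T = np.linspace(6.2, 7.2, 501)
    centres = np.linspace(6.2, 7.2, 101)          # step 0.01 in log T_kappa
    basis = gaussian_basis(log_T, centres)
    # Synthetic (hypothetical) opacity difference, built within the span of the basis
    true_amplitudes = 0.01 * np.sin(2.0 * np.pi * (centres - 6.2))
    target = basis @ true_amplitudes
    # Hypothetical linear model response: a broad smoothing operator, NOT solar physics
    d_log_T = log_T[1] - log_T[0]
    response = np.exp(-np.abs(log_T[:, None] - log_T[None, :]) / 0.1) * d_log_T
    kernels = response @ basis                    # one 'opacity kernel' per column
    amplitudes = fit_amplitudes(basis, target)
    reconstructed = kernels @ amplitudes          # superposition of kernel responses
    direct = response @ target                    # response to the full opacity change
    return basis @ amplitudes, target, reconstructed, direct

fitted, target, reconstructed, direct = toy_kernel_demo()
print(np.max(np.abs(fitted - target)), np.max(np.abs(reconstructed - direct)))
```

The agreement between the reconstructed and direct responses here is guaranteed by the linearity built into the toy operator; for real solar models it holds only to the extent that the response to opacity changes is nearly linear, as found by Tripathy and Christensen-Dalsgaard (1998).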

Fig. 27

Top panel: The solid curve shows logarithmic differences between the OPAL96 and the OPAL92 opacity, in the sense (OPAL96)–(OPAL92), at fixed \(\rho \), T and composition in Model [OPAL96]. The dashed curve shows a fit of functions of the form in Eq. (47), with \(\varDelta _\kappa = 0.02\) and on a grid in \(\log T_\kappa \) between 7.2 and 6.2 with a step of 0.01. Bottom panel: differences \(\delta _r \ln c^2\) (solid curves), \(\delta _r \ln \rho \) (long-dashed curve) and \(\delta _r X\) (double-dot-dashed curve). The black curves show results from Fig. 26, whereas the red curves show reconstructions based on ‘opacity kernels’ such as shown in Fig. 25, using the fit shown in the top panel

The effects of changing the atmospheric opacity are illustrated in Fig. 28, comparing the more recent tables of Ferguson et al. (2005) with the Kurucz (1991) tables used in the computation of Model S. There are significant changes in pressure and density in the atmosphere, reflecting the integration of atmospheric structure at the given temperature structure (cf. Sect. 2.4, in particular Eqs. 28–30). However, as discussed by Christensen-Dalsgaard and Thompson (1997) the effects of such superficial changes in calibrated solar models are very strongly confined to the near-surface layers; the differences in the bulk of the convection zone and in the radiative interior are minute (Footnote 44).

Fig. 28

Model changes at fixed fractional radius, between a model computed using the Ferguson et al. (2005) low-temperature opacities and Model [OPAL96], which used the Kurucz (1991) tables, in the sense (Model [Surf. opac.])–(Model [OPAL96]); in both cases the OPAL96 tables were used in the deeper parts of the model. Line styles are defined in the top panel

Relative to the Grevesse and Noels (1993) composition used in Model S a modest revision was proposed by Grevesse and Sauval (1998); the compositions are compared in Table 4 in Sect. 6.1 below. This composition has seen extensive use in solar modelling. The effects of this change on the model structure are illustrated in Fig. 29, using for both compositions the OPAL96 opacity tables. There is evidently some change, at a level that is significant compared with the helioseismic results in the sound speed, as well as a modest change in the hydrogen abundance required for luminosity calibration. In particular, the 10% change in the oxygen abundance (cf. Table 4) and the general decrease in the heavy-element abundance (cf. Table 3) cause a decrease in the opacity of up to 4% just below the convection zone, leading to a significant decrease in the sound speed in the outer parts of the radiative region, as shown in Fig. 29. As discussed in Sect. 6.1 the much greater revision since 2000 of the determination of the solar surface composition has had very substantial effects on solar models.

Fig. 29

Model changes at fixed fractional radius, between Model [GS98] using the Grevesse and Sauval (1998) composition and Model [Surf. opac.] which used the Grevesse and Noels (1993) composition (see Table 4), in the sense (Model [GS98])–(Model [Surf. opac.]); in both cases the Ferguson et al. (2005) atmospheric and the OPAL96 interior tables were used. Line styles are as defined in Fig. 28

An indication of the effects of the uncertainties in the opacity computations may be obtained by comparing the use of the OPAL tables with the results of the independent OP project (e.g., Seaton et al. 1994; Badnell et al. 2005; Seaton 2005); the differences between the tables are illustrated in Fig. 6 (note that this shows OPAL–OP). In Fig. 30 models computed with the OP and OPAL tables are compared, in both cases using the Grevesse and Sauval (1998) composition. The effect is clearly substantial, with an increase in the sound speed in the bulk of the radiative interior and in the hydrogen abundance resulting from the luminosity calibration. The model differences can at least qualitatively be understood from the opacity kernels discussed above. The differences in sound speed, pressure and density are probably dominated by the positive table differences at temperatures just below the convection zone, while the change in the hydrogen abundance is dominated by the negative table differences in the deeper parts of the model. Other comparisons of different opacity calculations were carried out, for example, by Neuforge-Verheecke et al. (2001b), who compared OPAL and the Los Alamos LEDCOP tables, and Le Pennec et al. (2015b), comparing OPAL and the recent OPAS tables (Blancard et al. 2012; Mondet et al. 2015) developed at CEA, France.

Fig. 30

Model changes at fixed fractional radius, between Model [OP05] using the OP05 opacity tables (e.g., Seaton 2005) and Model [GS98] using the OPAL96 tables, in the sense (Model [OP05])–(Model [GS98]); in both cases the GS98 composition and the Ferguson et al. (2005) low-temperature opacities were used. Line styles are as defined in Fig. 28

As discussed in Sect. 2.5 there is considerable uncertainty in the treatment of convection in the strongly super-adiabatic region just below the photosphere (see Fig. 12). In calibrated solar models, however, this has little effect on the structure of the bulk of the model. To illustrate this Fig. 31 shows differences between a model computed using the Canuto and Mazzitelli (1991) treatment, as implemented by Monteiro et al. (1996), and Model S. There are substantial differences in the near-surface region, but these are very strongly confined, with the differences being extremely small in the lower parts of the convection zone and the radiative interior (see also Christensen-Dalsgaard and Thompson 1997). This effect is similar to the effect of modifying the atmospheric opacity, shown in Fig. 28. As illustrated by the solid blue line, the difference in squared sound speed at fixed mass fraction is much more strongly confined near the surface than the difference at fixed fractional radius. It was argued by Christensen-Dalsgaard and Thompson (1997) that, consequently, \(\delta _q \ln c^2\) provides a better representation of the effects of the near-surface modification on the oscillation frequencies. In fact, model differences such as these or those shown in Fig. 28 provide a model for the near-surface errors in traditional structure and oscillation modelling which have an important effect on helio- and asteroseismic investigation. To illustrate this, Fig. 37 below shows frequency differences between the models illustrated in Fig. 31.

Fig. 31

Model changes at fixed fractional radius, between Model [CM] emulating the Canuto and Mazzitelli (1991) treatment of near-surface convection and Model S, in the sense (Model [CM])–(Model S). Line styles are as defined in Fig. 28, with the addition of the solid blue line which shows the difference \(\delta _q \ln c^2\) of squared sound speed at fixed mass fraction q

The effects of the updates to the nuclear reaction parameters since Model S are illustrated in Fig. 32. Panel (a) is based on a model computed with the Adelberger et al. (2011) parameters, while in panel (b) the NACRE (Angulo et al. 1999) reaction rates, with the Formicola et al. (2004) update of the \({{}^{14}\mathrm{N}}\) rate, were used. In both cases the dominant change to the overall reaction rate was at the highest temperatures and is closely related to updated quantities for the CNO reactions; at fixed conditions the energy generation decreased by 5–8% relative to the formulation used in Model S. This is directly reflected in the higher hydrogen abundance (see also Table 3) and hence higher sound speed in the core, in both cases. Calibration to fixed luminosity caused modest changes in the structure in the other parts of the models. It should be noticed that while the differences in \(\epsilon \) at fixed \(\rho \), T and composition for the Adelberger et al. (2011) rates are largely confined to the region where \(\log T \ge 7.1\), the differences in the NACRE rates extend more broadly, leading to the substantially larger model differences in the NACRE case (Fig. 32b).

Fig. 32

Model changes at fixed fractional radius, corresponding to changes in the nuclear reaction parameters, compared with Model S which used parameters largely from Bahcall and Pinsonneault (1995). a Differences for Model [Adelb11], using the Adelberger et al. (2011) parameters, in the sense (Model [Adelb11])–(Model S), and b shows differences for Model [NACRE] using the Angulo et al. (1999) (NACRE) parameters, with the reaction \({{}^{14}\mathrm{N}}+ {{}^{1}\mathrm{H}}\) updated by Formicola et al. (2004), in the sense (Model [NACRE])–(Model S). Line styles are as defined in Fig. 28

A potential simplification of the calculation is to assume that \({{}^{3}\mathrm{He}}\) is in nuclear equilibrium. The region where this is satisfied approximately corresponds to the rising part of the \({{}^{3}\mathrm{He}}\) abundance shown in Fig. 7 and hence in fact covers most of the region of nuclear energy generation in the present Sun. However, the change in the hydrogen abundance over solar evolution does depend on the details of the nuclear reactions. As illustrated in Fig. 33 assuming nuclear equilibrium of \({{}^{3}\mathrm{He}}\) throughout the evolution indeed generally has a minute effect on the resulting model of the present Sun. The peak in \(\delta _r X\) at \(r/R \approx 0.27\) corresponds closely to the peak in the \({{}^{3}\mathrm{He}}\) abundance (cf. Fig. 7) and probably reflects the local conversion of hydrogen into \({{}^{3}\mathrm{He}}\).

Fig. 33

Model changes at fixed fractional radius, between Model [\( {}^3\mathrm{He} \) eql.] where \({{}^{3}\mathrm{He}}\) is assumed to be in nuclear equilibrium and Model S, in the sense (Model [\( {}^3\mathrm{He} \) eql.])–(Model S). Line styles are defined in the figure

As mentioned in Sect. 2.3.3 there has been some discussion about the validity of the classical Salpeter (1954) model of static screening of nuclear reactions, with dynamical simulations indicating absence of screening (Mussack and Däppen 2011). The effects of switching off all screening of nuclear reactions are illustrated in Fig. 34. At fixed conditions corresponding to Model S this results in a reduction in the nuclear energy-generation rate of up to 9% near the centre, where the CNO cycle plays some role (cf. Fig. 8), and around 5% further out, where the PP chains dominate. To achieve luminosity calibration this is compensated by increases in temperature, hydrogen abundance and density, the latter increase requiring a decrease in density in the outer parts of the model to conserve the total mass (cf. Eq. 53). The effects show some similarity to the effects of the revision of nuclear parameters (Fig. 32), probably reflecting also here the larger reduction in the rates of the more temperature-sensitive reactions, but the changes are clearly of a much larger magnitude. Indeed, Weiss et al. (2001) pointed out that the resulting model is inconsistent with the constraints provided by the helioseismically determined sound speed (cf. Sect. 5.1.2; see also Christensen-Dalsgaard and Houdek 2010).

Fig. 34

Model changes at fixed fractional radius, between Model [No el.scrn] where electron screening is switched off and Model S, in the sense (Model [No el.scrn])–(Model S). Line styles are as defined in Fig. 33

To illustrate the sensitivity of the models to the detailed treatment of diffusion and settling Fig. 35 shows the effect of increasing \(D_i\) (cf. Eq. 6) by a factor 1.2 (panel a) or increasing both \(D_i\) and \(V_i\) by this factor (panel b). In the former case the effects are small, the dominant changes being confined to the core where the increased diffusion partly smoothes the hydrogen profile, leading to an increase in the hydrogen abundance, with a corresponding increase in the sound speed. There are additional even smaller changes associated with the gradient in hydrogen abundance caused by settling just below the convection zone. When also the settling velocity is increased the changes are more substantial, including a significant increase in the hydrogen abundance in the convection zone and a noticeable increase in the sound speed below the convection zone; note also the near-surface sound-speed changes, of similar shape but opposite sign to the effects of neglecting diffusion and settling (see Fig. 36 below) and, as in that case, reflecting the thermodynamic response to the change in the helium abundance.

Fig. 35

Model changes at fixed fractional radius, resulting from changes to the diffusion and settling coefficients, compared with Model S. a Differences for Model [Dc] where the diffusion coefficient \(D_i\) (cf. Eq. 6) was increased by a factor 1.2, in the sense (Model [Dc])–(Model S). b Differences for Model [DVc] where both \(D_i\) and the settling velocity \(V_i\) were increased by a factor 1.2, in the sense (Model [DVc])–(Model S). Line styles are as defined in Fig. 33

Fig. 36

Model changes at fixed fractional radius, comparing Model [No diff.] neglecting diffusion and settling with Model S, in the sense (Model [No diff.])–(Model S). Line styles are as defined in Fig. 33; in addition, in the right-hand expanded view of the outer helium and hydrogen ionization zones the green dotted curve shows \(\delta _r \ln \varGamma _1\)

Finally, it should be recalled that early ‘standard solar models’ did not include effects of diffusion and settling. It was shown by Christensen-Dalsgaard et al. (1993) that including just diffusion and settling of helium led to a substantial improvement in the comparison between the model and helioseismic inferences of sound speed, and hence more recent solar models, such as Model S, include full treatment of diffusion. To illustrate this effect Fig. 36 compares a model ignoring diffusion but otherwise corresponding to Model S, including the calibration, with Model S. It is evident that the change in the hydrogen abundance (which obviously to a large extent reflects the Model S hydrogen profile illustrated in Fig. 18) has a substantial effect on the sound speed, hence affecting the comparison with the helioseismic inference. There are more subtle effects on the sound speed near the surface that in part arise from the change in \(\varGamma _1\) caused by the change of the helium abundance in the helium ionization zones, and which affect the frequencies of acoustic modes. This effect illustrates the potential for helioseismic determination of the solar envelope helium abundance (see Sect. 5.1.3).

Tests of solar models

The models discussed so far have explicitly been computed to match the ‘classical’ observed quantities of the Sun: the initial composition \((X_0, Z_0)\) has been chosen to match the solar luminosity and present surface composition and the choice of mixing length has been made to match an assumed solar radius, at the assumed present age of the Sun. Since the model has thus been adjusted to match the observed \(L_{\mathrm{s}}\), R and \(Z_{\mathrm{s}}/X_{\mathrm{s}}\) these quantities provide no independent test of the calculation, beyond the feeble constraint that apparently reasonable values of the required parameters can be found which match the observables.

As discussed in the introduction, very detailed independent testing of the model computation has become possible through helioseismology, by means of extensive observations of solar oscillations. Additional information relevant to the structure of the solar core results from the detection of neutrinos originating from the nuclear reactions (cf. Eq. 22). Finally, I briefly consider the surface abundances of light elements or isotopes which provide constraints on mixing processes in the solar interior.

Helioseismic tests of solar structure

Detailed reviews of the techniques and results of helioseismology have been provided by Christensen-Dalsgaard (2002), Basu and Antia (2008) and Aerts et al. (2010). An extensive review of solar oscillations and helioseismology was provided by Basu (2016) in Living Reviews in Solar Physics. A perhaps broader view, emphasizing also the limitations in the present results, was provided by Gough (2013b). None the less, it is appropriate here to provide a brief overview of the techniques of helioseismology and to summarize the results on the solar interior.

Properties of solar oscillations

Oscillations of the Sun are characterized by the degree l and azimuthal order m (Footnote 45), with \(|m| \le l\), of the spherical harmonic \(Y_l^m(\theta , \phi )\) describing the mode, where \(\theta \) is co-latitude and \(\phi \) is longitude, and by its radial order n. The degree provides a measure of the horizontal wave number \(k_{\mathrm{h}}\):

$$\begin{aligned} k_{\mathrm{h}}= {\sqrt{l(l+1)} \over r}, \end{aligned}$$
(54)

at distance r from the solar centre. Thus, except for radial modes (with \(l = 0\)), the average horizontal wavelength on the solar surface is \(\lambda _{\mathrm{h,s}} \simeq 2 \pi R/l\). The azimuthal order measures the number of nodal lines crossing the equator. The observed cyclic oscillation frequencies \(\nu \), between roughly 1 and 5 mHz, correspond to modes that predominantly have the character of standing acoustic waves, or p modes, and, at high degree, surface gravity waves, or f modes. In the case of the p modes, the frequencies are predominantly determined by the internal sound speed c, with

$$\begin{aligned} c^2 = {\varGamma _1 p \over \rho } \simeq {\varGamma _1 k_{\mathrm{B}}T \over \mu m_{\mathrm{u}}}, \end{aligned}$$
(55)

the latter expression assuming the ideal gas law (cf. Eq. 12).
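
To give a feeling for the scales involved in Eqs. (54) and (55), the following minimal sketch evaluates the surface horizontal wavelength and the ideal-gas sound speed. The function names, the choice \(\varGamma _1 = 5/3\) and the example temperature and mean molecular weight are illustrative assumptions; the radius is the Model S reference value quoted earlier.

```python
import math

R_SUN_CM = 6.9599e10      # photospheric radius used in Model S, in cm
K_B = 1.380649e-16        # Boltzmann constant, erg/K
M_U = 1.66053907e-24      # atomic mass unit, g

def horizontal_wavenumber(l, r):
    """Eq. (54): k_h = sqrt(l(l+1)) / r, in cm^-1 for r in cm."""
    return math.sqrt(l * (l + 1)) / r

def surface_horizontal_wavelength(l, R=R_SUN_CM):
    """Horizontal wavelength 2 pi / k_h at the surface; undefined for l = 0."""
    if l == 0:
        raise ValueError("radial modes (l = 0) have no horizontal wavelength")
    return 2.0 * math.pi / horizontal_wavenumber(l, R)

def ideal_gas_sound_speed(T, mu, gamma_1=5.0 / 3.0):
    """Second form of Eq. (55), in cm/s; gamma_1 = 5/3 assumes full ionization."""
    return math.sqrt(gamma_1 * K_B * T / (mu * M_U))

for l in (1, 100, 1000):
    print(l, surface_horizontal_wavelength(l) / 1.0e8, "Mm")
print(ideal_gas_sound_speed(1.0e6, 0.6) / 1.0e5, "km/s")   # illustrative T and mu
```

The last line uses a temperature of \(10^6\) K and \(\mu = 0.6\) purely as an example; the sound speed scales as \((T/\mu )^{1/2}\).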

The f-mode frequencies are to a good approximation given by the deep-layer approximation for surface gravity waves, determined by the surface gravitational acceleration. Thus to leading order these modes provide little information about the structure of the solar interior, although a correction term, essentially reflecting the variation in the appropriate gravitational acceleration with mode properties, provides some sensitivity to the near-surface density profile (Gough 1993; Chitre et al. 1998). The dependence on surface gravity has been used to determine, on the basis of f-mode frequencies, the ‘seismic solar radius’ (Schou et al. 1997) and its variation with solar cycle (e.g., Kosovichev and Rozelot 2018a).
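
As a rough illustration of the leading-order behaviour, the sketch below uses the standard dispersion relation for gravity waves on a deep layer, \(\omega ^2 = g_{\mathrm{s}} k_{\mathrm{h}}\), with \(k_{\mathrm{h}}\) from Eq. (54) evaluated at the surface. The function name and the adopted value of \(G M\) are my assumptions, and the correction term mentioned above is ignored.

```python
import math

GM_SUN = 1.32712440e26    # solar G*M in cm^3 s^-2 (assumed value)
R_SUN_CM = 6.9599e10      # photospheric radius used in Model S, in cm

def f_mode_frequency_mHz(l, R=R_SUN_CM, GM=GM_SUN):
    """Leading-order cyclic f-mode frequency from omega^2 = g_s * k_h, in mHz."""
    g_s = GM / R**2                          # surface gravitational acceleration
    k_h = math.sqrt(l * (l + 1)) / R         # Eq. (54) at r = R
    omega = math.sqrt(g_s * k_h)
    return 1.0e3 * omega / (2.0 * math.pi)

for l in (300, 1000, 2000):
    print(l, round(f_mode_frequency_mHz(l), 3), "mHz")
```

In this approximation \(\omega ^2 = \sqrt{l(l+1)} \, G M / R^3\), so that at given degree the frequency depends on the model only through \(G M / R^3\); this is the sensitivity exploited in determinations of the seismic radius.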

Rotation (or other departures from spherical symmetry) induces a dependence of the frequencies on the azimuthal order m. To leading order the effect of rotation simply corresponds to the advection of the oscillation patterns by the angular velocity as averaged over the region of the Sun sampled by a given mode.

From the dispersion relation for acoustic waves, and Eq. (54), it is straightforward to show that the modes are oscillatory as a function of r in the region of the Sun which lies outside an inner turning point, at distance \(r = r_{\mathrm{t}}\) from the centre satisfying

$$\begin{aligned} {c(r_{\mathrm{t}}) \over r_{\mathrm{t}}} = {\omega \over \sqrt{l(l+1)}}, \end{aligned}$$
(56)

and evanescent interior to this point; here \(\omega = 2 \pi \nu \) is the angular frequency of the mode. Since the sound speed generally increases with decreasing r, the turning point is close to the solar centre for very low degrees at the observed frequencies, the modes becoming increasingly confined near the surface with increasing degree. From a physical point of view this behaviour of the modes corresponds to total internal reflection, owing to the increase in the sound speed with depth, of sound waves corresponding to the given degree: the waves travel horizontally at the inner turning point. With increasing degree the initial direction of the waves at the solar surface is more strongly inclined from the vertical and the turning point is reached closer to the surface.
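
The dependence of the turning point on degree can be illustrated by solving Eq. (56) numerically. The sketch below is only qualitative: the sound-speed profile is a hypothetical function that increases with depth, not a solar model, and the function names and the bisection bracket are my own choices.

```python
import math

R_SUN_CM = 6.9599e10      # photospheric radius used in Model S, in cm

def toy_sound_speed(x):
    """Hypothetical sound speed (cm/s) increasing with depth; x = r/R. Not a solar model."""
    return 5.0e7 * math.sqrt(max(0.0, 1.0 - x * x)) + 7.0e5

def turning_point(l, nu_mHz, c_of_x=toy_sound_speed, R=R_SUN_CM):
    """Solve Eq. (56) for r_t/R by bisection; assumes c(r)/r decreases outwards."""
    if l == 0:
        raise ValueError("for l = 0 the right-hand side of Eq. (56) diverges")
    target = 2.0 * math.pi * nu_mHz * 1.0e-3 / math.sqrt(l * (l + 1))

    def mismatch(x):
        return c_of_x(x) / (x * R) - target

    lo, hi = 1.0e-6, 1.0
    if mismatch(hi) > 0.0:
        raise ValueError("no turning point below the surface for this profile")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0.0:
            lo = mid          # c/r still too large: turning point lies further out
        else:
            hi = mid
    return 0.5 * (lo + hi)

for l in (1, 20, 100, 1000):
    print(l, turning_point(l, 3.0))
```

At fixed frequency the computed \(r_{\mathrm{t}}/R\) moves outwards with increasing degree, as described above; the actual values depend entirely on the assumed profile.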

The frequency of a given acoustic mode reflects predominantly the structure outside the turning point. The observed modes have degree from 0 to more than 1000, and hence turning points varying from very near the solar centre to immediately below the photosphere. This variation in sensitivity allows the determination of the structure with high resolution in the radial direction. Very crudely, the high-degree modes give information about the near-surface region of the Sun. Given this, modes of slightly lower degree can be used to determine the structure at slightly greater depth, and so on, the analysis continuing to the solar core. Similarly, modes of differing azimuthal order have different extent in latitude, those with \(|m| \simeq l\) being confined near the equator and modes with low |m| extending over all latitudes; thus observation of frequencies as a function of m over a range of degrees allows the determination of, for example, the angular velocity as a function of both latitude and distance from the centre.

For completeness I note that there have also been claims of observed solar oscillations with much longer periods. Such modes would be internal gravity waves, or g modes, with greater sensitivity to conditions in the solar core than the acoustic modes.

With the exception of the region just below the surface, and the atmosphere, solar oscillations can be treated as adiabatic to a very high precision. This approximation is generally used in computations of solar oscillation frequencies. However, nonadiabatic effects in the oscillations are undoubtedly important in the near-surface region, as are the processes that excite the modes. The physical treatment of these effects, involving the interaction between convection and the oscillations, is uncertain, and so therefore are their effects on the oscillation frequencies (for a review, see Houdek and Dupret 2015). Also, the structure of the near-surface region of the model is affected by the uncertain effects of convection, including the general neglect of turbulent pressure (cf. Sect. 2.4).

Such inadequacies in modelling the structure and the oscillations very near the solar surface appear to dominate the differences between the observed frequencies and frequencies of solar models (e.g., Christensen-Dalsgaard 1984a; Dziembowski et al. 1988; Christensen-Dalsgaard et al. 1996). Fortunately, the effect of these near-surface uncertainties on the frequencies in many cases has a relatively simple dependence on the mode frequency and degree. This follows from the fact that the physics of the modes, except at very high degree, in the near-surface layers is insensitive to the degree and so, therefore, is the direct effect of these layers on the oscillation frequencies. This, however, must be corrected for the fact that according to Eq. (56) higher-degree modes involve a smaller fraction of the star and hence are easier to perturb. A quantitative measure of this effect is provided by the mode inertia

$$\begin{aligned} E = {\int _V \rho |{\varvec{\delta }}{\varvec{r}}|^2 \mathrm{d}V \over M |{\varvec{\delta }}{\varvec{r}}|_{\mathrm{phot}}^2 }, \end{aligned}$$
(57)

where the integral is over the volume of the star, \({\varvec{\delta }}{\varvec{r}}\) is the displacement vector, and \(|{\varvec{\delta }}{\varvec{r}}|_{\mathrm{phot}}\) is its norm at the photosphere; it may be shown that the frequency shift from a near-surface modification is proportional to \(E^{-1}\) (e.g., Aerts et al. 2010). It is convenient to take out the frequency dependence of the inertia by considering, instead of E, \(Q = E/{\bar{E}}_0(\omega )\), where \({\bar{E}}_0(\omega )\) is the inertia of a radial mode, interpolated to the frequency \(\omega \) of the mode considered, effectively renormalizing the surface effect to the effect on radial modes. The resulting functional form of the effect on the frequencies of the near-surface uncertainties is reflected by the last term in Eq. (61) below (e.g. Christensen-Dalsgaard 1988b; Aerts et al. 2010). Given the very extensive data available on solar oscillations this property of the frequency differences caused by the near-surface effects to a large extent allows their consequences to be suppressed in the analysis of the observed oscillation frequencies, leading to reliable inferences of the internal structure (e.g., Dziembowski et al. 1990; Däppen et al. 1991; Gough 1996b). For distant stars, however, where only low-degree modes are observed, the surface errors represent a significant source of uncertainty in the analysis of the oscillation frequencies. Various procedures have been developed to suppress these effects in fits to the observed frequencies (e.g. Kjeldsen et al. 2008; Ball and Gizon 2014), or, alternatively, the fits can be based on frequency combinations defined to be largely insensitive to them (Roxburgh and Vorontsov 2003; Otí Floranes et al. 2005).
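The renormalization by the radial-mode inertia can be sketched as follows. Since the near-surface frequency shift is proportional to \(E^{-1}\), multiplying it by \(Q = E/{\bar{E}}_0(\omega )\) leaves a quantity that depends on frequency only. The text does not specify how the radial-mode inertia is interpolated; linear interpolation in \(\log E\) is an assumption made here, and all function names are hypothetical.

```python
import bisect
import math


def radial_inertia_at(nu, radial_nu, radial_inertia):
    """Interpolate the radial-mode inertia to frequency nu.

    Assumes radial_nu is sorted in ascending order with at least two entries;
    the interpolation is linear in log(E), which is a choice made for this sketch.
    """
    if not radial_nu[0] <= nu <= radial_nu[-1]:
        raise ValueError("frequency outside the range of the radial modes")
    i = min(bisect.bisect_right(radial_nu, nu) - 1, len(radial_nu) - 2)
    weight = (nu - radial_nu[i]) / (radial_nu[i + 1] - radial_nu[i])
    log_e = ((1.0 - weight) * math.log(radial_inertia[i])
             + weight * math.log(radial_inertia[i + 1]))
    return math.exp(log_e)


def inertia_ratio(nu, inertia, radial_nu, radial_inertia):
    """Q = E / E0bar(nu) for a mode with frequency nu and inertia E."""
    return inertia / radial_inertia_at(nu, radial_nu, radial_inertia)


def scaled_difference(delta_nu, nu, inertia, radial_nu, radial_inertia):
    """Frequency difference multiplied by Q, reducing it to the radial-mode case."""
    return inertia_ratio(nu, inertia, radial_nu, radial_inertia) * delta_nu
```

Applied to frequency differences between two models that differ only near the surface, the scaled differences would be expected to fall close to a single function of frequency.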

How errors in the near-surface region affect the oscillation frequencies can be illustrated by the model differences shown in Fig. 31, between a model using the Canuto and Mazzitelli (1991) treatment of convection and Model S which used the Böhm-Vitense (1958) mixing-length treatment. Frequency differences between these two models are shown in Fig. 37. To compensate for the fact that with increasing degree the modes involve a smaller part of the Sun (cf. Eq. 56) the differences have been scaled by the normalized \(Q_{nl}\), as discussed above. The figure clearly shows that with this scaling the frequency differences are indeed largely independent of the degree.

Fig. 37

Frequency differences for modes of degree \(l \le 100\), scaled by the inertia ratio \(Q_{nl}\), between a model emulating the Canuto and Mazzitelli (1991) treatment of near-surface convection and Model S, in the sense (modified model)–Model S. The corresponding model differences are shown in Fig. 31.

Clearly an important goal is to understand the structure and oscillation dynamics in the near-surface layers better and eventually model them consistently in the calculation of the oscillation frequencies; in this context the otherwise strongly constrained solar case will serve as an important test. A key aspect is the treatment of convection in the equilibrium model and the oscillations (see also Sect. 2.5). Schlattl et al. (1997) used a detailed atmospheric model and modelled the outer layers of the convection zone by a variable mixing-length parameter matched to a two-dimensional hydrodynamical simulation of convection; they noted that the resulting model matched the observed solar oscillation frequencies better than did the normal model. A similar improvement of the frequencies was obtained by Rosenthal et al. (1995, 1999) and Robinson et al. (2003) by including suitable averages of convection simulations in the modelling (see also Sect. 2.5). Sonoi et al. (2015) and Ball et al. (2016) studied the effect on stellar oscillation frequencies of using averaged simulations as the outer parts of stellar models, for a range of stellar parameters. Magic and Weiss (2016) also considered the patching of averaged simulations to solar models and in addition devised corrections to the depth scale and density in normal one-dimensional models that mimicked the effects on the frequencies of the patching. In addition to normal simulations they carried out simulations with magnetic fields, representing more active areas of the solar surface, determining the effect of the resulting change in the structure of the solar layers on the oscillation frequencies, although without considering the direct effect of the field on the oscillations. The analysis was extended to a broad range of stellar parameters, ranging from the main sequence to the red-giant branch, by Trampedach et al. (2017), who emphasized the importance of both the expansion of the near-photospheric layers by the effect of turbulent pressure and the so-called ‘convective back-warming’, i.e., the effects of the convective fluctuations on the strongly temperature-sensitive opacity. In similar analyses, Sonoi et al. (2017) included also some effects of the perturbation to the turbulent pressure, based on a time-dependent convection formulation restricted to adiabatic oscillations, while Manchon et al. (2018) emphasized the sensitivity of the near-surface frequency shifts to the metallicity of the stars.

An equally important contribution to the deficiencies in the model frequencies is the physics of the oscillations in the near-surface region. Here the energetics of the oscillations, including the perturbations to the convective flux, must be taken into account in fully nonadiabatic calculations, and the perturbation to the turbulent pressure has a significant effect on the frequencies and the damping of the modes. To treat these effects requires a time-dependent modelling of convection (see Houdek and Dupret 2015, for a review). Time-dependent versions of the mixing-length theory were established by Unno (1967) and Gough (1977b) and have been further developed since then. With a few exceptions the nonadiabatic calculations show that the modes are intrinsically damped; they are excited to the observed amplitudes by stochastic forcing from convection, as confirmed by analysis of the observed amplitude distribution (Chaplin et al. 1997). Consequently the observed linewidths in the frequency power spectra provide a measure of the damping rates of the modes, allowing calibration of parameters in the convection modelling such that the computed damping rates match the observed linewidths. Combining results from hydrodynamical simulations of the outer layers with nonadiabatic computations using a non-local time-dependent convection treatment including also the turbulent-pressure perturbation, Houdek et al. (2017), as illustrated in Fig. 38, obtained a much improved fit to the observed solar frequencies, at the same time showing a reasonable fit to the observed damping rates. Analyses of intrinsic or induced oscillations in hydrodynamical simulations are providing further insight into the physics of the interaction between convection and the oscillations (Belkacem et al. 2019; Zhou et al. 2019), which may be used further to improve the simplified treatments based on mixing-length formulations. In an interesting analysis, Schou and Birch (2020) determined the frequency correction caused by the effect on the oscillations of convection dynamics by matching eigenfunctions in standard oscillation calculations to eigenfunctions resulting from the convection simulations.
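The link between linewidth and damping rate mentioned above can be made concrete with a small sketch. It assumes the standard idealization of a stochastically excited, linearly damped oscillator whose amplitude decays as \(\exp (-\eta t)\); the limit power spectrum is then a Lorentzian whose full width at half maximum in cyclic frequency is \(\varGamma = \eta /\pi \). The function names are hypothetical, and actual linewidth determinations involve fitting such profiles to noisy spectra, which is not attempted here.

```python
import math


def lorentzian_profile(nu, nu0, linewidth, height):
    """Lorentzian power profile centred on nu0 with FWHM `linewidth`."""
    x = 2.0 * (nu - nu0) / linewidth
    return height / (1.0 + x * x)


def damping_rate(linewidth_hz):
    """Amplitude damping rate eta in s^-1 for a FWHM linewidth given in Hz."""
    return math.pi * linewidth_hz


def mode_lifetime(linewidth_hz):
    """e-folding time of the mode amplitude, 1/eta, in seconds."""
    return 1.0 / damping_rate(linewidth_hz)


if __name__ == "__main__":
    # A linewidth of 1 microHz corresponds to a lifetime of a few days.
    print(mode_lifetime(1.0e-6) / 86400.0, "days")
```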

Fig. 38

Differences, reduced to the case of radial modes (with \(l = 0\)), between observed and modelled solar oscillation frequencies against frequency, in the sense (Sun)–(Model). The dot-dashed curve uses adiabatic frequencies for a model essentially corresponding to Model S (Christensen-Dalsgaard et al. 1996, see Sect. 4.1). The solid curve is based on a model where the outermost layers were replaced by a suitable average of a three-dimensional radiative-hydrodynamic simulation of convection. In addition, the frequencies were obtained from nonadiabatic calculations taking the interaction with convection, including turbulent pressure, into account. (Image reproduced with permission from Houdek et al. (2017), copyright by the authors.)

Investigations of the structure and physics of the solar interior

Very extensive helioseismic data have been acquired over the past decades, from ground-based networks of observatories and from space (for further details, see for example Christensen-Dalsgaard 2002; Aerts et al. 2010). In most cases observations of radial velocity are carried out, based on the Doppler effect, extending over months or years to achieve sufficient frequency resolution, reduce the background noise and follow possible temporal variations in the Sun. Spatially resolved observations are analysed to isolate modes corresponding to a few combinations of \((l, m)\) (see footnote 46). From the resulting time series, power spectra are constructed through Fourier transforms, and the frequencies of solar oscillations are determined from the positions of the peaks in the power spectra. Low-degree modes have been studied in great detail through observations in disk-integrated light, observing the Sun as a star, from the BiSON (Chaplin et al. 1996; Hale et al. 2016) and IRIS (Fossat 1991) networks, and with the GOLF instrument on the SOHO spacecraft (Gabriel et al. 1997). Modes of degree up to around 100 were studied for an extended period of time with the LOWL instrument (Tomczyk et al. 1995), extended to the two-station ECHO network, which has now stopped operation. Also, the six-station GONG network (Harvey et al. 1996) has yielded nearly continuous data for modes of degree up to around 150 since late 1995, whereas modes including even higher degrees were studied with the SOI/MDI instrument on SOHO (Scherrer et al. 1995; Rhodes et al. 1997). Since May 2010 these high-resolution observations have been taken over by the HMI instrument on the Solar Dynamics Observatory (Hoeksema et al. 2018), with regular MDI observations ending in April 2011. Detailed analyses of the BiSON low-degree observations were carried out by Broomhall et al. (2009) and Davies et al. (2014), while Larson and Schou (2015, 2018) analysed the MDI and HMI observations for modes of degree up to \(l \approx 300\). At even higher degree the modes lose their individual nature owing to the decreasing separation between adjacent modes and the increasing damping rates; thus the analysis of these modes is affected by systematic errors and interference between the modes (Korzennik et al. 2004; Rabello-Soares et al. 2008). Here special techniques are required for the frequency determination as discussed, e.g., by Reiter et al. (2015) and Reiter et al. (2020), who analysed a 66-day high-resolution set of MDI observations. It should be noted that, according to Eq. (56), these high-degree modes have their lower turning point quite close to the surface; this makes them particularly interesting for the study of the near-surface layers (e.g., Di Mauro et al. 2002), where thermodynamic effects associated with helium and hydrogen ionization become relevant, and where, as discussed above, the properties of the structure and the oscillations are somewhat uncertain. Very extensive high-resolution data are being obtained with HMI, but these have apparently so far not been analysed to determine properties of high-degree modes.
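The step from a velocity time series to a power spectrum and a peak frequency can be sketched as below. This is a deliberately minimal illustration on synthetic data: it uses a slow direct discrete Fourier transform to avoid external dependencies, and simply picks the highest bin, whereas the analyses cited above use fast transforms and fit line profiles to the peaks. The function names and the synthetic signal are assumptions made for the example.

```python
import cmath
import math


def power_spectrum(samples, cadence):
    """Return (frequencies in Hz, power) for a real, evenly sampled series.

    cadence is the sampling interval in seconds; the frequency resolution is
    1/(n * cadence), the inverse of the total duration of the series.
    """
    n = len(samples)
    mean = sum(samples) / n
    data = [s - mean for s in samples]
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(d * cmath.exp(-2j * math.pi * k * j / n)
                    for j, d in enumerate(data))
        freqs.append(k / (n * cadence))
        power.append(abs(coeff) ** 2 / n**2)
    return freqs, power


def peak_frequency(freqs, power):
    """Frequency of the bin with the highest power."""
    k = max(range(len(power)), key=power.__getitem__)
    return freqs[k]


if __name__ == "__main__":
    cadence = 60.0  # seconds between samples (assumed)
    series = [math.sin(2.0 * math.pi * 3.0e-3 * cadence * j) for j in range(600)]
    freqs, power = power_spectrum(series, cadence)
    print(peak_frequency(freqs, power))
```

Because the bin spacing is the inverse of the duration of the series, resolving closely spaced modes requires the long observing runs mentioned above.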

Owing to their great potential for helioseismic investigations the g modes have been the target of major observational efforts. García et al. (2007) inferred the presence of g modes with the expected nearly uniform period spacing from periodicities in the power spectrum of GOLF observations. However, a review by Appourchaux et al. (2010) found that the attempts up to that point to detect g modes were inconclusive. Recently, Fossat et al. (2017) claimed evidence for g modes of degree \(l = 1\) and 2 through an ingenious and complex analysis of the spacing between solar acoustic low-degree modes observed with GOLF. In a follow-up study Fossat and Schmider (2018) extended this to modes of degree 3 and 4. Interestingly, the results indicated a rapid rotation of the solar core, possibly at variance with the results obtained from the analysis of solar acoustic modes (see Fig. 44 below). However, Schunker et al. (2018), repeating the analysis, found that the results were very sensitive to the details of the fits, including the assumed starting time of the time series of observations. A similar sensitivity to the details of the analysis was found by Appourchaux and Corbard (2019), analysing a recalibrated version of the GOLF data (Appourchaux et al. 2018); on this basis they concluded that the results of Fossat et al. (2017) and Fossat and Schmider (2018) were artefacts of the methodology. Also, the physical effects that might introduce the g-mode signal in the acoustic-mode properties are so far unclear. Indeed, although Kennedy et al. (1993) had already proposed this type of analysis, they noted that the coupling between the modes is such that to leading order the p-mode frequencies are insensitive to g modes of odd degree (see also Gough 1993), in conflict with the inferences of Fossat et al. This was analysed in more detail by Böning et al. (2019) and Scherrer and Gough (2019). Furthermore, Scherrer and Gough confirmed and extended the results of Schunker et al. (2018) and tried, and failed, to find a similar signal in the MDI and HMI data; they also noted that the inferred rapid rotation of the solar core is difficult to reconcile with the constraints obtained from extensive analyses of well-observed solar acoustic modes (see Sect. 5.1.4). Thus the evidence for solar g modes remains uncertain, and I shall not consider them further in this review.

From Eq. (56) it follows that acoustic modes of low degree penetrate to the stellar core. This is particularly important for investigations of distant stars, where only low-degree modes are observed (see Sect. 7), but low-degree acoustic modes have also been important for the study of the solar core, not least in connection with the solar neutrino problem (e.g., Elsworth et al. 1990, see also Sect. 5.2). The cyclic frequencies \(\nu _{nl} = \omega _{nl}/2\pi \) of these modes satisfy the asymptotic relation (Tassoul 1980; Gough 1993)

$$\begin{aligned} \nu _{nl} \approx \varDelta \nu \left( n + {l \over 2} + \varepsilon \right) - d_{nl}, \end{aligned}$$
(58)

where the large frequency separation

$$\begin{aligned} \varDelta \nu = \left( 2 \int _0^R {\mathrm{d}r \over c} \right) ^{-1} \end{aligned}$$
(59)

is the inverse of the acoustic travel time across a stellar diameter and \(\varepsilon \) is a frequency-dependent phase related to the near-surface layers. Thus to leading order the frequencies are uniformly spaced in radial order, with degeneracy between modes with the same \(n + l/2\). This degeneracy is lifted by the small correction term \(d_{nl}\), leading to the small frequency separations

$$\begin{aligned} \delta \nu _{nl} = \nu _{nl} - \nu _{n-1\,l+2} \simeq - (4 l + 6) {\varDelta \nu \over 4 \pi ^2 \nu _{nl}} \int _0^R {\mathrm{d}c \over \mathrm{d}r} {\mathrm{d}r \over r}. \end{aligned}$$
(60)

Since the integral is strongly weighted towards the stellar centre, \(\delta \nu _{nl}\) is a useful diagnostic for the properties of the stellar core, including stellar age (e.g., Christensen-Dalsgaard 1984b, 1988a; Ulrich 1986, see also Sect. 6.2).
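The structure of Eqs. (58) and (60) can be illustrated with a small sketch. It assumes a constant phase \(\varepsilon \) and takes \(d_{nl} = l(l+1) D_0\) with a constant \(D_0\), a hypothetical simplification chosen so that the difference \(d_{n-1\,l+2} - d_{nl}\) reproduces the factor \((4l + 6)\) of Eq. (60); in reality both \(\varepsilon \) and the core integral depend on frequency. The numerical values are round numbers of roughly solar magnitude, not fitted quantities.

```python
DELTA_NU = 135.0   # assumed large frequency separation in microHz
EPSILON = 1.5      # assumed constant phase (frequency dependent in reality)
D0 = 1.5           # hypothetical constant in microHz standing in for the core integral


def asymptotic_frequency(n, l):
    """Eq. (58) with the illustrative choice d_nl = l(l+1) * D0, in microHz."""
    return DELTA_NU * (n + 0.5 * l + EPSILON) - l * (l + 1) * D0


def large_separation(n, l):
    """nu_{n,l} - nu_{n-1,l}: spacing between consecutive radial orders."""
    return asymptotic_frequency(n, l) - asymptotic_frequency(n - 1, l)


def small_separation(n, l):
    """nu_{n,l} - nu_{n-1,l+2}: the leading terms cancel, leaving (4l + 6) * D0."""
    return asymptotic_frequency(n, l) - asymptotic_frequency(n - 1, l + 2)


if __name__ == "__main__":
    print(large_separation(20, 0), small_separation(20, 0), small_separation(20, 1))
```

In this simplified form the small separation isolates \(D_0\), and hence the core-weighted integral it stands for, because the terms proportional to \(\varDelta \nu \) are identical for the two modes being differenced.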

The extensive sets of observed solar oscillation frequencies make possible detailed inferences of the properties of solar structure, through inverse analyses of the observations. Reviews of such inversion techniques were given by, for example, Gough and Thompson (1991), Gough (1996b), Basu and Antia (2008) and Basu (2016). Assuming adiabatic oscillations, the frequencies are determined by the dependence of pressure, density and gravity on r, as well as on \(\varGamma _1\) which relates the perturbations to pressure and density. However, given that the solar model satisfies the equations of hydrostatic support and mass, Eqs. (1) and (2), the mass m and p can be computed once \(\rho (r)\) is specified. It follows that the adiabatic oscillation frequencies are fully defined if \((\rho (r), \varGamma _1(r))\) is specified. Alternatively equivalent pairs can be used; given that the frequencies of acoustic modes are predominantly determined by the sound speed, convenient choices are \((c^2, \rho )\) or \((u, \varGamma _1)\), \(u = p/\rho \) being the squared isothermal sound speed.
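The statement that \(m\) and \(p\) follow from \(\rho (r)\) can be sketched numerically. The example assumes the equations of hydrostatic support and mass in their standard forms, \(\mathrm{d}p/\mathrm{d}r = - G m \rho / r^2\) and \(\mathrm{d}m/\mathrm{d}r = 4 \pi r^2 \rho \), integrates them with the trapezoidal rule on a grid starting at \(r = 0\), and takes zero pressure at the outer boundary by default. The function names and the boundary treatment are assumptions of this sketch rather than the procedure of any particular inversion code.

```python
import math

G = 6.674e-11  # gravitational constant in SI units


def mass_and_pressure(r, rho, surface_pressure=0.0):
    """Integrate m(r) outwards and p(r) inwards from a density profile.

    r must be increasing with r[0] == 0; rho holds the density at each r.
    """
    n = len(r)
    m = [0.0] * n
    for i in range(1, n):
        shell_lo = 4.0 * math.pi * r[i - 1] ** 2 * rho[i - 1]
        shell_hi = 4.0 * math.pi * r[i] ** 2 * rho[i]
        m[i] = m[i - 1] + 0.5 * (shell_lo + shell_hi) * (r[i] - r[i - 1])

    def weight(i):
        # -dp/dr = G m rho / r^2, which tends to zero at the centre.
        return G * m[i] * rho[i] / r[i] ** 2 if r[i] > 0.0 else 0.0

    p = [surface_pressure] * n
    for i in range(n - 2, -1, -1):
        p[i] = p[i + 1] + 0.5 * (weight(i) + weight(i + 1)) * (r[i + 1] - r[i])
    return m, p


def squared_sound_speeds(p, rho, gamma1):
    """Return (c^2, u) with u = p/rho and c^2 = Gamma_1 * u at each grid point."""
    u = [p_i / rho_i for p_i, rho_i in zip(p, rho)]
    c2 = [g_i * u_i for g_i, u_i in zip(gamma1, u)]
    return c2, u
```

For a uniform-density sphere the result can be checked against the analytical solution \(p(r) = (2 \pi / 3) G \rho ^2 (R^2 - r^2)\), which provides a simple consistency test of the integration.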

It was demonstrated by Gough (1984a) that a simple asymptotic relation for the frequencies, first found by Duvall (