Inversion of the radiative transfer equation for polarized light

Since the early 1970s, inversion techniques have become the most useful tool for inferring the magnetic, dynamic, and thermodynamic properties of the solar atmosphere. The intrinsic model dependence makes it necessary to formulate specific means that include the physics in a properly quantitative way. The core of this physics lies in the radiative transfer equation (RTE), where the properties of the atmosphere are assumed to be known while the unknowns are the four Stokes profiles. The solution of the (differential) RTE is known as the direct or forward problem. From an observational point of view, the problem is rather the opposite: the data are made up of the observed Stokes profiles and the unknowns are the solar physical quantities. Inverting the RTE is therefore mandatory. Indeed, the formal solution of this equation can be considered an integral equation. The solution of such an integral equation is called the inverse problem. Inversion techniques are automated codes aimed at solving the inverse problem. The foundations of inversion techniques are critically revisited with an emphasis on making explicit the many assumptions underlying each of them. An incremental complexity procedure is advised for the implementation in practice. Coarse details of the profiles or coarsely sam- pled profiles should be reproduced first with simple model atmospheres (with, for example, a few physical quantities that are constant with optical depth). If the Stokes profiles are well sampled and differences between synthetic and observed ones are greater than the noise, then the inversion should proceed by using more complex models (that is, models where physical quantities vary with depth or, eventually, with more than one component). Significant im- provements are expected as well from the use of new inversion techniques that take the spatial degradation by the instruments into account.


Introduction
Unlike other branches of physics, astrophysics cannot apply the third pillar of the scientific method, experimentation. After observing nature and conjecturing laws that govern its behavior, astronomers cannot carry out experiments that confirm or falsify the theory. Experimentation is then substituted by new observations conducted to check the theoretical predictions. The intrinsic inability for directly measuring the celestial objects adds a special difficulty to the astrophysical tasks. We do not have thermometers, weighing scales, tachometers, magnetometers that can directly gauge the physical conditions in the object. Rather we have to be content with indirect evidence or inferences obtained from the only real astrophysical measurements, namely those related to light. The intensity and polarization properties for visible light, the associated electric field for radio frequencies, or the energy or momentum of high energy photons, as functions of space, wavelength, and time, can be fully quantified with errors that are directly related to the accuracy of the instruments. 1 From these real measurements, the observational astronomer must deduce or infer the physical quantities that characterize the object with uncertainties that depend on both the experimental errors and the assumptions that allow him/her to translate light-derived quantities into the object quantities. Observational astrophysics could hence probably be defined as the art of inferring the physical quantities of heavenly bodies from real measurements of the light received from them.
Somehow, these astrophysical tasks can be mathematically seen as a mapping between two spaces, namely the space of observables and that of the object's physical quantities. The success of the astronomer then depends on his/her ability (the art ) to characterize not only the mapping but the two spaces. On the observable side, what really matters is the specific choice of measurable parameters and how well they are measured; that is, how many light parameters are obtained (the signal) and which are the measurement errors (the noise). On the object's physical condition side, what is substantive is the selection of quantities to be inferred. Of course, the finer the -affordable-detail in describing any of the two spaces, the better. The keyword is affordable because infinite resolution does not exist in the real world: a compromise is always in order between the number of available observables and the number of inferred physical quantities. The representation of both spaces therefore needs approximations that constrain the sub-spaces to be explored and how they are described: which Stokes parameters, with which wavelength and time sampling, and with which instrument profile and resolution on the one hand, and, on the other, which quantities and how they are assumed to vary on the object with time and space. Concerning the mapping, this should represent the physics that generates the observables from the given physical conditions in the object and thus illustrates the dependence of the observables on given physical quantities. Understanding this physics is crucial if the researcher is to select observables that are as "orthogonal" as possible; that is, that depend mostly on one physical quantity and not on the others. Certainly, the physics mapping needs approximations as well. These approximations depend a great deal on the observables and on the object's physical quantities; for example, the assumptions cannot be the same if you have fully sampled Stokes profiles or just a few wavelength samples; different hypotheses apply for physical quantities that do or do not vary with depth in the atmosphere, or that are expected to present a given range of magnitudes. Therefore, mappings may include (often over-simplistic) one-dimensional calibration curves between a given observable parameter and a given physical quantity, or complicated multidimensional relationships between observables and quantities that require the definition of a metric or distance in at least one of the two spaces.
Even in the simplest situations, the relationship between observables and quantities does not have to be linear and may depend on the specific sub-space of the physical parameters. For example, a calibration curve based on the weak-field approximation may apply for a given range of magnetic fields but saturate for stronger ones (see Sections 2.4 and 3.1.2). But, when the problem can be assumed to be multidimensional, covariances appear because single observables rarely depend on just a single quantity (see Section 7). For example, a given spectral line Stokes V profile can seemingly grow or weaken by the same amount owing to changes in temperature or magnetic field strength (e.g., Del Toro Iniesta and Ruiz Cobo, 1996). An example can be seen in Figure 1, where two apparently equal V profiles come from two different atmospheres. With all these ingredients at hand, the astrophysical analysis of observations is a non-linear, fully involved, topological task where many decisions have to be made (the art ) and, hence, cannot be taken for granted. The techniques by which astronomers have obtained information about the physical conditions in the object have evolved in parallel to technological advancements; that is, to the available means we have of gathering such information. The community has gradually enhanced its knowledge from medium-band measurements including one or several spectral lines to very fine wavelength sampling of the four Stokes profiles of single or multiple spectral lines; from old curves of growth for equivalent widths to highly sophisticated techniques that include the solution of the radiative transfer equation (RTE). The finer the information, the more complete the physical description.
Following Socas-Navarro (2001), let us consider the simplest case of having a single observable parameter, the Doppler displacement with respect to the rest position of the spectral line, ∆λ, and a single physical quantity to derive, the line-of-sight (LOS) velocity, v LOS . Imagine that we measure ∆λ by finding the minimum (or the maximum in the case of an emission line) of the intensity profile. The biunivocal mapping between the one-dimensional space of observables -that containing all possible Doppler displacements-and the one-dimensional space of physical quantities -that of LOS velocities-is given by the Doppler formula where λ 0 stands for the vacuum rest wavelength position of the line and c for the speed of light. This simple inference relationship requires at least three implicit physical assumptions for the Doppler displacement to be properly defined and measured; namely that a) the solar feature is spatially resolved, b) the line is in pure absorption (or pure emission), and c) v LOS is constant along the LOS. First, if we have unresolved structures we cannot ascribe the inferred velocity to any of them. Second, lines with core reversals, either in absorption or in emission, do not qualify for the extremum-finding method. And third, as soon as we have an asymmetric profile, ∆λ can no longer be properly defined for the line but for a given height through the profile, and then the mapping in Eq.
(1) immediately loses its meaning. While in the case of a constant velocity, we properly infer that velocity, in the presence of gradients we infer a value corresponding only to the -in principle unknown-layers where the core of our line has been formed (typically the highest layers of the atmosphere). We measure a velocity but we do not know which one. Strictly speaking, the same measurement corresponds to different physical quantities depending on the assumptions. Of course we could complicate our problem a little and try to determine the stratification of LOS velocities with height, or simply estimate a gradient, by measuring the so-called bisector, the geometric position of those points equidistant from both wings of the profile at a given depth. At that point, our spaces have increased their dimensions and Eq. (1) is no longer the sole ingredient of our mapping because we must add some more physical assumptions to interpret the different displacements of the bisector in terms of velocities at different heights in the atmosphere. Hence, depending on the assumed physics, the quantitative results may change. This easy example has been used to illustrate that even the simplest inference is dependent on physical assumptions. This is an inherent property of astrophysical measurement and no one can escape from it: the same observable can mean different things depending on the assumed underlying physics. Most of the criticisms of the inversion techniques that are reviewed in this paper often come from this lack of uniqueness of the results. Many authors claim that the inversion of the RTE is an ill-posed problem. This being true, one should realize that astrophysics itself is indeed ill-conditioned, and this is a fact we have to deal with, either willingly or not. The physics connecting the object quantities with the observable parameters is of paramount significance and deserves a little consideration at this point. Radiative transfer is the discipline encompassing the generation and transport of electromagnetic radiation through the solar (stellar) atmosphere. Hence, the mapping between the two spaces will be based upon it and depend on its degrees of approximation. The specification of the radiation field through a scattering atmosphere was first formulated as a physical problem by Strutt (Lord Rayleigh) (1871a,b, 1881, 1899). In the astrophysical realm, the problem was posed in the works by Schuster (1905) and Schwarzschild (1906) without taking polarization into account. After that, although not known to the astrophysical community, Soleillet (1929) presented a theory of anisotropic absorption that is nothing but a rigorous formulation of the radiative transfer equation. Very importantly, he used the formalism proposed by Stokes (1852) to deal with partially polarized light. It was not, however, until the works by Chandrasekhar (1946aChandrasekhar ( ,b, 1947 that the transfer problem of polarized light was settled as an astrophysical problem on its own. The Stokes formalism has regularly been used since then in the astronomical literature. After Hale's (1908) discovery of sunspot magnetic fields, the interpretation of the solar (stellar) spectrum of polarized light became necessary and a full theory has been developed since the mid 1950s. The first modern formulation of an equation of radiative transfer for polarized light was presented by Unno (1956), who also provided a solution in the simplified case of a Milne-Eddington (ME) atmosphere. Only absorption processes were taken into account and a complete description had to wait until the works by Rachkovsky (1962aRachkovsky ( ,b, 1967, who also included dispersion effects (the so-called magneto-optical effects). These two derivations were phenomenological and somewhat heuristic. A rigorous derivation of the radiative transfer equation (RTE) based on quantum electrodynamics was obtained by Landi degl' Innocenti and Landi degl'Innocenti (1972). Later, four derivations of the RTE from basic principles of classical physics were published by Jefferies et al. (1989), Stenflo (1991; see also Stenflo 1994), Landi Degl'Innocenti (1992see also Landi Degl'Innocenti andLandolfi 2004), andDel Toro Iniesta (2003b). A discussion of the RTE and the several assumptions used in various available inference techniques is deferred to Section 2.
Certainly, any inference has to be based on solutions of the RTE because it relates the observable Stokes spectrum with the unknowns of the problem; namely, the physical quantities characterizing the state of the atmosphere they come from. No matter how simplified such solutions can be, it is natural to compare the observations with theoretical calculations in prescribed sets of physical quantities. The comparison of observational and synthetic parameters results in values for the sought-for quantities that may be refined in further iterations by changing the theoretical prescriptions. This trial-and-error method can be practical when the problem is very simple (involving a few free parameters) but can become unsuitable for practical use if the number of free parameters is large. Even automated trial-and-error -i.e., Monte Carlo-methods may fail to converge to a reliable set of physical conditions in the medium. Some more educated techniques are needed to finally work out that convergence between observed and synthetic parameters.
Generally speaking, any method in which information about the integrand of an integral equation is obtained from the resulting value of the integral is called an inversion method. In our particular case, it is straightforward to write the synthetic Stokes spectra as an integral involving a kernel that depends on the physical conditions of the atmosphere (see Eq. 8). In fact, the emergent formal solution of the RTE is the most basic type of integral equation, namely a Fredholm equation of the first type, because both integration limits are fixed. Consequently, we will call inversion codes or inversion techniques those methods that (almost) automatically succeed in finding reliable physical quantities from a set of observed Stokes spectra because we shall understand that they indeed automatically solve that integral equation. There is a whole variety of flavors depending on the several hypotheses that can be assumed, but all of them share the characteristic feature of automatically minimizing a distance in the topological space of observables. The idea had already been clearly explained in the seminal work by Harvey et al. (1972): "Solve for B on the bases of best fit of the observed profiles to the theoretical profiles". And the free parameters for such a best fit were found through least squares minimization of the profile differences. They obtained only an average longitudinal field component because their Stokes Q and U observations were not fully reliable and magneto-optical effects were not taken into account, but the fundamental idea underlying many of the current techniques can already be found in that very paper, including a simple two-component model to describe the possible existence of spatially unresolved magnetic fields.
In a thorough study using synthetic Stokes profiles, Auer et al. (1977) proposed a new inversion method based on Unno's theory and tested its behavior in the presence of several realistic circumstances, such as asymmetric profiles, magnetic field gradients, magneto-optical effects, and unresolved magnetic features. This technique was later generalized by Landolfi et al. (1984) to include magneto-optical and damping effects. The numerical check of the code was fairly successful but neither the original code by Auer et al. (1977) nor the new one by Landolfi et al. (1984) were applied to observations. Independently of the latter authors, the preliminary studies by , Lites and Skumanich (1985), and  jelled in what has been one of the most successful ME inversion codes so far by Skumanich and Lites (1987), later extended by Lites et al. (1988) to mimic a chromospheric rise in the source function (see Section 2.3). This code has been extensively used with observational data, most notably those obtained with the Advanced Stokes Polarimeter (Elmore et al., 1992).
Based on the thin flux tube approximation, Keller et al. (1990) proposed an inversion code for extracting physical information not from the Stokes profiles themselves but from several parameters calculated from I and V observations of a plage and a network. Two years later, Solanki et al. (1992a) presented a new inversion code whereby from the whole Stokes I and V profiles they selected among a handful of prescribed temperature stratifications and inferred height-independent magnetic field strength and inclination, Doppler shift, filling factor (surface fraction in the resolution element covered by magnetic fields), macro-and microturbulent velocities, and some atomic parameters of the spectral line. The very same year, Ruiz Cobo and Del Toro Iniesta (1992) introduced SIR, an acronym for Stokes Inversion based on Response functions. Like the former codes, SIR ran a non-linear, least-squares, iterative Levenberg-Marquardt algorithm but with a remarkable step-forward feature: physical quantities characterizing the atmosphere were allowed to vary with optical depth. The increase of free parameters can generate a singularity problem: the variation of some atmospheric parameters may not produce any change on the synthetic spectra or, in other cases, different combinations of the perturbation of several parameters may produce the same change in the spectra. The success of SIR lies in regularizing the problem through a tailored Singular Value Decomposition method (SVD). This allows, in principle, to look for any arbitrarily complex atmospheric stratification. The three components of the magnetic field, the LOS velocity, the temperature stratification, and the microturbulence may have any height profile. The code also infers height-independent microturbulent velocity and filling factor. The possibility exists for also fitting some atomic parameters (e.g., Allende Prieto et al., 2001). but they are typically fixed in practice. The code can be applied to any number of spectral lines that are observed simultaneously. SIR has been successful in a large number of observing cases and its use is still spreading among the community.
Following SIR's strategy (that is, using response functions, nodes, Levenberg-Marquardt, and SVD), an evolution of the Solanki et al. (1992a) code called SPINOR was presented by Frutiger and Solanki (1998) that also allowed for height variations of the physical quantities and included the possibility of multi-ray calculations assuming the thin flux tube approximation. Sánchez Almeida see also 1997) were presented. The first (based on an earlier code by Socas-Navarro et al. 1998 without taking either polarization or magnetic fields into account) included non-LTE radiative transfer (see Section 2.1), and the second was specifically designed for analyzing Stokes I and V profiles in terms of the thin flux tube approximation by using an analytic shortcut for radiative transfer proposed by Del Toro Iniesta et al. (1995, see Section 3.2.3). On their hand, Rees et al. (2000) proposed a Principal Component Analysis (PCA), which worked by creating a database of synthetic Stokes profiles by means of an SVD technique. In such a database, given eigenprofiles are obtained that are later used as a basis for expanding the observed Stokes profiles. Hence, the description of observations can be made with the help of a few coefficients, thus speeding up the inversion process. One year later, LILIA (LTE Inversion based on Lorien Iterative Algorithm), a code with similar properties as SIR, was presented by Socas-Navarro (2001) and FATIMA (Fast Analysis Technique for the Inversion of Magnetic Atmospheres), a PCA code, was introduced by Socas-Navarro et al. (2001). A different technique was proposed by , see also Socas-Navarro 2003 that used artificial neural networks (ANNs) whereby the system was trained with a set of synthetic Stokes profiles. The structure obtained therefrom finds the solution for the free parameters by interpolating among the known ones. Although the training can be slow, the inversion of observational data is very fast. In practice, both the synthetic training set of ANNs and the synthetic database of PCA have employed ME profiles to keep the implementation feasible. Otherwise, the number of free parameters would render the two techniques impracticable. A PCA 8 code to analyze the Hanle effect in the He i D 3 line was developed by López Ariste and Casini (2003, see also Casini et al., 2005.
A substantial modification of the original SIR code, called SIRGAUSS, was presented by Bellot Rubio (2003) in which the physical scenario included the coexistence of an inclined flux tubethat is pierced twice by the LOS-within a background. Such a scenario is used to describe an uncombed field model of sunspot penumbrae (Solanki and Montavon, 1993). An evolution of this inversion code, called SIRJUMP, was later used by Louis et al. (2009) that was able to infer possible discontinuities in the physical quantities along the LOS. A further code presented by Asensio Ramos (2004) was able to deal with the Zeeman effect in molecular lines. The very same year, Lagg et al. (2004) published HeLIx, an ME inversion code that dealt with the Hanle and the Zeeman effect in the He i line at 1083 nm. Another ME inversion code was presented by Orozco Suárez and Del Toro Iniesta (2007) with the helpful feature that was written in IDL, so that it is easily manipulated by relatively inexperienced users and employed as a routine in highlevel programming pipelines. Also in 2007, Bommier et al. took over the Landolfi et al. (1984) method and extended it to include unresolved magnetic structures. Unfortunately, they fail to obtain the magnetic field strength and the filling factor separately; only their product is reliable. Self-consistent levels of confidence in the ME inversion results were estimated through the code proposed by Asensio Ramos et al. (2007a) using Bayesian techniques. A rigorous treatment of optical pumping, atomic level polarization, level crossings and repulsions, Zeeman, Paschen-Back, and Hanle effects on a magnetized slab was included in HAZEL (Asensio , with which analysis of the He i D 3 and the multiplet at 1083 nm can be carried out. Oriented to its extensive use with the data coming from the Helioseismic and Magnetic Imager (Graham et al., 2003) aboard the Solar Dynamics Observatory,  presented VFISV (Very Fast Inversion of the Stokes Vector), a new ME code but with several further approximations and simplifying assumptions to make it significantly faster than other available codes. Mein et al. (2011) presented an alternative inversion code in which, with a significant number of simplifying assumptions on top of the ME approximation (such Stokes I profiles being Gaussians and magneto-optical effects being almost negligible), some moments of the Stokes profiles are used to retrieve the vector magnetic field and the LOS velocity. In 2012, a significant step forward was provided by van Noort, who combined spectral information with the known spatial degradation effects on two-dimensional maps to obtain a consistent restoration of the atmosphere across the whole field of view. An aim similar to van Noort's is followed by Ruiz Cobo and Asensio Ramos (2013), who, by means of a regularized method (indeed based on PCA), deconvolve the spectropolarimetric data that are later inverted with SIR. Based on the concept of sparsity, Asensio  have proposed a novel technique that allows the inversion of two-dimensional (potentially three-dimensional) maps at once.
The interested reader can complement this chronological overview with the reviews by Del Toro Iniesta and Ruiz , Socas-Navarro (2001), Del Toro Iniesta (2003a), Bellot Rubio (2006, and Asensio Ramos et al. (2012) and the didactical introductions and discussions by Stenflo (1994), Del Toro Iniesta (2003b, and Landi Degl'Innocenti and Landolfi (2004). A critical discussion on the different techniques and the specific implementations will be developed throught the paper, which is structured as follows: the basic assumptions of radiative transfer are discussed in Section 2; the following two Sections discuss the approximations used for the model atmospheres and the Stokes profiles; an analysis of the forward problem, namely the synthesis of the Stokes spectrum, is presented in Section 5, which is followed by an analysis of the sensitivities of spectral lines to physical quantities (Section 6); the basics of inversion techniques are analyzed in Section 7 and a discussion on inversion results presented in Section 8; finally, Section 9 summarizes the conclusions. An appendix proposes an optimum way of initializing the inversion codes through the use of classical estimates. 9 2 Radiative transfer assumptions The propagation of electromagnetic energy through a stellar atmosphere -and its eventual release from it-is a significantly complex, non-linear, three-dimensional, and time-dependent problem where the properties of the whole atmosphere are involved. From deep layers up to the stellar surface, the coupling between the radiation field and the atmospheric matter implies non-local effects that can connect different parts of the atmosphere. In other words, the state of matter and radiation at a given depth may depend on that at the other layers: light emitted at one point can be absorbed or scattered at another to release part or all of its energy.
The description of the whole system, matter plus radiation field, needs to resort to the solution of the coupled equations that describe the physical state of the atomic system and that of the radiation traveling through it. Therefore, we have to simultaneously solve the so-called statistical equilibrium equations and the radiative transfer equation. The first assumption we shall make is that radiative transfer is one dimensional; that is, that the transfer of radiative energy perpendicular to the line of sight can be neglected in the matter-radiation coupling. For most solar applications so far, this assumption has been seen to be valid. Since the purpose of this paper is not directly related to either of the two systems of equations, let us simply point out what their main characteristics and ingredients are, and how the whole problem can be simplified in different situations. We refer the interested reader to the book by Landi Degl'Innocenti and Landolfi (2004) for a full and rigorous account of all the details.
Most classical radiative transfer descriptions in the literature do not deal with polarization. They are typically qualified as radiative transfer studies for unpolarized light but the name is ill-chosen. Formally speaking, those analyses are for light traveling through homogeneous and isotropic media (Del Toro Iniesta, 2003b). As a consequence of that heritage, the community is used to speak about atomic level populations either calculated through the Boltzmann and Saha equations (the LTE approximation; see Sect. 2.2) or not (the non-LTE case; see Section 2.1). These isotropic descriptions of the transfer problem, however, are not valid when a physical agent such as a vector magnetic field establishes a preferential direction in the medium, hence breaking the isotropy. Moreover, the outer layers of a star are a clear source of symmetry breaking. The exponential density decrease with height makes the radiation field anisotropic: outward opacity is much smaller than inward opacity. This should also be the case with collisions between particles: they are more probable at the bottom than at the top of the atmosphere. In such a situation, the probability is not zero for the various degenerate levels of the atom (with respect to energy) to be not evenly populated and for non-zero coherences or phase relations between them to exist. The atomic system is then said to be polarized and its state is best described with the so-called density operator, ρ, that provides the probabilities of the sublevels being populated (hence the populations) along with the possible correlations or interferences between every pair. In the standard representation that uses the eigenvectors of the total angular momentum, J 2 , and of its third component, J z , as a basis, the density matrix element represents the coherence or phase interference between the different magnetic sublevels characterized by their angular momentum quantum numbers. In Eq.
(2), α and α ′ stand for supplementary quantum numbers relative to those operators that commute with J 2 and J z . Certainly, the diagonal matrix elements ρ α (jm, jm) ≡ ρ(αjm, αjm) represent the populations of the magnetic sublevels and the sum accounts for the total population of the level characterized by the j quantum number.
At all depths in the atmosphere, evolution equations for these density matrix elements have to be formulated that describe their time (t) variations due to the transport of radiation, on the one hand, and to collisions among particles on the other. All interactions with light -namely, pure absorption (A), spontaneous emission (E), and stimulated emission (S)-have to be considered. All kinds of collisions -namely, inelastic (I), superelastic (S), and elastic (E) collisions-have to be taken into account. Inelastic collisions induce transitions between any level |αjm and an upper level |α u j u m u with a consequent loss in kinetic energy. Superelastic collisions induce transitions to a lower energy level |α l j l m l with an increase in the kinetic energy of collision. Finally, elastic collisions induce transitions between degenerate levels |αjm and |αjm ′ ; in these, the colliding particle keeps its energy during the interaction. The statistical equilibrium equations (4) and (5) that follow for radiative and collisional interactions, respectively, have slightly different application ranges. The former are valid for the multi-term atom representation and can even be used in the Paschen-Back regime, while the latter are only valid for the special case of the multi-level atom representation (although they can be generalized to the multi-term representation). 2 We make them explicit here for illustrative purposes only and refer the interested reader to the Landi Degl'Innocenti and Landolfi's (2004) monograph for details. According to that work, the radiative interaction equations in the magnetic field reference frame 3 can be written as where ν α (jm, j ′ m ′ ) is the frequency difference between the two sublevels and the T 's and R's are radiative rates of coherence transfer and relaxation among the sublevels, respectively. Now, the collisional interactions give where the C's are collisional transfer rates between levels and the X's are relaxation rates. The indices refer to the corresponding type of collisions and the asterisk denotes the complex conjugate.
With the standard notation for the Stokes pseudo-vector I ≡ (I, Q, U, V ) T , where index T stands for the transpose, the radiative transfer equation can be written as (e.g., Del Toro Iniesta, 2003b) dI where τ c is the optical depth at the continuum wavelength, K stands for the propagation matrix, and S is the so-called source function vector. Since the continuum spectrum of radiation can safely be assumed flat within the wavelength span of a spectral line and non-polarized as far as currently reachable polarimetric accuracies are concerned, the optical depth, defined as is the natural length scale for radiative transfer. Note that the origin of optical depth (τ c = 0) coincides with the outermost boundary of geometrical distances (s lim ) and is taken where the observer is located so that τ c 's are actual depths in the atmosphere. In Eq. (7), χ cont is the continuum absorption coefficient (the fraction of incoming electromagnetic energy withdrawn from the radiation field per unit of length through continuum formation processes). The propagation matrix deals with absorption (withdrawal of the same amount of energy from all polarization states), pleochroism (differential absorption for the various polarization states), and dispersion (transfer among the various polarization states). The product of K and S accounts for emission. The RTE can then be considered as a conservation equation: the energy and polarization state of light at a given point in the atmosphere can only vary because of emission, absorption, pleochroism, and dispersion. Equation (6) is strictly valid only under the assumption that the energy and polarization state of light are independent of time. To be more specific, we have assumed that the rate of change of the Stokes parameter profiles is much slower than the radiative and collisional relaxation time scales involved in the problem. A formal solution to the general RTE was proposed for the first time by Landi Degl'Innocenti and Landi Degl'Innocenti (1985), according to whom, the observed Stokes profiles at the observer's optical depth (τ c = 0) read where O is the so-called evolution operator, and a semi-infinite atmosphere has been assumed as usual. The solution is called formal because it is not a real solution as long as the evolution operator (and the propagation matrix and the source function vector) are not known. Unfortunately, no easy analytical expression can in general be found for O. Only in some particular cases, such as that in Sect. 2.3, can a compact form for the evolution operator and an analytic solution of the RTE be obtained. In all other cases, numerical evaluations of O and solutions of the transfer equation are necessary. The emergent Stokes spectrum is obtained through an integral of a product of three terms all over the whole atmosphere. Claiming that some of the Stokes parameters are proportional to one of the matrix elements of K is, at the very least, adventurous. This proportionality can only take place in very special circumstances (e.g., Sections 2.4 and 3.1.2).

The non-local thermodynamic equilibrium problem
Being a vector differential equation, the RTE should indeed be considered as a set of four coupled differential equations. These can only be solved independently in specific media, either isotropic or very simplified ones. But the situation is far more complicated since both K and S depend on the material properties described by ρ α (jm, j ′ m ′ ), as well as on external fields such as a macroscopic velocity or a magnetic field. For their part, the radiative and collisional transfer and relaxation rates do depend on the radiation field. Therefore, Eqs. (4), (5), and (6) describe a very involved, non-local, non-linear problem, known as the non-local thermodynamic equilibrium (NLTE) problem and must be consistently solved altogether. The numerical solution of all those coupled equations requires iterative procedures that are summarized in Figure 2. By a model atmosphere we understand the set of thermodynamic variables (usually two, e.g., temperature and pressure, T and p), dynamic (the macroscopic, bulk line-of-sight velocity field, v LOS ), magnetic (the vector field B, represented by B, the strength, γ, the inclination with respect to the LOS, and ϕ, the azimuth), and possibly some other, ad hoc variables (such as the micro-and macroturbulence velocities, ξ mic and ξ mac , the filling factor, f -the area fraction of the resolution pixel that is filled with the unknown atmosphere-and so forth). All these variables have to be specified as functions of the optical depth. Numerically, that model can be represented by a vector x of np + r components, n being the number of depth grid points throughout the atmosphere, p the number of physical quantities varying with depth, and r the number of quantities that are assumed constant throughout the LOS. For example, one such model atmosphere would look like x ≡ [T (τ 1 ), T (τ 2 ), . . . , T (τ n ), p(τ 1 ), p(τ 2 ), . . . , p(τ n ), B(τ 1 ), B(τ 2 ), . . . , B(τ n ), γ(τ 1 ), γ(τ 2 ), . . . , γ(τ n ), ϕ(τ 1 ), ϕ(τ 2 ), . . . , where we have assumed specifically that both micro-and macroturbulence (as well as the filling factor) are constant with depth. This assumption is based on the fact that experience teaches that the increase in spatial resolution reached with new instruments makes less and less necessary the use of such ad hoc parameters. Once this model atmosphere is set, the necessary ingredients for the RTE and the statistical equilibrium equations can be calculated. The solution of the RTE has to be compared with that coming from it after modification driven by the new density matrix elements resulting from the solution of the statistical equations. If the differences are considered small compared with a given threshold, then a new synthetic set of Stokes parameters has been found. If not, the equilibrium equations have to be modified in order to iterate the procedure until convergence is reached. The direct problem of obtaining the Stokes spectrum of a given line coming out from a given model atmosphere then turns out to be very complex. It cannot always be computed with the necessary speed and accuracy. Approximations are, thus, in order.

The local thermodynamic equilibrium approximation
Imagine now that coherences among the Zeeman sublevels can be neglected, and that all of them are evenly populated. That is, assume that where δ is Kronecker's delta. In such conditions, n j = (2j + 1)ρ αj . Assume also that n j and the population of other ionic species can be evaluated through the equations of thermodynamic equilibrium at the local temperature (the Boltzmann and Saha laws; e.g., Gray, 2005). This assumption will be valid only in the case that the photon mean free path (ℓ = 1/χ cont ) 4 is small compared to the scale of variation of the physical quantities, i.e., when the atomic populations depend only upon the values of the local physical quantities. Besides, it can be shown that if Kirchoff's law is further assumed, (e.g., Landi Degl'Innocenti and Landolfi, 2004) the source function vector reduces to where, B ν (T ) is the Planck function at the local temperature. These are the conditions of the so-called local thermodynamic equilibrium approximation (LTE) and have automatically decoupled the RTE from the material equations. Then, if LTE can be supposed for a given spectral line, the synthesis of its Stokes profiles simplifies significantly because iterative procedures are no longer needed. This is graphically explained in Figure 3. In some circumstances, it may be useful to relax the fulfillment of the Boltzman law and, instead, admit that ρ αj deviate from the LTE values,ρ αj , so that are departure coefficients that measure how far the conditions are from LTE. Thus, although radiative transfer remains with the LTE scheme sketched in Fig. 3, the second block is affected by Eq. (12) and the β's are needed to calculate the level populations. As we are going to see, this departure-coefficient approximation can be very useful for formulating NLTE inversion procedures (see Section 7.2.3).

The Milne-Eddington approximation
An even more simplified approximation is obtained by further assuming that thermodynamics is sufficiently described with a source function that depends linearly on the continuum optical depth, where e 0 ≡ (1, 0, 0, 0) T , and that the other physical quantities (B, v LOS , etc.) in the model are constant throughout the atmosphere, hence defining a constant K. Figure 4 shows the LTE source function (the first component of the vector in Eq. 11) at 525 nm for several realistic model  Solanki (1986, blue line), and the quiet-Sun models by Gingerich et al. (1971, yellow line) and Vernazza et al. (1981, green line). The hypothesis of linearity does not seem very accurate for all the models. Nevertheless, in spite of its seemingly unrealistic nature, when we are dealing with a weak spectral line, the optical depth interval at which the line is sensitive to the atmospheric quantities is usually small enough to consider that a linear source function is not a bad approximation. There is wide experience in showing how useful the ME approximation is for inferring average values of the magnetic field vector and the LOS velocity, starting with the paper by Skumanich and Lites (1987, for a check with other approaches see Westendorp Plaza et al., 1998). The key point is that the RTE has an analytic solution (Stokes I at τ c = 0) under these assumptions (e.g., Del Toro Iniesta, 2003b): The analytic character of the solution helps in grasping many of the relevant features in line formation; it cannot reproduce Stokes line asymmetries, 5 though (Auer and Heasley, 1978). Using this useful feature, Landi Degl'Innocenti and Landi Degl'Innocenti (1985) had the clever idea of tailoring the functional shape of the source function so that it might be used to synthesize chromospheric line profiles while preserving an analytic solution because of the constancy with depth of the propagation matrix. Atomic polarization is neglected in this modeling. The so-called "field-free approximation" is assumed. The latter grants substitution of the scalar components of the source function for those corresponding to the same atom in the absence of a magnetic field (Rees, 1969).
Later on, Lites et al. (1988) elaborated Landi Degl'Innocenti and Landi Degl'Innocenti's idea and proposed a new source function that was incorporated into their inversion code to interpret the observed profiles of the Mg i b lines at 517.27 and 518.36 nm. Specifically, they wrote the RTE in terms of the line center optical depth, τ 0 , which remains the same as in Eq. (6) but substituting K by K ′ ≡ r 0 K, where r 0 is the continuum-to-line absorption coefficient ratio and with a new source function S ′ that follows from two distinct continuum and line source functions given by where S is defined in Eq. (13). The exponential shape of the last two terms in S lin tries to mimic the consequences in the source function of the actual chromospheric rise of temperature. The A's and ε's are free parameters that can be tuned to fit the observed profiles. With this formulation, the analytic solution of the transfer equation (at τ 0 = 0) turns out to be where 1l stands for the identity 4 × 4 matrix. 6 Further exploiting the analytic character of the Milne-Eddington solution, slight modifications in the assumptions were also suggested by Landolfi and Landi Degl'Innocenti (1996) to deal with small velocity gradients and even with discontinuities along the LOS. In summary, we can say that approximations to the RTE predicated on keeping the K matrix constant or almost constant are useful and still a field for exploitation in observational work.

The weak-field approximation
A further simplification of radiative transfer is sometimes used. When the magnetic field can be assumed constant with depth and weak enough, the resulting Stokes V profile of many lines turns out to be proportional to the longitudinal component of the field, regardless of the remaining physical quantities (see Section 3.1.2). Under this assumption (and for not extremely weak fields since linear polarization is zero to first order approximation), the ratio between Stokes U and Q is proportional to the tangent of twice the field azimuth. The weakness of the field is guaranteed provided that (e.g., Landi Degl'Innocenti and Landolfi, 2004) where g eff is the effective Landé factor of the line, ∆λ B is the Zeeman splitting, and ∆λ D is the Doppler width of the line. The effective Landé factor is given by where g u and g l are the Landé factors of the upper and lower level of the transition, respectively. In LS coupling, those factors are functions of the quantum numbers: The Zeeman splitting is given by where λ 0 is the central, rest wavelength of the line, e 0 and m are the charge and mass of the electron, B is the magnetic field strength, and c stands for the speed of light. For its part, the Doppler width is given by where T is the temperature, k is the Boltzmann constant, and m a is the mass of the atom. From a formal point of view, Eq. (17) is a good conditioning inequality. However, in practical terms, one should establish what is meant by much less than 1. This is addressed in Section 3.1.2 but we can be sure that the wider the line, the more the weak-field approximation applies. Hence, broad chromospheric lines are good candidates for using it. One of the first attempts at measuring a magnetic field with a chromospheric line, known to the authors of this review, was carried out as early as 1990 by Martínez Pillet et al. who (photographically) observed Stokes I and V profiles of the Ca ii H line and interpreted them in terms of the weak-field approximation. This approach remains useful as interest in the chromosphere increases (e.g., de la Cruz Rodríguez et al., 2013).

The MISMA hypothesis
Driven by the ubiquitous appearance of Stokes profile asymmetries in observations, Landi Degl'Innocenti (1994) suggested considering the atmospheric physical quantities, instead of deterministic stratifications, to have stochastic distributions about mean values with possible correlation effects among them. Assuming that the source function nevertheless varies linearly with depth through the whole atmosphere and that the propagation matrix stays constant at the spatial scale of each of the realizations of such a common stochastic distribution, he found an analytic solution for the transfer equation. Certainly inspired by the Landi Degl'Innocenti's proposal, Sánchez Almeida et al. (1996) put forward a new approach. Realizing that the wavelength symmetries in the propagation matrix elements do indeed avoid such Stokes profile asymmetries in the absence of LOS velocity gradients in the regular formulation of the transfer problem (Landi Degl'Innocenti, 1992), they proposed that the solar atmosphere may be pervaded by MIcro-Structured Magnetic Atmospheres (MISMAs). The hypothesis implies a highly inhomogeneous atmosphere at scales much smaller than the photon mean free path whereby the integration of Eq. (6) turns out to be very difficult. An alternative formulation is thus in order by locally averaging the propagation matrix and the emission vector. The resulting equation reads dI ds It formally looks very much like the regular RTE but is formulated in terms of geometrical distances, and the averages are taken over a distance ∆s that may vary along the optical path. The distance ∆s is supposed to be still smaller than ℓ for Stokes I to be assumed constant within its range. In addition, the averages are considered to vary smoothly along the line of sight. With all these assumptions, Eq. (22) is formally the same as Eq. (6). All the mathematical tools developed to solve the latter can be used to find a solution to the former. This is so despite the (numerically) inconvenient formulation in terms of geometrical distances: it requires either non-equally-spaced grid points or an increase in computation time. The good news is that, since correlations may exist among the physical parameters of the microstructures, the symmetry properties of matrix K ′ are automatically destroyed. Hence, asymmetric Stokes profiles can appear naturally.

Degrees of approximation in the model atmospheres
Provided that physical atmospheric quantities are bounded functions of the optical depth, we can safely expect that they are either continuous or have some jump (Heaviside-like) discontinuities throughout the line formation region. Therefore, except for the discontinuity points, a Taylor expansion approximation seems simple and sensible. The good feature of Taylor expansions is that you can keep them at a given order of approximation that can be subsequently increased if needed. The sequential approach is of great help in following the principle of Occam's razor -lex parsimoniae-which, in our opinion, should prevail in the interpretational work. The question arises as to whether an order of approximation is useful or whether it should be increased to give account of the observations. The answer must be found in the degree of accuracy with which we are trying to reproduce the observables. Hence, it has to do with the balance between the signal and the noise: if the next order of approximation only introduces variations that are below, say, three times the rms noise, σ, then its use is discouraged. If, on the contrary, the difference between the observed and synthetic profiles is greater than 3σ, its use may be advisable. 7 Let us postpone the discussion to the following sections and present here the various atmospheres we are considering. We start with the zeroth order approximation and assume that physical quantities are constant with depth to continue with gradients, higher order variations, and jumps or discontinuities. 7 By adopting σ as a measure of noise we are assuming that the noise statistics is Gaussian and this seems a common and sensible assumption as well. Requiring signals to be larger than 3σ, therefore, implies more than 99.7 % certainty in the detection. We refer the reader to Del Toro Iniesta and Martínez Pillet (2012) for a discussion on polarimetric accuracy and signal-to-noise ratio. For Bayesian selection among model atmospheres, see Asensio Ramos et al. (2012).

Constant physical quantities
Let us distinguish among three possibilities, namely, the Milne-Eddington approximation, the weak-field approximation, and an atmosphere where B and v LOS are constant but where thermodynamics is properly accounted for with a realistic stratification of temperature. 8

The Milne-Eddington atmosphere
As commented on in Sect. 2.3, a Milne-Eddington atmosphere provides an analytic solution to the RTE. With nine parameters, the Stokes profiles of a spectral line can be synthesized. The model parameters are the three components of the magnetic field, B, γ, and ϕ, the LOS velocity, v LOS , and the so-called thermodynamic parameters: the line-to-continuum absorption coefficient, η 0 (= 1/r 0 ), the Doppler width of the line, ∆λ D , the damping parameter, a, and the two coefficients for the source function, S 0 and S 1 . The actual values of η 0 , ∆λ D , and a may vary significantly throughout the atmosphere. Therefore, assigning one single value for each may be, say, risky. Experience, however, indicates that this is possible. Reasonable fits to actual data can be obtained with this approximation and we can even understand the relationship between the single-valued parameters and their actual stratification (Westendorp Plaza et al., 1998). Only Stokes profiles with definite symmetry properties can be formed in an ME atmosphere. Stokes I, Q, and U are even functions of wavelength while Stokes V is odd. This is a consequence of the absence of velocity gradients (Auer and Heasley, 1978) and will be discussed later in Section 5. Figure 5 shows two examples of ME profiles corresponding to the Fe i line at 617.3 nm as observed with an instrument whose (Gaussian) spectral profile (point spread function, PSF) has a full width at half maximum (FWHM) of 6 pm. The thermodynamic model parameters are η 0 = 5.06, ∆λ D = 2.6 pm, a = 0.22, S 0 = 0.1, and S 1 = 0.9; they come from a fit to the FTS spectrum (Kurucz et al., 1984;Brault and Neckel, 1987). The magnetic inclination and azimuth are both equal to 30 • ; B = 1200 G for the black lines and 200 G for the red ones.

The weak-field atmosphere
As stated in Sect. 2.4, when B is constant with depth and very weak, then the Stokes V profile turns out to be proportional to the longitudinal component of the magnetic field independently of the remaining quantities. It can be shown (e.g., Landi Degl'Innocenti and Landolfi, 2004) that where I nm is the non-magnetic Stokes I profile, corresponding to the line in the absence of a magnetic field. Equation (24) has been key for many magnetic inferences. In fact, written as V = CB , it is known as the magnetographic equation since it provides a calibration of the magnetographic signal. When magnetographs used only one or two wavelength samples of the circular polarization, the magnetographic equation was indeed the only means of obtaining estimates of the component of the magnetic field along the line of sight. Nowadays, with modern magnetographs providing more samples in all four Stokes parameters, that equation is still useful for morphological, qualitative estimates but cannot be trusted everywhere and under all circumstances. The modern way to evaluate C indeed implies some radiative transfer calculations in given model atmospheres (e.g., Martínez Pillet et al., 2011), and these calculations readily show that the approximation saturates at low magnetic field strengths. In the left panel of Fig. 6, we plot the maximum of the Stokes V profile as a function of the field strength (the field is along the LOS, γ = 0 • ) with an instrumental profile FWHM of 6 pm (asterisks) and of 8.8 pm (diamonds). In solid lines, the linear (red) and quadratic (blue) fits are also shown. Only strengths up to 600 G are plotted because the relationship is evidently nonlinear above that threshold. For weaker fields, it is apparent that the instrumental broadening of the profiles helps linearity to hold as differences between the linear and quadratic fits are smaller for the broader PSF. Those differences are for most of the points above 3 · 10 −3 I c ; that is, more than 3σ, with σ being the noise level of the polarization continuum signal of typical observations. Such differences are clearly detectable by current means. Hence, the approximation loses validity for yet weak fields. Deviations from linearity are even clearer if one sees the green lines in the figure, which correspond to linear fits including only data points for which B is less than 200 G. In our example, the weak field approximation for the Stokes V peaks breaks down at fields stronger than 300 G with a FWHM of 6 pm and stronger than 400 G with a FWHM of 8.8 pm. Certainly, if the instrument has a narrower spectral PSF or if the noise is smaller, the approximation fails earlier. The approximation clearly worked better for older instruments. Further arguments can be supplied for the user to be cautious about weak field assumptions with typical, visible photospheric lines. The first one is that Eq. (24) is hardly applicable, as shown in Fig. 7, not only because Stokes V does not follow it but because Stokes I deviates from I nm even sooner (and, up to first order, I = I nm must hold for Eq. (24) to be valid; e.g., Landi Degl'Innocenti and Landolfi, 2004). In the left column of the figure, the differences between the left-hand and the right-hand members of the equation are plotted. Colors correspond to 600 G (black), 500 G (red), 400 G (blue), 300 G (green), 200 G (purple), and 100 G (dark green). The dashed, horizontal purple lines mark the 3σ level. The upper rows are for a FWHM of 6 pm and the bottom row is for a FWHM of 8.8 pm. The plots in the left column are of course consistent with the results from Figure 6. Those in the right column are illustrative of how Stokes I varies with the magnetic field strength. Differences between the various profiles can easily be discerned above the 3σ level. When the profiles themselves are affected by noise, unlike in these plots, detecting the differences may be more difficult but the message is clear: contrary to the common belief, the Stokes V profile is not the only tool for estimating the longitudinal component of weak magnetic fields; Stokes I helps a lot and should not be forgotten.
The second argument concerns the diagnostic capability for typical lines to disentangle B from γ in the weak-field regime. Most statements about the only accurate retrieval to be the longitudinal magnetic field component are based on Eq. (24), as if it were the only available tool from radiative transfer. Stokes profiles other than V are often obliterated. It is easy to understand (e.g., Landi Degl'Innocenti and Landolfi, 2004), however, that the mere deviations between I and I nm we have seen in Fig. 7 should imply the appearance of linear polarization signals (provided that the inclination is different from zero): such Stokes I deviations from I nm are second order terms in an expansion of all four Stokes profiles. 9 At second order, Stokes Q and U are no longer zero (or below the noise) either and start to provide additional information. It can also be proven (e.g., Landi Degl'Innocenti and Landolfi, 2004) that Q ∝ B 2 sin 2 γ, as shown in the right panel of Fig. 6, where the maximum of Stokes Q is plotted against B 2 for a field that is inclined 45 • with respect to the vertical. 10 Here, deviations between linear and quadratic fits are smaller than for the V case (note that the Y scale is an order of magnitude smaller) but the interesting point is that, above B = 200 G, linear polarization signals begin to be larger than 3σ and, hence, detectable.  Figure 7: Differences between the Stokes V profile and its weak-field approximation (left column) and differences between the Stokes I profile and that for a zero field strength. Colors indicate values of the longitudinal component of the field. The dashed horizontal lines mark the 3σ level of typical, modern observations. The upper row is for a FWHM of 6 pm and the bottom one for a FWHM of 8.8 pm. Colors correspond to 600 G (black), 500 G (red), 400 G (blue), 300 G (green), 200 G (purple), and 100 G (dark green).
A third argument we want to bring to the reader's attention is related to the common belief that weak fields are hardly distinguished from strong fields (say above 1 kG) with a filling factor significantly smaller than 1. We will return to this issue in Secs. 6.2 and 8.2, as the problem has already been discussed in the literature (e.g., Del Toro Iniesta et al., 2010). Let us only mention here that the loss of linearity of Stokes V above, say, 400 G and, most importantly, the behavior of Stokes I are reasons enough for the two types of atmospheres to be distinguished by observational means.
If Eq. (24) were universally accepted, then it would indicate that the RTE is almost useless since the emergent profile is proportional to one of the matrix elements of K. Elementary mathematics readily explain that this is not possible except for, perhaps, a small value range of fields. In summary, we must acknowledge that Stokes V is not proportional to the longitudinal component of the magnetic field.

Constant vector magnetic field and LOS velocity
There is still a third option to deal with constant B and v LOS . Imagine that the atmosphere is a regular one as far as thermodynamics is concerned but where the magnetic and dynamic quantities do not vary with depth. Since the propagation matrix is no longer constant, no analytic solution of the RTE is available. 11 One is then led to use numerical techniques to synthesize the spectrum. The atmosphere, however, is greatly simplified since the number of parameters is reduced. This can be very helpful for quicker analyses of the data or as a makeshift for more elaborate subsequent approaches that include variations of B and v LOS with the optical depth. This is the approximation used for the first version of the SPINOR code (Solanki et al., 1992a) or as an option in the SIR code (Ruiz Cobo and Del Toro Iniesta, 1992).

Physical quantities varying with depth
The community has gathered a great deal of evidence about variations of B and v LOS along the optical path everywhere over the solar disk. In addition, physical laws such as those of magnetic flux and mass conservations demand that these quantities vary with optical depth in a number of structures. The approximations in the former subsections cannot then be considered but as first-step approaches or simplified descriptions of reality. In any case, we can safely assume that stratifications of the physical quantities are bounded functions of τ c (or whichever variable parameterizing the optical path), as we admitted in the beginning of this Section.
A historical landmark for the full acknowledgement of LOS velocity gradients from an observational point of view was established by the discovery by Mickey and Orrall (1974) and Illing et al. (1974aIlling et al. ( ,b, 1975) of a broadband circular polarization in sunspots. The true explanation was already suggested in the last of those papers, although schematically founded on the assumption of two slabs with different velocity and magnetic field strengths. The broadband observations were soon related to spectral line net circular polarization (the integral of the Stokes V profile over the wavelength span of the line): Grigorjev and Katz (1975) computed all four Stokes profiles in the presence of an LOS velocity gradient and certainly obtained asymmetric profiles; later on, Auer and Heasley (1978) demonstrated that a necessary and sufficient condition for such a net circular polarization had to be found in velocity gradients along the line of sight, although they were neglecting magneto-optical effects. Rigorous derivations (including dispersion effects) have later been obtained and can be found, for example, in the elegant work by Landi Degl'Innocenti and Landi Degl'Innocenti (1981). The symmetry properties of the propagation matrix elements predict no net circular polarization (or Stokes V area asymmetry) in the absence of an LOS velocity gradient. Other mechanisms such as insufficient spatial resolution that implies mixtures of individual atmospheres within a pixel, may produce asymmetries in the peaks (the so-called amplitude asymmetries) but the integral of V will remain zero. Therefore, any net circular polarization is unambiguous observational evidence for the presence of velocity gradients. And Stokes V area asymmetries are observed practically everywhere. Unfortunately, no such unambiguous evidence exists for the presence of magnetic field gradients, although we know on physical grounds there are plenty of them, such as those through magnetic canopies where a magnetic layer is overlaying a non-magnetic one.

Parameterizing the stratifications
Among the numerical codes relevant to this review (see Sect. 7) there are some that acknowledge variations of B and v LOS . We deal here with what might be called "normal" or "regular" stratifications, such as those employed by Ruiz Cobo and Del Toro Iniesta (1992), Frutiger and Solanki (1998), Socas-Navarro et al. (2000), and Socas-Navarro (2001), and leave some others, devoted to specific solar features, to the following paragraphs.
Since the number of depth grid points used for the numerical integration of the RTE can be high, it may be advisable to reduce the degrees of freedom of the variations with depth of the physical quantities. As commented on above, a reasonable approach would be to follow higher order polynomials in a stepwise form. From constant values to linear, parabolic, third-order polynomial dependences, and so on. Then, if we assume, for instance, that v LOS is linear with τ c , we only need to specify the velocity at two grid points (nodes in SIR's terminology) and three if it is parabolic, hence reducing the number of free parameters of the model. We do not need to specify T , B, and v LOS at every single point we use for solving the RTE but only at a few of them. We shall see in Sect. 7 that one can go even further with this kind of approach and consider more involved optical depth dependences if necessary.

The MISMA atmosphere
As we explained in Sect. 2.5, the MISMA hypothesis guarantees the appearance of Stokes profile asymmetries but at the expense of introducing a significant number of extra free parameters. In fact, even in the simplest MISMA atmosphere , where all the micro-structures are described by ME atmospheres, one has in principle as many as ten free parameters per needed component (also known as micro-structure). To the nine regular ME parameters, the volume occupation fraction for each micro-structure must be added. In more complicated MISMAs, the number of parameters is even higher (Sánchez Almeida, 1997). Moreover, in spite of the very detailed physical description where equilibrium equations are required for slender flux tubes, the inclination and azimuth of the magnetic field are kept constant throughout the whole atmosphere, which does not seem very realistic. (Canopies are found almost everywhere owing to the fanning out of magnetic field lines with height.) Last, but not least, when the structuring of the atmosphere is established at sizes comparable to ℓ, the average propagation matrix does not result in an RTE as in Eq. (22), which is no longer valid. Modern observations with continuously increasing spatial resolution do indeed show this kind of structuring both in quiet and active regions and sunspots. For example, single magnetic flux tubes of approximately 150 km size have been fully resolved by Lagg et al. (2010); their evolution followed for half an hour by Requerey et al. (2014); and the internal structure of network magnetic structures revealed  with Sunrise/IMaX observations (Martínez Pillet et al., 2011;Barthol et al., 2011). In our opinion, the MISMA hypothesis, being a clever idea for producing asymmetries, is advisable as a "when-all-else-fails" atmosphere but there are yet conventional radiative transfer treatments that provide reasonable interpretation of the observations.

Other special atmospheres
This subsection is devoted to three special cases where the physical scenario envisaged to explain the observations requires a specific configuration that is not intended to be universally valid. Those specific configurations, however, help in interpreting the Stokes profiles emerging from given solar features.
Interlaced atmospheres Imagine that you can assume that your line of sight is piercing a number n of alternate boundaries {s i } i=1,...,n (s 1 < s 2 < . . . < s n ) between two distinct atmospheres, as when observing from a side two identical thin flux tubes that are close but not stuck to each other. In such a scenario, the structuring of the atmosphere is comparable in size with ℓ and, therefore, the MISMA hypothesis does not hold. If you happen to know the solution, I ±1 , of the RTE in each of the two atmospheres, labeled ±1, Del Toro Iniesta et al. (1995) found out that the formal solution to the problem is for any s ∈ [s n , s lim ], where the +1 atmosphere is assumed to be the outermost one, ∆I(s i ) ≡ I −1 (s i )−I +1 (s i ), and O ±1 are the evolution operators for both atmospheres (e.g. Landi Degl' Innocenti and Landi Degl'Innocenti, 1985). Equation (25) is at the root of the flux-tube inversion code by Bellot Rubio et al. (1997, see Bellot Rubio et al., 1996. A different treatment of discontinuities along the line of sight was proposed by Borrero et al. (2003) where the density of depth grid points is increased in the discontinuity neighborhood.
Atmospheres with Gaussian profiles The existence of net circular polarization in the penumbrae of sunspots was also the driver for Bellot Rubio (2003) to propose an implementation of the uncombed model by Solanki and Montavon (1993). The scenario is based on two components; namely, a magnetic component and a penumbral magnetic flux tube, the latter occupying a fractional area of the resolution element. The model parameters of the penumbral tube are built by Gaussian modifications (in depth) of those in the background. All the Gaussians have the same width and are located at the same depth, but their amplitudes depend on the specific model parameter, of course. With these premises, the SIR code was modified into the so-called SIRGAUS code, which has been used, among others by Jurčák et al. (2007), Jurčák and Bellot Rubio (2008) Atmospheres with jump discontinuities Discontinuities can be treated numerically by decreasing the depth grid step or by using Eq. (25). A specific implementation of such discontinuities was first used by Louis et al. (2009) for an analysis of sunspot light bridges. Like SIRGAUS, it is based on a modification of the SIR code to take this particular scenario into account. In it, two magnetic atmospheres coexist in the resolution pixel: a background atmosphere whereby B and v LOS are constant with depth, and another magnetic atmosphere where those quantities have a Heaviside-like discontinuity. This code (called SIRJUMP) has also been used in practice, e.g., by Martínez González et al. 24 Since the ultimate goal of inversions is the bona fide reproduction of observed profiles, an analysis of the properties of Stokes spectra as functions of the wavelength is in order. Such an analysis should be aimed at finding the most conspicuous characteristics of the profiles in order for these characteristics to be the best reproduced among all the features. In other words, if, for instance, a given Stokes I(λ) profile shows only small deviations from a Gaussian, we should aim to obtain the Gaussian that best simulates the profile and identify the model parameters responsible for this bulk behavior. In some cases we may be satisfied just with this "coarse", or not very detailed, description and leave small deviations or nuances to further, in-depth analysis that might even be carried out separately. As we are going to see, this approximation of incremental complexity for the profiles is well in line with the successive approximations we have described for the model atmospheres in Section 3. The Stokes Q, U , and V profiles and Stokes I in line depression; that is, as functions of x ≡ λ − λ 0 (where λ 0 is the central wavelength of the line), can be decomposed as sums of even and odd functions of x, as any other function defined over IR. 12 Specifically, if we call S(x) any one of the profiles, then where By construction, S + is even and S − is odd. 13 This parity property is very interesting because, as we have seen in former Sections, the Stokes profiles of any line formed in the absence of velocity gradients have definite symmetry (parity) properties. Since asymmetries in regular profiles are relatively small, that is, the profiles usually display a predominant parity character (even for Stokes I, Q, and U , and odd for Stokes V ), a sum of even and odd profiles may give account of the observed spectra as if the opposite parity component was indeed a perturbation related to velocity gradients. This can explain the success of ME inversion codes for fitting many observations (cf. Westendorp Plaza et al., 1998;Orozco Suárez et al., 2010). The ME atmosphere accounts for the main bulk of the observed Stokes profiles. In Fig. 8 we plot the differences among the Stokes profiles of the Fe i line at 617.3 nm as synthesized in two model atmospheres. Both have the HSRA (Gingerich et al., 1971) stratification of temperature with B = 1500 G and γ = ϕ = 30 • . One of the models has a constant v LOS = 1.87 km s −1 and the other a small gradient from v LOS = 2 km s −1 at the bottom of the atmosphere through 1.75 km s −1 at the top. Both have ξ mac = 1 km s −1 and have been convolved with the IMaX PSF. Note that these differential profiles display almost the opposite parity character to their corresponding Stokes profiles. A description with only S + or S − can thus provide a first approach analysis to a large number of observations. Indeed, S + should be good for I, Q, and U , and S − for V as the differences are smaller than our "nominal" noise of 10 −3 I c . This, of course, cannot always be the case. Very peculiar Stokes profiles are often observed as our polarization accuracy increases. For instance, Sigwarth et al. (1999) first reported the observation of one-lobed V profiles that were later studied in detail by Grossmann-Doerth et al. (2000) and Sigwarth (2001). Most of these profiles are found in the internetwork (e.g., Sainz Dalda et al., 2012).
A different description of the Stokes profiles as functions of wavelength was proposed by Rees et al. (2000) who suggested that they can be described as sums of given principal components or eigenprofiles. If those eigenprofiles are contained in a database and are properly selected, they can increasingly give account of the profile shapes just by increasing the number of principal components in the expansion. An example of such eigenprofiles is given in Fig. 9. By adding these components properly weighted, the corresponding Stokes profiles are synthesized. This is the basis for all the PCA inversion techniques presented so far and the concept is fairly simple.
A similar approach to that of PCA was proposed by Del Toro Iniesta and López Ariste (2003), based on the fact that Stokes I d , Q, U , and V belong to IL 2 , the space of square integrable functions over IR. Since IL 2 is a Hilbert space with a well defined scalar product, an exact, infinite expansion of the profiles is possible in terms of any of the several bases of the space. Among those basis systems, Del Toro Iniesta and López Ariste selected the family of Hermite functions, h n (x), because of the similarity between the shapes of the first few elements of the family and the Stokes profiles (see Figure 10). Somehow, the Hermite functions (see the aforementioned paper for a definition) provide a suitable basis for approximating the observed profiles with finite expansions of a few terms. Apart from their possible use in inversion codes that has not been investigated so far, the expansion of Stokes profiles in terms of Hermite functions has been used by Toussaint et al. (2012) for compressing observed Stokes profiles and by Harker (2012) for an automatic solar active region detection. The first authors, after expansion of the Stokes profiles, only keep the coefficients compressed with a conventional algorithm. This way they reduce the storage space by a factor 20 while keeping most of the information virtually noise free. The latter author discriminates the different active regions after looking at the complexity of the emerging Stokes profiles as described by their Hermite-function expansion coefficients.

A synthesis approach
As described in the introduction, an approximate knowledge as to how the Stokes profiles react to the various model parameters is advisable as it helps to select the adequate observables as "orthogonal" as possible. The ideal way to explore the diagnostic capabilities of Stokes profiles is by means of response functions (see Sect. 6). Tackling the problem head-on, that is, synthesizing the profiles in different model atmospheres, may help grasp basic ideas on the Stokes profile behavior, though. The idea is to study how the Stokes profiles vary when the model parameters are modified. This is the aim of this section.

Constant atmospheres
As we have been doing in the two previous sections, let us start with the easiest case of atmospheres that do not vary with optical depth and, specifically, with ME atmospheres, since their analytic solution of the RTE enables a quick numerical overview of the space of model parameters. Figure 11 shows Stokes I, Q, and V for the Fe i line at 617.3 nm with the same thermodynamic parameters used in Figs. 5 and 7. Since the linear polarization L 2 ≡ Q 2 + U 2 is rotationally invariant and ϕ is constant throughout the ME atmosphere, we are assuming to have selected the preferred reference frame where U is identically zero, so that L = Q. In practical terms we have selected ϕ = 0 • for all the profiles. The magnetic field strength is B = 300, 500, 900, and 1200 G for the four rows from top to bottom. The magnetic inclination is encoded in color: γ = 0 (dark green), 15 (purple), 30 (pink), 45 (green), 60 (blue), 75 (red), and 90 • (black). If we again assume a typical noise σ = 10 −3 I c , then the small differences in the core of Stokes I for B = 300 may not be detected and a neat distinction between γ = 0 and 15 • or γ = 90 and 75 • may hardly be reachable with Stokes Q and V at a 3σ level. Most certainly, however, there should not be any problem to distinguish between 0, 30, 60, and 90 • . Of course, the dependences of I and V on γ are significant enough when the field is stronger. One should not be restricted to longitudinal components even for this small field strength, and the situation improves further for lower noise levels. This is a well-known issue: in polarimetric observations we are photon starved, indeed as much as night-time astronomers may be for detecting very faint objects.  Figure 11: Stokes I, Q, and V as functions of the magnetic field inclination: dark green is for γ = 0, purple for 15, pink for 30, green for 45, blue for 60, red for 75, and black for 90 • . The magnetic field strength is different for the various rows: from top to bottom, B = 300, 500, 900, and 1200 G. The magnetic azimuth is identically zero.
The alternative way to gauge the sensitivity of the Stokes profiles to B and γ by synthesizing the profiles is shown in Figure 12. The rows correspond to different inclinations: γ = 15, 30, 60, and 75 • from top to bottom. The magnetic field strength is this time encoded in colors: B = 1200 (black), 900 (red), 500 (blue), 300 (green), and 200 G (yellow). This complementary view shows that the dependence on B is in fact stronger than on γ. Therefore, properly sampled Stokes profiles with enough polarimetric accuracy should be able to provide the required information to infer the magnetic field strength and inclination separately for most of the strength spectrum. Weaker fields will have bigger uncertainties for sure, but they should not imply a theoretical inability. As a matter of fact, the weaker the fields we want to explore, the smaller the noise we need in our observations, but this is somehow obvious.

Depth-dependent atmospheres
In the solar atmospheres, physical quantities do vary with depth. Acknowledging such variations almost always implies resorting to numerical solutions of the transfer equation. That was first done by Beckers (1969a,b). Numerical results by Staude (1969Staude ( , 1970 and by Wittmann (1971, see Fig. 13) soon appeared and numerical codes were described (e.g. Wittmann, 1974, Landi Degl'Innocenti, 1976. Those first numerical codes capable of synthesizing the Stokes profiles were based on the fourth-order Runge-Kutta algorithm that is very accurate at the price of being very computationally expensive. A generalization of the method by Feautrier (1964) to polarized light was proposed by Auer et al. (1977) and later modified by Rees and Murphy (1987) in order to take magnetooptical effects into account. A fast solution of the RTE, after being reformulated as an integral Volterra equation (first suggested by Staude, 1969), was formulated by Rees et al. (1989) with their so-called DELO method. An improvement in accuracy and computational speed was obtained by Bellot Rubio et al. (1998) with an Hermitian method based on developing the Stokes vector as a fourth-order polynomial with depth. In Fig. 14 we show an example of a synthesis of the same line as in Fig. 13, where clear asymmetries in wavelength in the Stokes profiles can be seen. According to Sections 3.2 and 4, such asymmetries have been produced by the variation with depth of physical quantities.
Learning how the various profile features of the four Stokes parameters depend on the many model parameters is certainly difficult and cannot be summarized in this paper. Experience, however, can train a researcher to be able to deduce -many times after a quick glance (the art )a specific stronger v LOS or B here or there in the atmosphere. The situation is therefore much more complicated than for the ME case and one should rely upon inversions.

MHD simulations
The advent of magnetohydrodynamic (MHD) simulations such as those by Vögler (2003), Vögler et al. (2005), and Rempel (2012) has opened a new window on the exploration of the solar photosphere. They have enabled calculations that may help to envision what is expected from observations and to interpret them. The simulations also provide predictions that can be confronted with them. The enrichment has been remarkable because the realistic atmospheres resulting from the simulations have physical quantities varying along the optical path without any a priori assumptions and may be closer to the actual Sun than other simplified atmospheres. MHD simulations have been used to test the reliability of inversion techniques de la Cruz Rodríguez et al., 2012). In the first of these works, a confirmation of the predictions by Sánchez  was found: if your inference technique assumes magnetic fields and velocities constant with depth and you use it on data coming from an atmosphere where these quantities are depth dependent, the result is just the average of the actual stratification weighted with the generalized response function to perturbations of that quantity. In the second of these papers, the non-LTE inversion code called NICOLE is tested. Here we report on preliminary results by Harker et al. (2016) to illustrate the role of simulations as a tool to determine the optimum wavelength sample of Stokes profiles. This is a particularly interesting topic that is very relevant to the observational (and hence interpretational) work. Are the available samples enough for capturing all the information encoded in the Stokes profiles? What is the optimum sampling one should use with a new instrument under development depending on the goals such an instrument aims to fulfill? Figure 15 shows the Stokes I (in depression), Q, U , and V profiles of the Fe i line at 630.25 nm across a slit over a sunspot simulation by Rempel (2012) (left column panels) and their corresponding power spectra (right column panels). The simulation contains the transition from the quiet Sun (at both sides of the X dimension) through the penumbra and the umbra of a sunspot.
The power spectra of Stokes V , Q, and U are wider than that of Stokes I as a natural consequence of their shapes. Therefore, a cut-off frequency is better found in the polarization profiles. In this example, Shanon's critical sampling interval is around 1.25 pm/pixel with the remarkable fact that no convolution with an instrumental PSF has been applied to the profiles. This value is coarser than that provided by several ground-based spectrographs with resolutions about R ≃ 10 6 that would be considered too fine for the required diagnostics. As soon as the profiles are observed by an instrument with a finite width PSF, the Nyquist frequency will shrink to smaller values. Hence, the critical sampling will be coarser. These kinds of calculations can therefore help in designing new instruments and, as far as this paper is concerned, in deciding the adequate spectral sampling for any synthesis or inversion code: if the sampling is too fine, we are wasting computational time; if the sampling is too coarse, we are neglecting available information.

Response functions
The needle in an old-fashioned ammeter is useful as far as it moves when a current circulates through the electric circuit. If, for instance, the needle does not move when a given very weak or very strong current passes through, then the ammeter is useless or insensitive to those intensities. Keeping this analogy, our observables, the Stokes profiles are useful for inferring given physical quantities as long as they change when those physical quantities vary. Of course, to be detectable, the change should be larger than the noise. Therefore, the correct question to ask of a given spectropolarimetric proxy whether it senses, for example, T , B, or v LOS is how much it modifies when T , B, or v LOS change. The preceding section has been an attempt in that direction: we have been changing the various atmospheric quantities and checking the modifications in the Stokes profiles. We have proceeded in the direct way, that is, through the solution of the differential RTE. This direct approach is not very useful in practice, though, as we already recognized in Section 5.2. Which quantity is to be modified first, at which optical depths, and by how much? If the problem is the simplest we talked about in the introduction (that of measuring velocities from the line core wavelength), then there may be some room for the direct approach. If not, the diagnostic capabilities of the Stokes profiles have to be further explored with the final goal of proceeding the inverse way, that is, of solving the integral equation known as the formal solution of the RTE (Eq. 8).
Here, we have a difficult problem where the observables depend nonlinearly on the unknowns. The nonlinear character is clear: the observables -on the left-hand side of Eq. (8)-are equal to an integral of the product of three terms, each depending strongly non-linearly on the physical quantities that characterize the model atmosphere (e.g., Del Toro Iniesta, 2003b). Changes in the Stokes spectrum are then very difficult to predict when modifications in the physical parameters occur. As in many other branches of physics, the diagnostic tools come out trough a linearization analysis. We can assume, for instance, that in a very special regime, when perturbations are small enough, changes occur linearly. These are the basics of linearization that, in the realm of solar physics were introduced for non-polarized light by Mein (1971, see also Canfield, 1976, and Caccin et al., 1977 through the so-called weighting functions, although the name of response functions (RFs) did not appear in the literature until the work by Beckers and Milkey (1975). Since polarization was not taken into account, those analyses were only strictly valid for isotropic media or, as far as we are concerned, for non-magnetic atmospheres. Response functions were introduced within polarized radiative transfer by the brothers Landi Degl'Innocenti and Landi Degl'Innocenti (1977) but RFs were paid little attention to until the works by Landi Degl'Innocenti and Landolfi (1982Landolfi ( , 1983, Grossmann-Doerth et al. (1988), Sánchez Almeida (1992), and Ruiz Cobo and Del Toro Iniesta (1992,1994). The latter found that the perturbations δx i exerted to the p + r atmospheric quantities (p of them varying with height and r constant) induce modifications δI(0) to the observed Stokes profile given by where Therefore, the modification of I(0) is given by a sum of terms, each related to one of the atmospheric quantities characterizing the medium. The terms are integrals over the whole atmosphere of the model atmospheric quantities weighted by the RFs. The physical meaning in Eq. (29) is straightforward: imagine that we change only a given quantity (T , B, v LOS , or any other) with a magnitude unity (i.e., 1 K, 1 G, 1 km s −1 , etc.) in the narrow surroundings of a given continuum optical depth τ 0 ; then, the subsequent modification in the emergent Stokes spectrum is just the value of the corresponding RF at that optical depth: Then, since the Stokes profiles are usually recorded normalized to some reference value (e.g., the average, -unpolarized-continuum intensity of the quiet Sun), units for RFs are inverse units of the corresponding quantity. That is, the response function to perturbations of temperature is measured in K −1 ; the response to perturbations in the magnetic field strength is measured in G −1 ; and so on. Thus, a response function can be defined as the modification that the Stokes spectrum experiences when the medium undergoes a unit perturbation of a given physical quantity at a given very narrow region in optical depth. Equation (30) tells us that these modifications build upon the variations of the propagation matrix and the source function vector with respect to the physical quantities and their evolution through the atmosphere as driven by the evolution operator. The two variations have an opposite sign. This means that they are somehow competing as one could expect. While S represents the sources of photons, K represents the sinks. Indeed we know that we do not only speak about photon removal but also about pleochroism and dispersion but, certainly, the propagation matrix role is somehow similar to that of a withdrawal. The counterbalancing between S and K is very important to understanding radiative transfer because some analyses forget it and only account for the effects of K (absorption in the non-polarized case). This was clearly pointed out and explained by Ruiz Cobo and Del Toro Iniesta (1994).
Equation (29) suggests that RFs play the role of partial derivatives of the observed Stokes profiles with respect to the atmospheric quantities once they have been discretized. This role is even more clear when we go down to the real world of a quadrature formula for that equation. Model atmospheres are usually described numerically by a grid of points that are spaced in logarithmic optical depth. Let ∆(log τ c ) be that spacing. If we call x i,j ≡ x i (τ j ) and R i,j ≡ R i (τ j ), then Eq. (29) can be written as where a j = ∆(log τ c ) ln 10 c j τ j , with c j being the quadrature coefficients. Therefore, if we include a j in the RFs, as one usually does in graphical representations, then Eq. (32) shows the Stokes spectrum modifications as linear expansions of the new variables x i,j and x k . The first term on the right-hand side corresponds to those physical quantities that vary with depth; the second stands for those that are assumed to be constant. 14 In summary, we can say that RFs are indeed partial derivatives of I(0) with respect to the (numerical) atmospheric parameters and, thus, they directly provide the sensitivities of the Stokes spectrum to perturbations of the physical conditions in the medium. Examples of these RFs are plotted in Figs. 16, 17, 18, and 19. They have been evaluated for Stokes I, Q, U , and V , respectively, of the Fe i line at 630.25 nm to perturbations of the temperature (top row panels), of the magnetic field strength (middle row panels), and of the LOS velocity (bottom row panels). The RF values are multiplied by 10 6 K −1 , 10 6 G −1 , and 10 4 (km s −1 ) −1 . The two columns correspond to two different model atmospheres. That in the left-hand columns has the temperature stratification of the HSRA model (Gingerich et al., 1971), a constant B = 2000 G, γ = 30 • , and ϕ = 60 • ; the plasma is at rest in this model. That in the right-hand columns has a 500 K cooler temperature, and a magnetic field 500 G weaker, 20 • less inclined, and an azimuth of 10 • ; v LOS = 1.58 + 0.3 log τ c (km s −1 ). Equation (32) hints at a way for calculating RFs through what could be called the brute force method. This method is a four-step procedure: 1) synthesis of the Stokes spectrum in a given model atmosphere; 2) perturbation of just one of the (numerical) atmospheric parameters by a small amount and synthesis of the spectrum in the new model atmosphere; 3) calculation of the ratio between the difference of the two spectra and the perturbation; 4) repetition of steps 2) and 3) for each optical depth, for each wavelength sample, and for the remaining Stokes parameters. This is a formidable calculation as soon as the number of free parameters is large. Fortunately, Eq. (30) provides a shortcut since the evolution operator, the propagations matrix, and the source function vector have to be calculated anyway in every synthesis of the spectrum. With only the added calculations of the derivatives, one can easily calculate RFs at the same time as the RTE is solved. This property is extremely useful for inversion codes, as we shall see in Section 7.
Equation (32) also offers an explicit explanation of the astrophysical ill conditioning we commented on in Section 1: the same modification of I(0) may be produced by perturbations of different quantities or by perturbations of a single physical quantity but at several optical depths. That is, the effects of temperature can be similar to those of the magnetic field strength or the effects of perturbing B at log τ c = −0.5 can be the same as those of perturbing B at log τ c = −3. Therefore, we cannot say that the changes δI(0) are produced by perturbations of this physical parameter or that other without considering all of them at the same time. Cross-talk among some parameters may appear and, then, the retrieval of those parameters will be less reliable (see, e.g., Section 6.2).

Properties of response functions
A glance at Figs. 16, 17, 18, and 19 readily tells us that some RFs are bigger than the others. This means that our line is more sensitive to some physical quantities than to others. However, the fact that RFs are measured in inverse units of those for their corresponding parameters makes it difficult to compare their relative sensitivity. In this regard, relative RFs shed some light. If we consider relative perturbations δx i,j /x i,j , then we can define relative RFsR i,j ≡ R i,j x i,j . Hence, R i,j speak about the response of the Stokes spectrum to relative (i.e., dimensionless) perturbations. Experience shows that relative RFs to T perturbations are the biggest at all depths and wavelengths, clearly indicating that temperature is the most important quantity in line formation. (Indeed, temperature is the physical quantity that governs the thermodynamical state of the material medium because we assume that hydrostatic equilibrium prevails throughout our model atmospheres. After this assumption, pressure, the necessary second thermodynamic variable gets automatically prescribed.) Response functions to temperature perturbations start being different from zero at the deepest layers when compared to the remaining quantities. This is because the second term in the right-hand side of Eq. (30) goes to zero as the continuum optical depth tends to infinity. This was explained by Ruiz . This physical fact implies that spectral lines tend to be insensitive at these low layers to the other physical quantities.
The Stokes profile wavelength symmetries are preserved in RFs: in the absence of velocity gradients, RFs of Stokes I, Q, and U to any perturbation are even functions of wavelength and RFs of Stokes V are odd. This means that, in fact, velocity gradients increase the diagnostic capabilities of spectral lines. In their absence, half of the profile is useless since the information they provide is exactly the same as the other half. 15 For given purposes, we can conceive constant perturbations with depth, in spite of the quantity being depth dependent. Owing to their nature, some physical quantities may be assumed constant with depth (e.g., macro-and microturbulent velocity, or any of the ME free parameters). In such cases, constant perturbations are in order. If so, then, Eq. (29) tells us that the resulting modification in the Stokes spectrum is given by the product of such a constant perturbation times the integral of the corresponding RF over the whole atmosphere. Hence, we can say that the RF to a constant perturbation, R ′ in Eq. (32), is directly the integral of the regular response function or, in numerical terms, As shown by Ruiz Cobo and Del Toro Iniesta (1994), RFs play the role of a PSF in the general theory of linear systems. Under this general theory, our system -the Stokes spectrumexperiences an input (the perturbation) and provides an output, δI(0). If the input is a Dirac delta, then the output is the corresponding value of the response function. If the input is harmonic throughout the atmosphere, then the response is the Fourier transform of the RF. Response functions are model dependent. This property is extremely important in our understanding of spectral line sensitivities and, thus, in the inversion of the RTE. Instead of being a drawback, such a model dependence helps in disentangling the effects produced by the different quantities in distinct model atmospheres. Even in fixed model atmospheres, it is very difficult to discard one quantity or the other at once, however, and most of them have to be retrieved at the same time. Once this is carried out, one can theoretically understand the meaning of measurements , Del Toro Iniesta, 2003b.
Last, but not least, the linear nature of RFs allows us to generalize them to any linear combination of Stokes profile wavelength samples. This property helps us understand what can be extracted from different proxies that are traditional in solar polarimetry but, most importantly, helps the astronomer in taking influence of the instrument into account. Since most instruments act as linear systems on light, the detected spectrum is a convolution of the actual spectrum with the instrument spectral PSF. Convolution is linear and, thus, one can easily conclude (Ruiz Cobo and Del Toro Iniesta, 1994) that the RFs of the convolved spectrum are nothing but those of the original spectrum convolved as well with the PSF. 16

Analytic response functions
When all the terms on the right-hand side of Eq. (30) can be calculated analytically (see Sect. 2.3), RFs are necessarily analytic and then we can use them to gain some physical insight into the diagnostic capabilities of the Stokes spectrum about the physical quantities that characterize the medium. This is the case of the ME approximation, where all the quantities are constant with depth. There, index j in Eq. (32) drops and (after inclusion of the coefficient into the RF) we can properly write: That is, that RFs are strict partial derivatives of the Stokes spectrum with respect to the free parameters of the problem (Orozco Suárez and Del Toro Iniesta, 2007). In Eq. (34) we have removed the τ c = 0 indicator in the emergent Stokes spectrum and, rather, we have made explicit the dependence of RFs on wavelength. We have just seen in the former subsection that these RFs are indeed integrals over τ c and, hence, the dependence on it disappears. This analytic character is particularly important in the controversial discussion about the possibility of disentangling B, from α (the filling factor) in the case that our magnetic features are not fully spatially resolved. In Fig. 20 we reproduce Fig. 3 from Del Toro Iniesta et al. (2010). It shows RFs to (constant) perturbations in those two quantities plus v LOS in a ME atmosphere. As one can clearly see, both Stokes V RFs to α and to B perturbations are almost proportional among themselves and to the Stokes V profile itself. This can easily be traced back to the expected behavior from the weak field approximation, as expressed in Equation (24). From this proportionality we should conclude that it is indeed very difficult to discern the values of B and α separately. If it were not for Stokes I, the widespread belief that only αB or αB cos γ can be retrieved would be true. However, the Stokes I RFs to α and B are neatly different from each other and this necessarily implies that we have the means of inferring the two quantities independently. Stokes Q and U also help in disentangling the field strength and the filling factor for similar reasons as soon as they are above the noise level. Among other features of RFs, the latter authors showed how the thermodynamical parameters of the ME atmosphere can have cross-talk among themselves: their RFs are fairly similar in shape, so that their effects can be misinterpreted by the inversion codes. Notably, the RFs to perturbations of B, γ, ϕ, and v LOS are markedly different from one another and with respect to those of η 0 , ∆λ D , and a. This explains the good result of ME inversion codes in accurately retrieving the magnetic and dynamic parameters while the thermodynamic parameters are sometimes wrong. Our conclusion is also consistent with, and explains, the findings by Lagg et al. (2004) who decided to leave the damping parameter fixed with minor changes in the fitted magnetic and dynamic parameters while ∆λ D and η 0 were significantly affected. Linearity can also be useful for spatially coupled inversion techniques (see Sect. 7.5.1 below).

Inversion techniques
Once we have discussed all the ingredients and assumptions, we can face the main problem in astrophysics, namely that of making theory and observations compatible. In other words, we can face the inversion problem by deriving the unknown physical quantities through comparison between observed and synthetic Stokes profiles. 17 Figure 21 describes how the problem gets complicated as compared to the mere forward problem in Figure 2. One can clearly see how a new overarching loop is present that indicates the needs for changing the model atmosphere if the synthetic Stokes spectra do not properly fit the observed ones. The problem turns out to be formidable and requires new, specific assumptions that make it tractable. In particular, some of the quantities have to be calculated from the strict NLTE conditions (see Section 7.2.3).
Even the simpler LTE problem gets complicated, and indeed becomes iterative, regardless of acknowledging the stratification in the physical quantities of the atmosphere (see Figure 22). The needs for modifying such a model atmosphere according to the deviations between observed and synthetic profiles makes a loop necessary after calculating both the synthetic spectra and their derivatives with respect to the free parameters. Fortunately, we know how to calculate these derivatives through RFs at the same time as we synthesize the Stokes spectrum with little extra computational effort.
Looking for convergence means measuring the distance between observed and synthetic profiles in the space of observables. Any inversion procedure must have a threshold below which the user can consider that convergence has been reached because the fit cannot be further improved within the current assumptions.

Topology in the space of observables
The topological problem depicted in the Introduction for the inversion problem (or any astrophysical inference) needs to be substantiated in minimizing a distance: a metric in the space of the observables. Since Stokes I d , Q, U , and V belong to IL 2 , the quadratic norm of the difference turns out to be the natural distance between any two profiles. We want to approximate two sets of profiles, so that all four Stokes parameters should be taken into account. Therefore, when in practice we deal with discrete samples, the sought-for distance can be written as where index s runs for the four Stokes parameters, we assume q wavelength samples, and N f stands for the number of degrees of freedom, that is, the difference between the number of observables (4q) and that of the free parameters (the number of elements in x, np + r; see Section 2.1 and Equation 9). χ 2 (x) is a merit function of the atmospheric quantities that measures the distance between the observed and the synthetic profiles and has to be minimized in order to achieve a good fit. Having a normalized merit function to the degrees of freedom is useful to warn the user not to use an unreasonably large number of free parameters as compared with the number of observables. In such a case, χ 2 would turn out to be always too big. The weights w s,i can be used to favor some data more than the others. For instance, one can set them to the inverse of the measurement errors. For many applications they are simply kept at unity.
We can look at χ 2 (x) as a scalar field in an (np + r)-dimensional space. Since the number of dimensions may be too large, the minimization problem may turn out to be intractable. Before going to specific techniques that make it affordable, let us consider the paths through which we can look for the minimum of the merit function. That is, we have to find the derivatives of χ 2 with respect to the atmospheric free parameters. Ruiz Cobo and Del Toro Iniesta (1992, see Ruiz Cobo and Del Toro Iniesta, 1994, and Del Toro Iniesta, 2003b showed that such derivatives are directly given by the RFs: where, for the sake of a more compact notation, index m runs from 1 to np + r (including constant and variable physical quantities), the quadrature coefficients in Eq. (32) are assumed to be included in the RFs when needed, and no distinction is made between R's and R ′ 's. The same authors also demonstrated that the second derivatives can be approximated by Regardless of the way we approach its minimum in the hyperspace of parameters, that χ 2 is the natural metric is reinforced by the fact that other metrics have been tried (e.g. Lagg et al., 2004) that have finally converged to almost the same formulation as in Eq. (35) (Lagg et al., 2007).

Levenberg-Marquardt based inversions
The process of profile fitting is nothing but the successive (and iterative) approximation of synthetic Stokes profiles until they reach a minimum distance to the observed ones. Hence, an initial guess model atmosphere is needed to start the procedure.
Step by step, the model will be modified, so that the resulting synthetic Stokes profiles will approach more and more the observations. When we are close enough to the χ 2 minimum, an approximate, parabolic motion may be useful: where the elements of the gradient are given by Eq. (36) and H ′ is one half of the Hessian matrix, whose elements are given by Eq. (37), that is, H ′ m,k = ∂ 2 χ 2 /∂x m ∂x k . In Eq. (38) a scalar product is understood between a transposed (row) vector and a regular (column) vector. When we are very near the minimum, it is clear that the second term in the right-hand side of Eq. (38) should be zero, and this is done in the Levenberg-Marquardt (LM) algorithm (e.g., Press et al., 1986) by requiring that where the new matrix H is defined by where λ is an ad-hoc parameter that helps tuning the algorithm for it to work as if the approximation is almost first order (λ is large) or fully second order (when λ is small). λ is changed in every step in the iteration, depending on how far or close we are to the minimum as indicated by the variation of χ 2 . At the end of the procedure we will most likely not find the true minimum but, hopefully, will be close enough to neglect the gradient term in Equation (38). In such a case we can write The good news about this relationship is that, since the Hessian matrix is made up of RFs, one can finally obtain an expression for the inversion uncertainties in the physical quantities that are functions of the RFs (see Del Toro Iniesta, 2003b).
Certainly, the larger the RFs, the smaller the uncertainties.

Problems in practice
Nodes and singular value decomposition With an LM algorithm, inversion of the RTE reduces in summary to solving Eq. (39), which implies the inversion of the modified Hessian matrix. One can certainly not expect the same practical problems when H is built for an ME inversion or for a more general assumption where physical quantities vary with depth. Already in the ME case, H has dimensions 9 × 9 or 10 × 10 (if the filling factor is assumed to be different from unity). Inverting a 10 × 10 matrix is not difficult but, in the more general case, when the atmosphere is parameterized with a depth grid of 20 or 30 points, the Hessian may have several tens or even hundreds of elements in both dimensions. Inverting such matrices is by no means an easy numerical task. A second problem can appear in practice as H may be a quasi-singular (numerically singular) matrix because of the different sensitivities of the Stokes parameters to the various physical quantities that may vary even by orders of magnitude. One particular Stokes parameter of one specific spectral line may not be sensitive to a given physical quantity at given depths in the atmosphere. We already know, for instance, that, about or below log τ c = 0, only temperature leaves its fingerprints on the spectrum: the profiles are insensitive to the other quantities (see Section 6.1). Hence, the corresponding matrix elements in H will be close to zero, so that they hamper the Hessian matrix inversion. Here we report on the way SIR deals with these two problems. Other inversion techniques (e.g., LILIA, NICOLE, MILOS) apply similar procedures, although no much explicit information is available. The first problem can be circumvented by using several iteration cycles in each of which the number of free parameters is fixed and increased successively from cycle to cycle. The inversion of quasi-singular matrices is usually carried out through the singular value decomposition technique (SVD; e.g., Press et al., 1986).
Nodes and equivalent response functions Imagine that we only have one physical quantity to deal with in the inversion. Then, the number of free parameters is n, the number of depth grid points. Our Hessian is an n × n matrix. A practical way out of this involved numerical problem is found (Ruiz Cobo and Del Toro Iniesta, 1992) by assuming that all depth grid point perturbations are not free but bound by some interpolation formula. For example, we can use polynomial splines. This assumption allows to consider any number n ′ of free parameters from 1 through n. If such a number is 1, we assume we are applying a constant perturbation, whatever the original stratification is. The perturbation will be linear if the number is 2, parabolic if it is 3, and so on. As explained in Del Toro Iniesta (2003b), the use of nodes requires the evaluation of equivalent RFs at the nodes,R's, in order to take information from the whole atmosphere into account. With this technique, the equivalent of Eq. (32) in practice becomes where y m is a new notation for the free parameters at the nodes. 18 Constant and depth-varying physical quantities are treated the same in Eq. (43), and the quadrature coefficients are assumed to be included in the RF definitions.
Different sensitivities to the various free parameters Since, by construction, the Hessian matrix is real and symmetric, its inversion is done through diagonalization. The quasi-singularity of the Hessian matrix shows up as some of the diagonal elements, γ k , being too close to zero to be inverted with accuracy. This could be for two reasons. The first one is that, within the free parameters belonging to the same physical quantity, some depths are necessarily more important than others: as we have seen in Sect. 6, RFs tend to zero at some depths. This problem is numerically solved by setting 1/γ k to zero, whenever γ k is considered too small (under a given threshold). By doing so, we are not (over)correcting a parameter that has no relevance at this time. 19 The second reason for singularity is that some physical quantities are more important than others. Therefore, the sensitivities of Stokes profiles to perturbations of those quantities can be larger (even by an order of magnitude) than perturbations to less significant quantities. This problem is overcome by using relative instead of absolute RFs as we explained in Section 6.1. Nevertheless, and in order to make sure that all physical quantities are considered during each inversion cycle, the zeroing of the less significant diagonal elements is applied separately to each physical quantity. Initialization The lack of uniqueness we were discussing in the Introduction may be revealed in a dependence on the initialization parameters. The community has realized this fact for a long time and codes such as SIR or HeLIx have been explicitly tested and showed robust against different initializations (Ruiz Cobo and Del Toro Iniesta, 1992;Lagg et al., 2004). Such robustness can nonetheless depend on the specific Stokes profiles and model atmosphere. Therefore, an advisable practice when doubts arise is to use several different initial guesses for estimating the uncertainties in the results for each physical quantity. Several attempts have been performed as well for finding an ideal initialization guess, including specific genetic algorithm procedures only for getting the initial guess (Skumanich, private communication). In our opinion, having initializations almost as complicated as the inversion itself does not make much sense. While preparing a given application, we discovered an outstanding, very economical way of making optimum initial guesses. Such an initialization is very much in the line we have been supporting throughout the paper, namely, the usefulness of a step-by-step approach. Using the classical center-of-gravity (Rees and Semel, 1979) and weak-field approximations of the Appendix, we have obtained a remarkable acceleration in the convergence, as shown in Figure 23. 20 According to Uitenbroek (2003), the center of gravity technique has the remarkable property of being quite insensitive to the spectral resolution of the data. We can add that the results in Fig. 23, which have been obtained from profiles sampled at 30 wavelengths, are indeed fairly similar to those for six wavelength samples like those foreseen for the SO/PHI instrument (SO/PHI is the acronym for the Polarimetric and Helioseismic Imager for the ESA's Solar Orbiter mission; see Solanki et al., 2015).
Consistency among different ME implementations Milne-Eddington inversion techniques have become widely used to infer the vector magnetic field and the LOS velocity of the solar plasma. The physical interpretation of ME results was first investigated by Westendorp Plaza et al. (1997Plaza et al. ( , 1998 while comparing them to those obtained with SIR. A check of the theoretically predicted result and that obtained in practice was carried out in the latter paper for the HAO-ASP code Lites et al., 1988). Predictions by Sánchez  were confirmed: 1) measurements are essentially the result of averaging the actual parameter stratification with the corresponding generalized response function; 2) the so-called ME thermodynamic parameters had little correlation with the actual (quickly varying with optical depth) thermodynamic parameters. Further practice has usually shown that these "thermodynamic" parameters may not be very consistent among different runs for the same spectral line, while B and v LOS are fairly accurate. This finding has been physically explained by Orozco Suárez and Del Toro Iniesta (2007) who explored the shapes of ME response functions: RFs to perturbations in η 0 , ∆λ D , and a are fairly similar among themselves and, hence, cross-talk may appear between every two parameters; this is not the case, however, with RFs to perturbations in B and v LOS , which are neatly different.
Although the consistency among different versions of the ME inversion is therefore guaranteed through physical analysis, the various implementations may have different numerical approximations and the technique can even be different (e.g., LM, genetic algorithms, PCA, etc.). Motivated by this fact, Borrero et al. (2014) have checked the consistency among the HAO-ASP, HeLIx, and VFISV codes. They have found a positive confirmation of the previous results by using MHD simulations as a test bench instead of ME Stokes profiles.

Automatic selection of nodes
In several specific problems, the optimum choice for parameterizing the atmosphere is actually an art, i.e., a skill that arises after the continuous exercise of intuitive skills. Codes like SIR help the user by enabling several node choices for each parameter. Some different choices can yield similar fits; others can make it impossible to reach a convergent solution. As we have commented on in several places of the present paper, a generally good approach is lex parsimoniae: between two solutions reaching a similar fit quality, that with fewest nodes should be selected. However, in practice, one cannot repeat the inversion several times in order to choose the optimum number of nodes for each parameter.
The current version of SIR includes an algorithm that automatically selects such a number of nodes for every parameter in each iteration. The algorithm is based on the quest for the roots or zeros of the partial derivative of χ 2 with respect to each parameter, as written in Eq. (36). Let a be one of the atmospheric quantities varying with optical depth and a p its value at τ p ; that is, a p is one of the elements in the model atmosphere of Eq. (9). Let us call d ap ≡ (∂χ 2 /∂a p ). Let us also suppose that we are only dealing with the intensity profile of one spectral line, and that, at a given iterative step, I obs > I syn for all wavelengths. If R p,0 (λ i ) is positive for all wavelengths and all optical depths, it is then clear that d ap will also be positive at all depths. To get a better fit, then, we will need to increase a everywhere and just one node might be enough. Following this reasoning, it is easy to conclude that the number of nodes for a given physical quantity should be related to the number of times that the derivative d a changes its sign over the optical depth range. Obviously, as the derivative depends on the observational data, it is influenced by noise and, consequently, spurious zeros should be eliminated. Consequently, the algorithm determines the number of nodes after looking for positive relative maxima, and negative relative minima, larger in absolute value than a given threshold. An example of the behavior of this automatic selection feature in SIR is shown in Section 8.1.
This automatic selection of the number of nodes can be considered a quantitative implementation of the principle of Occam's razor. Others are indeed possible. An alternative was presented by Asensio Ramos (2006). This author uses the minimum description length principle to effectively find the optimum number of expansion coefficients in PCA-based inversion techniques or the optimum number of nodes for the various atmospheric parameters in the SIR code. This problem is also addressed by Asensio Ramos et al. (2007b), who estimated the intrinsic dimensionality of spectropolarimetric data based on nearest neighbor considerations and applying the principle of maximum likelihood.

A non-LTE inversion technique
The only available non-LTE inversion technique so far is called NICOLE by Socas-Navarro et al. (2015), which is an evolution of the IAC-NLTE code by Socas-Navarro et al. (2000). In turn, the latter was adapted from a previous one for non-polarized problems by Socas-Navarro et al. (1998). Since even the minute details of the code are extensively described in those papers, let us simply stress here the main assumptions underlying the code for those potential users to know the validity framework of the results.
Regarding the minimization technique, NICOLE employs an LM algorithm that proceeds very similarly to SIR by using response functions. Since RFs cannot strictly be calculated in this specific non-linear, non-local problem, the fixed departure coefficients (FDC) approximation is used to deal with the derivatives of the LTE atomic level populations once the β's in Eq. (12) are fixed from a previous calculation (Socas-Navarro et al., 1998). Although the approximation is not exact and, indeed, the authors show deviations from the correct values, FDC is good enough for the purposes of getting RFs that pave the way for the code to find the minimum distance between the observed and the synthetic profiles. The second important approximation in NICOLE is the fieldfree approximation (Rees, 1969), as we already mentioned in Section 2.3. It consists in obtaining the departure coefficients from an unpolarized, non-LTE code and uses them in a formal solution of the RTE. This way, the β's are decoupled from the magnetic field. According to the authors, this approximation is valid because the actual level populations are governed by strong UV (weakly split) lines and those lines with large Zeeman splittings are weak enough not to have a significant influence on the statistical equilibrium equations.
NICOLE only deals with polarization induced by the Zeeman effect. Hence, any polarization produced by scattering or depolarization through the Hanle effect are not taken into account. Other assumptions, such as the validity of complete frequency redistribution, may have implications for applications in specific spectral lines. It has recently been used in the analysis of the Ca ii line at 854.2 nm (de la Cruz Rodríguez et al., 2015).

Database-search inversions
The foundations of Principal Component Analysis inversion codes have already been explained in Sect. 4. The expansion of Stokes profiles as linear combinations of eigenprofiles is at the root of the technique. A set of synthetic Stokes profiles of a given spectral line, obtained with a large number of model atmospheres, is used as a training set to decompose each synthetic and observed profile into a sum of a small number of such eigenprofiles. The inversion topological problem, then, is reduced to a search in the low-dimensional space generated by the eigenprofiles or, more specifically, in the space of the expansion coefficients. The technique has proved to be efficient for quick inversions of the observations and looks very promising as a classification tool for profiles that can later be examined with more detailed techniques. We say so because no PCA code so far has been envisaged to go further than an ME or a slab atmosphere. This is natural in a way. One can build a database where a few constant parameters can get values within given ranges but the mere construction of such a database would be a formidable problem if variations with optical depth of the atmospheric physical quantities are considered. The technique, therefore, is unable to deal with gradients and the like, which -we know-populate a large fraction of the magnetic Sun. Nevertheless, as a first-order approach it is extremely useful everywhere and is eventually the only available tool to explore the behavior of some solar features (e.g., López Ariste and Casini, 2003;Casini et al., 2005).
According to Skumanich and López Ariste (2002), the leading orders of the PCA expansion may have a direct (approximate) interpretation in terms of values for the physical quantities, specifically for the LOS velocity and the vector magnetic field. This result is in line with our discussion in Sects. 3 and 4 about successive approximations in the complexity of both the model atmospheres and the profiles. Since the profile database has to be created for each spectral line or group of lines, PCA is very well suited to analyze those spectral lines whose radiative transfer is particularly complicated, either because physical mechanisms other than the Zeeman effect are involved in their formation, or because the scenario is morphologically difficult, as in prominences, spicules, and other chromospheric structures (López Ariste and Casini, 2002Casini et al., 2005Casini et al., , 2009. Extensions of the technique to stellar problems have also been proposed already Ramírez Vélez et al., 2006;Martínez González et al., 2008;Paletou, 2012;Paletou et al., 2015).
A particularly interesting feature of the PCA technique is that once the observed Stokes profiles are expanded in terms of the eigenprofiles they become less noisy. This can be helpful for several applications (e.g., Martínez González et al., 2008;Ruiz Cobo and Asensio Ramos, 2013).

Artificial neural network inversions
Artificial neural networks (ANNs) are systems through which a multidimensional input is translated into a multidimensional output by means of a non-linear mapping. The mapping (or, better, the parameters for the mapping) is (are) obtained by a previous process called training where the system is presented with inputs whose target outputs are already known. The process of training can be long and tedious but, once it is finished, the ANN can deal with new inputs with an extremely quick performance. Within the realm of solar physics, only multi-layer perceptrons have been proposed Staude, 2001, Socas-Navarro, 2003). In these specific ANNs, the input, composed of N neurons, is sequentially -layer by layer-transformed into an output of N neurons as well, of which a subset are the M elements of the target. Following the notation by Socas-Navarro (2005), the propagation rule between the input (layer 0) and the output (layer L) is given by where Y l n represents the contents of neuron n in layer l, W l n,j stands for the synaptic weight connecting that neuron with neuron j in layer l − 1, and β l n is a bias level. One or more layers may have a non-linear activation function f l which depends on the specific implementation. In fact, the two above mentioned papers use a different f .
For ANNs, the topological problem is dealt with during the training process, while the W 's, the β's, and the parameters defining f are determined. It reduces to (Carroll and Staude, 2001) minimizing a quadratic distance similar to: where P is the total number of training input-target vector pairs and T i k is the kth target value for the ith training pair.
Artificial neural networks have been used very seldom with actual data. In fact, as far as we know, only the two applications by Socas-Navarro (2005) and by Carroll and Kopf (2008) have been published so far.

Genetic algorithm inversions
A general description of a genetic algorithm (GA), along with an implementation of the so-called PIKAIA code, was given by Charbonneau (1995). When compared to simple steepest ascent or descent techniques, these genetic algorithms are an alternative to, e.g., LM. The former can easily be stuck in local minima while GA and LM look and (eventually) find global minima of the multi-variable merit (or fitness) function. Some authors claim (e.g. Lagg et al., 2004) that GA techniques are more robust in finding global minima than LM, but no direct comparison is known to the authors of this review.
Devised for the specific problem of exploiting the chromospheric diagnostic capabilities of the He i multiplet at 1083 nm, Lagg et al. (2004) presented the so-called HeLIx code. (HeLIx is an acronym for Helium Line Information eXtraction.) The code is a direct adaptation of the PIKAIA routine to the He i multiplet formation. The line formation problem includes both the Zeeman and Hanle effects, and the presence of two blending photospheric lines of Si i and Ca i. Therefore, much care has to be taken with the analyzed wavelengths (some have to be weighted to zero) and, above all, with the complex, forward radiative transfer problem. The latter is dealt with for the photospheric Si i line by using the synthesis part of the SPINOR code (although no results from it are reported). The non-LTE effects in the He i triplet are neglected because the line is mostly optically thin. A simple ME atmosphere is assumed instead. The Hanle effect treatment is based on the oscillator model by Trujillo Bueno et al. (2002). The code later evolved (now it is called HeLIx + ) to include the incomplete Paschen-Back effect (Sasso et al., 2006) and to finally incorporate all the properties of the so-called constant property slab model by Trujillo Bueno et al. (2005Bueno et al. ( , 2002 through the addition of the forward synthesis code by Landi  with extensions by Merenda (2008). Now, the code shares the synthesis calculation module with HAZEL (Asensio  and carrying out a direct comparison between the two codes with both numerical and actual observations would be a very interesting exercise, useful for the whole community. This would bring a gauge of pros and cons of GA versus LM algorithms. As a general rule of thumb we can say that genetic algorithm inversions become feasible methods whenever the evaluation of the merit function is extremely fast because this king of algorithms require the evaluation of the merit function a thousand or a million times.

Bayesian inversions
An alternative technique to the inversion problem that adds some extra statistical information on the results, namely confidence levels on the free parameters, has been proposed by Asensio Ramos et al. (2007a), based on Bayes' theorem. According to that theorem, once the posterior distribution p(x|I obs ) is known, the position of its maximum indicates the most probable (a.k.a. optimum) combination of parameters that best fits the observations I obs . The posterior (probability) distribution represents how much we know of the parameters once the observational data set is taken into account. As the reader may have already imagined, the Bayesian inversion is nothing but a maximization of p(x|I obs ) instead of a minimization of the χ 2 merit function. We are going to see that, indeed, the two optimization problems collapse to exactly the same result in given cases where the a priori knowledge of the problem inserted in the calculations is the simplest (and -probably-the safest). Let us explain a little what we mean by this a priori knowledge.
According to the theorem, the posterior distribution is proportional to the product of a prior distribution p(x) and the likelihood distribution, p(I obs |x): The likelihood distribution measures the probability that a given model atmosphere x can produce synthetic Stokes profiles I syn that fit the observed ones, I obs . If the noise distributions are normal and typically independent of wavelength as it is usual to assume, then the likelihood is defined as (Mackay, 2003) p where χ 2 (x) is given in Equation (35). Imagine now, for a moment, that the prior is identically unity, p(x) = constant (within a reasonable range), which corresponds to a case where no a priori assumptions are made about the model parameters. (All possibilities are equally probable.) In such a case, Eq. (46) becomes p(x|I obs ) ∝ e − 1 2 χ 2 (x) and indicates that maximizing the posterior distribution is exactly the same as minimizing χ 2 . No matter what the optimization algorithm used for the inversion, introducing Eq. (35) into Eq.
(48), we have the way for estimating confidence levels as given by the (multidimensional) posterior probability distribution. Two-dimensional cuts of p(x|I obs ) allow one to explore the possible degeneracies or cross-talks among each pair of model physical quantities. As a matter of fact, and according to our discussions in Sect. 6, response functions and uncertainties derived from Eq. (42) provide qualitatively similar confidence levels although Bayes' theorem supplies a more graphical approach (see figures in Asensio Ramos et al., 2007a). The only difficulty is then properly sampling the hyperspace of parameters (see below).
The prior distribution contains the information we may have of the model parameters without taking the observations into account. If all the model physical quantities are statistically independent, then Unless other physical information is available, the typical assumptions one can make on the free parameters are the range of reliable values for each of them. Thus, a useful model for the prior distribution can be given by where H(x, a, b) is the typical top-hat function Establishing a prior, therefore, is analogous to the assumptions made by SIR on the (spline) smooth variations of the physical quantities along the atmosphere. The useful feature of p(x) is that you can even consider correlations between the quantities and model them accordingly. This is in our opinion, however, a risky exercise because over fanciful correlations can be conceived that turn into an even more involved interpretation of the results. One has to make sure that the specific conditions of the problem enable this or that a priori assumption on parameter cross-talk. In summary, if either p(x) = constant or is given by Eqs. (49), (50), and (51), the optimization problem is the same as that described in Sect. 7.1 and, in principle, the LM algorithm could be used as well. The missing ingredient is the sampling of the free parameter hyperspace. An alternative method is then in order. Sampling the parameter space means repeating the synthesis of Stokes profiles many, many times. Typically, one needs of the order of 10 np+r samples of the posterior distribution. When the number of free parameters is high, such a brute-force method becomes impracticable. Asensio Ramos et al. (2009) propose using a "not-so-brute-force", Markov chain Monte Carlo method, where marginalized distributions of parameters can be obtained. The educated successive sampling grows linearly with the number of free parameters instead of exponentially. The decrease in computational cost has allowed the authors to deal both with ME atmospheres and with general LTE atmospheres where the physical quantities vary with depth (Asensio Ramos et al., 2012).

Inversions accounting for spatial degradation
A significant step forward has been adopted by three different techniques after acknowledging the spatial effects of non-ideal instruments (van Noort, 2012;Ruiz Cobo and Asensio Ramos, 2013;Asensio Ramos and de la Cruz Rodríguez, 2015). While the spectral PSF of the instruments was soon incorporated into the inversion codes, 21 the spatial blurring could not be satisfactorily dealt with until these works. Note that, in spectropolarimetry, an extended spatial PSF not only degrades the quality and contrast of images but also introduces a spurious polarization signal that can be misinterpreted as magnetic fields and LOS velocities. Several attempts were proposed for mitigating (or circumventing) this spatial contamination, such as using a non-polarized, global quiet-Sun average (Skumanich and Lites, 1987) or a non-polarized, local (1 ′′ neighborhood) average (Orozco Suárez et al., 2007a). In these examples, the magnetic structure is assumed to contribute with a filling factor α. None of these can be considered fully consistent but they do provide an improvement in robustness and reliability of the results. So-called spatially-coupled inversions (van Noort, 2012), regularized deconvolution inversions (Ruiz Cobo and Asensio Ramos, 2013), and sparse inversions (Asensio  attack the problem directly although through different means. The first technique uses the SPINOR code and the second employs SIR; a combination of the fast iterative shrinkage-thresholding algorithm (Beck and Teboulle, 2009) and the restarting scheme by O'Donoghue and Candès (2015) is chosen for the third algorithm.

Spatially-coupled inversions
From a formal point of view, the gradient of the merit function and the Hessian matrix were very helpful in explaining the second order approximation that is behind the Levenberg-Marquardt algorithm. Instead of using ∇χ 2 , let us think of the Jacobian matrix J of the system, which is made up of the derivatives of all the data points (all wavelength samples of the four Stokes profiles) with respect to the free parameters. That is, the elements of J are just the individual terms in the summation of Equation (36). It is then clear that the approximation in Eq. (37) is equivalent to saying that the Hessian matrix H ′ ≃ J T J and that H = H ′ − diag H ′ . Consider now the inversion of the RTE for a whole spectropolarimetric image of n × m pixels individually (the uncoupled case). We can build a big (block-diagonal) new Jacobian J with the ensemble of J k,l of all the individual pixels: The new (big) H ′ readily becomes block diagonal, with the blocks being the H ′ i,j of all the individual pixels: The inverse of this matrix (or that of matrix H) is easy to obtain by individually inverting each of its block components. Let us address now the spatially coupled inversion problem, where the effects of an assumed uniform PSF ϕ(x, y) across the image are taken into account. The Jacobian can now be written as ϕ n−1,m−2 J 1,1 ϕ n−1,m−3 J 1,2 · · · ϕ 0,0 J n,m−1 ϕ 0,−1 J n,m ϕ n−1,m−1 J 1,1 ϕ n−1,m−2 J 1,2 · · · ϕ 0,1 J n,m−1 ϕ 0,0 J n,m which is no longer diagonal. Nevertheless, since the influence of the PSF should be relatively short and not involve all the image points, the resulting J has a significant number of zero elements (it is sparse). If we call Y ≡ ϕ * ϕ the autocorrelation function of the PSF, it can be shown that (van Noort, 2012) Inverting matrix H ′ is beyond our reach. However, the inversion of H is affordable -at least approximately-because the linear system is sparse, although the number crunching problem is formidable. The authors propose strategies for the approximate H ′ and recognize that human intervention (the artistic part) is in the end more needed in these coupled inversions than in regular uncoupled ones, as expected.
Although the improvement with respect to former techniques is clear and the procedure is opening a new avenue for physical inferences in the solar photosphere (see Figure 24), a few unsatisfactory oscillations appear here and there in the application to actual observations. Such oscillations show up a caveat of the technique: the possible amplification of high frequencies and, hence, of noise. It can be argued that the spatially coupled inversions carry out convolutions instead 55 of deconvolutions, but it is also true that any spurious, high-frequency signal may be compatible with the inverted models, provided it is washed out by the convolution with the PSF.
What is not clear either to the authors of the present review is the quite unexpected quantitative results in some applications. For example, van Noort et al. (2013) claim to have found magnetic field strengths of 7.5 kG at log τ = 0 in some parts of the penumbra. These values can easily break the observational paradigm where fields stronger of 4 kG have very seldom been observed. Since they may represent a new paradigm, they should be accompanied by an estimate of uncertainties that is not present in the paper. First of all, the fits obtained for the V profile in their fig. 3, seem to be different from the observations by at least an order of magnitude greater than the noise at several wavelengths. Such a fit cannot be considered satisfactory. But even more important is the fact that the quoted field strength corresponds to very deep layers in the atmosphere. As explained by Ruiz , the second term in Eq. (30) rapidly tends to zero at low layers because the difference between the Stokes profiles and the source function vector quickly vanishes at these layers. In these circumstances, it is easy to see that values for the magnetic quantities at log τ = 0 are extremely uncertain because the RFs go to zero (Equation 42). Unless otherwise justified, those strong values at low layers cannot be interpreted but as (not-very-accurate) extrapolations of the magnetic field strength global stratification if SPINOR uses the equivalent response functions at the nodes of Section 7.2.1. 22 If instead, the RFs are the regular ones, then we are afraid that the (uncoupled) inversion strategy should be modified by changing the nodes to other places in the atmosphere with greater sensitivity to perturbations in the magnetic and dynamic physical quantities.

Regularized deconvolution inversions
A much simpler and computationally cheaper approach has been proposed by Ruiz Cobo and Asensio Ramos (2013). Based on the idea of Stokes profile expansion in terms of the principal components provided by a regular PCA technique, instead of deconvolving the Stokes profile images wavelength by wavelength, which is a very expensive and risky process, 23 deconvolution is applied to the PCA coefficient images. The resulting Stokes profiles after deconvolution are then inverted with SIR. The procedure is neat and simple, the uncertainties in the determination of physical quantities can be obtained through Eq. (42), and the possible overcorrections due to an excess in deconvolution can easily be controlled.
The idea is to assume that PCA expansions up to a degree D are valid to describe the profiles fully before they reach the telescope. Under such an assumption, the observed Stokes profiles can be written as where ω i are the weights to the PCA eigenprofiles φ i (λ), P stands for the spatial PSF of the instrument, and N is the noise (assumed independent of wavelength). One of the interesting features of this regularized deconvolution is that the noise contamination is largely minimized since the real signal is usually contained in the first few coefficients. Then, the 4 × N images (ω i * P ) + N are deconvolved by a Richardson-Lucy algorithm (Richardson, 1972;Lucy, 1974). The resulting deconvolved Stokes profiles are then inverted with SIR. The main objection one may find to this technique is the same as that for PCA, namely that the most, say, peculiar Stokes profiles, which indeed reveal very interesting physics, are not fully fit since one would need more PCA terms. The logical solution for such a problem would be to increase D but this may not be advisable as the final computation time would be too long. A practical circumvention can be provided by classification techniques (e.g., k-means clustering) that allow the grouping of the different Stokes profiles according to purely morphological criteria (MacQueen, 1967;Steinhaus, 1957;Lloyd, 1982). 24 After such a classification, those peculiar profiles can be identified. One can then carry out the PCA expansion tailored to each group of profiles, hence optimizing the global performance. An increase in expansion terms may only be needed for small fractions of pixels in an image, thus keeping the whole inversion efficient.

Sparse inversions
An extremely interesting new generation of inversion methods has been proposed by Asensio . Based on the concept of sparsity or compressibility, this technique allows us to tackle the inversion of 2D maps -and potentially 3D data sets-all at once. The underlying idea of sparsity is the intrinsic redundancy of the data. That is, data can be projected to a parameter space where a reduced set of variables can fully describe that data set. Provided that a linear transformation exists between the data set and the new -small sized-parameter space, an affordable inversion of 2D maps can be carried out. In this first paper the authors present a 2D ME inversion based on a wavelet transformation of the model parameter space. They show that reducing the dimensionality of the model unknowns by a factor between three and five yields results comparable to -or even better than-pixel-to-pixel inversion.
Time saving is among the advantages of this kind of methods, along with their ability to easily compensate for the effects of the telescope PSF and the regularization of solutions introduced by the sparsity hypothesis. Among the drawbacks we find the apparent impossibility of using LM methods -since the Hessian matrix scales as the square of the number of free parameters. The authors suggest using proximal algorithms that can increase the convergence speed of the standard gradient descent method that it is currently used.

Summary of inversion techniques
This section summarizes in Table 1 all the past and current inversion techniques that have been proposed or are in use for solar physics. A distinction is made between those techniques that assume physical quantities that are constant with optical depth and those that allow the quantities to vary over the photosphere. LM stands for Levenberg-Marquardt, ANN for artificial neural networks, GA for genetic algorithm, B for Bayesian, and GD for gradient descent. The overwhelming majority of codes uses the Levenberg-Marquardt algorithm in order to find the minimum distance between the observed and the synthetic profiles. 25 26 27 Following the approach of Sects. 3 and 4, we want to discuss in this section how a given set of Stokes profiles can be fit with several assumptions about the stratification of the model atmosphere physical quantities. This discussion sheds light on the ill-conditioning issue we reported on in Sect. 1: the same profile can be interpreted in several ways, depending on the complexity of the assumed model atmosphere; the only limiting factor for increasing such a complexity should be the noise in the observations and a reasonable dose of the principle of Occam's razor. To illustrate our discussion we shall be using SIR to carry out all the necessary calculations, in a similar way to that followed by Bellot Rubio (2006). Given the SIR strategy explained in Sect. 7.2.1, the natural way to making a given physical quantity stratification more complex is by increasing the number of nodes and, hence, the polynomial degree of the spline that is assumed to describe such a stratification. Therefore, we shall be inverting the "observed" profiles with various sets of nodes for each of the different atmospheric quantities. Among all possible node combinations or modes, we have selected only six for the sake of simplicity. Each mode is characterized by the number of nodes, n B , used for B, γ, ϕ, and v LOS . The number of nodes for T , n T , is higher by two nodes than that for the magnetic and dynamic quantities, except for mode 1 where n T = 2. Mode 1 can be called "à la ME" because it has just n B = 1. Since the starting guess model atmosphere has constant B and v LOS , only constant values for these quantities can result from this inversion. Mode 2 has n B = 2, so that linear stratifications are allowed in this mode. Mode 3 has n B = 3; hence, parabolic stratifications can come out from SIR in this mode. Mode 4 has n B = 5 and the stratification of the magnetic and dynamic quantities can be quartic. Mode 5 has n B = 7 and the stratifications can be of order 6. Finally, mode 6 uses the automatic node selection algorithm described in Section 7.2.2.
We have built a penumbral model atmosphere after making up a bit one of the resulting models from inversion of a Hinode observation. We will call it hereafter the observed model. Our choice is driven by the shape of the Stokes profiles emerging from such an atmosphere. They are far from being typical even and odd functions of wavelength. The stratifications for v LOS , B, γ, and ϕ in the observed model are plotted (from top to bottom) with black lines in the left panels of Figs. 25 and 26. With this model atmosphere, we have synthesized the two Fe i lines at 630.1 and 630.2 nm, convolved them with the Hinode spectropolarimeter PSF, sampled with the instrument wavelength sampling interval, and finally added noise to a level of 10 −3 I c . The so-obtained Stokes profiles will be called the observed Stokes profiles and are plotted in black lines in the middle panels of Figs. 25 and 26 (despite being barely discerned). Besides the observed model and profiles, both figures display the resulting models and fit profiles from the corresponding mode runs of SIR. Green, red, and blue lines correspond to modes 1, 2, and 3, respectively, in Fig. 25, and to modes 4, 5, and 6, respectively, in Figure 26. The right panels of the two figures show the differences between the observed and the fit profiles in the corresponding colored lines. These differences provide a direct measure of the fit quality.
Theà-la-ME inversion yields a fairly good fit although, as expected, it is unable to reproduce the asymmetries in the profiles. The typical misfit is never larger than 10 %. The results given by theà-la-ME inversion coincide with the actual values at log τ c ≃ −1.5. The exact coincidence takes place at different depths for each quantity but the important qualitative message to be extracted is that, in spite of its simplicity, the ME approximation is able to retrieve the atmospheric quantities at the mid-photosphere. Looking at the differences in the right panels of Fig. 25, the parity rules we commented on in Sect. 4 seem not to operate but the reason is clear: the ME model is still too far from the observed model for linearity to hold. Moreover, given the strong asymmetry shown by the profiles, the specific wavelength around which we should symmetrize or anti-symmetrize the  profiles has to be calculated because it is certainly different from the nominal rest wavelength of the spectral line. The linear stratification mode does a better job as it provides a mean gradient with which asymmetries start to be reproduced. The fits clearly improve with the parabolic mode. Note that, besides having reduced the misfits, the stratifications of v LOS , B, γ, and ϕ are fairly well mimicked between log τ c = −3 and log τ c = −0.5. Note that the uncertainties evaluated with Eq. (42) -and displayed in the figure with shaded gray areas-indicate very well the range of reliability of the resulting stratifications. This is very important in practice where the observed model atmosphere is unknown. In our example, the uncertainties for the three components of the vector magnetic field are almost compatible with the linear stratification results. This is not the case for the LOS velocity where deviations are apparent and pave the road for more complex stratifications to improve the fits. In spite of this fact, the real gauge for deciding to proceed in the increase of nodes through the optical path is noise. As we have been discussing in many places in this paper, only if the differences between the observed and the synthetic profiles are larger than a few times the rms noise can we expect to obtain improvements with alternative model atmospheres. Since our noise still looks small enough when compared with the Stokes profile differences, we try with modes 4, 5, and 6. The retrieved model atmospheres are better than the former and, in particular, the uncertainty shaded areas in the top panels of Fig. 26 (they correspond to mode 5) indicate that indeed the range of reliability has extended up to log τ c = 0. Notice that the size in the difference panels of the figure for mode 4 indicates that there is still some room to improve the fits, while mode 5 and 6 have profile differences compatible with the noise of the observations. Therefore, we cannot go any further (but indeed the fit quality is superb). This example illustrates the ability we can have to retrieve very complex stratifications when both the noise is low and asymmetries are present. The latter feature is indeed important. On the one hand, if no asymmetries are present in the observed Stokes profiles we can readily discard part of the complexity: no variations of the LOS velocity with optical depth are present. On the other hand, and very remarkably, asymmetries increase the amount of available information: if profiles are symmetric (either even or odd), half of them can be thrown away although the retrievals will be noisier (see Section 6.1).
Since Stokes profile asymmetries have driven most of the evolution of concepts in inversion techniques and, in general, in radiative transfer, a further check on the way our numerical experiments have been able to reproduce those asymmetries is in order. Let us consider the typical definitions for the Stokes V amplitude, δa, and area, δA, asymmetries: where a b and a r stand for the amplitudes of the Stokes V blue and red lobes, and A b and A r do for the unsigned areas of those lobes, respectively. Figure 27 describes the performance of the different modes in reproducing δa and δA. Obviously, mode 1 shows zero asymmetries. Mode 2 approaches significantly the amplitude asymmetry but not the area one in this example. Mode 4 almost reproduce both quantities. The noise level (represented by the horizontal shaded areas) is so low, however, that only mode 5 and 6 provide an exact result. Certainly, if we decrease the amount of information by, for example, decreasing the number of wavelength samples and/or the polarization signal (because of the magnetic field being weaker) we cannot retrieve such complex model atmospheres any longer or, said otherwise, the range of reliability of the results will narrow down significantly so that low-order polynomial approximations are enough to give account of the observations. To exemplify this case, we have simulated a typical Sunrise/IMaX observation in mode V5-6, that is, five wavelength samples and six accumulations. The IMaX Fe i line at 525.02 nm was synthesized in a quiet-sun model atmosphere (again taken from the actual observations), convolved with the IMaX PSF, sampled at -8, -4, 4, 8, and 22.7 pm from line center, and added noise at a level of 10 −3 I c . Since Q c = U c = V c = 0 except perhaps in cases of very large LOS velocities, we just count in a total of 5 × 4 − 3 = 17 observables and, consequently, cannot afford to retrieve more than 17 unknowns. These 17 free parameters would cope with our mode 3 (five nodes for T and three nodes for v LOS , B, γ, and ϕ). Indeed, this number is too high in practice and can only be reached in cases of strong asymmetries for the reasons we have just explained above: symmetries in the profiles reduce the degrees of freedom. Let us then consider mode 2 as the maximum achievable run and invert the observed profiles. Figure 28 is similar to Figs. 25 and 26 and the color codes are the same. 28 The conclusions are clear, theà-la-ME mode provides fair values, and the linear approximation gives a reliable gradient on the physical quantities at around log τ c = −1.5.

Inversion retrievals of weak fields
The reliability of inversion retrievals from zones with weak fields is a continuous matter of debate. Concerns are often published with different levels of arguments. As in any other aspect of life, criticism is always more prevalent than praise in any community. This is the case when discussing the ability of spectropolarimetric observations to distinguish weak fields and their inclinations. Most discussions are strongly biased by the fairly common misconception of Stokes V being proportional to the longitudinal component of the magnetic field. We have shown in Sect. 3.1.2 that this approximation is valid only for a very limited range of values, and that the important observational parameter when dealing with weak signals is noise. Stokes profiles other than V also provide information about B. As long as the signal is not buried by noise, radiative transfer is powerful enough to provide sufficiently accurate magnetic quantities.
Sometimes the criticisms, are not correctly interpreted. An example is the evidence shown by Martínez González et al. (2006) that the pair of Fe i lines at 630 nm is not able to provide a single model for a scenario in which two depth-dependent atmospheres, one magnetic and another nonmagnetic, fill a spatial resolution element of about 1 ′′ . This true result has often been interpreted as if the famous pair of lines were unable to provide a reliable inference of the vector magnetic field, and that only infrared lines were valid for such a diagnostic. Del Toro Iniesta et al. (2010) explained that the scenario used by the former authors was perhaps too complicated for the available information. That is, that the visible line profiles are not enough to cope with the number of free parameters in a two-component, depth-dependent atmosphere. A simpler model atmosphere with just one magnetic component (and the other non-magnetic) may fit the profiles well enough. The latter authors gave both theoretical and observational arguments to defend the hypothesis that visible lines are reasonable diagnostics even for weak magnetic fields, in spite of infrared lines being more sensitive.
Later, Borrero and Kobel (2011, see also 2012) raised doubts about the retrievals of fairly inclined fields when the polarization signals are very weak because they come from quiet, internetwork regions (e.g., Orozco Suárez et al., 2007b, Lites et al.,2008. In our opinion, again, their claims are partial misinterpretations of the results, as we try to demonstrate with the calculations that follow. Consider the pair of Fe i lines at 630 nm sampled as in the Hinode (Kosugi et al., 2007) spectropolarimeter . We have synthesized these lines in a quiet-Sun model for constant magnetic field strengths of 10,15,20,25,40,50,60,75,90, and 100 G. The inclination and azimuth may take any value but preserve an isotropic distribution, so that each of these values is equally probable. The LOS velocity has been set to zero for all the profiles. One thousand Stokes profile sets have been calculated for each value of B. Once synthesized, white noise has been added to the profiles with a standard deviation of 10 −3 or 3.3 · 10 −4 I c , simulating S/N = 1000 or 3000. 29 The synthetic profiles were then inverted with SIRà-la ME (two nodes in T and one node for the rest of parameters). The results are summarized in Figure 29. Fields weaker than 75 G (25 G) from the S/N = 1000 (3000) inversions tend to be overestimated but in none of the cases is the excess output such that it would retrieve too strong a magnetic field. As a matter of fact, the results illustrate very well the fair reliability of the B inference in practically all circumstances. Again, when noise decreases the inversion results improve. The magnetic inclination results, represented in the right panel of the figure, are also very illustrative of what is going on. Only results from the S/N = 1000 experiments are plotted. One can clearly see that the weakest fields tend to result in an excess of inclined fields. This is, however, an expected result that has nothing to do with any special inability of the visible lines or with the sought after proportionality between Stokes V and the longitudinal component of the magnetic field. It is rather a consequence of the noise dominating the polarization signals. When the field is very weak, Stokes V is very small, barely exceeding the noise level. At the same time, Stokes Q and U (which should theoretically be zero) simply show noise. Since the V signals are not sufficiently larger than the linear polarization signals, the inversion code has no other option than to interpret the observations as very inclined magnetic fields: it is mostly fitting noise in Q and U . The situation clearly improves as the field strength increases. The inclination distribution is well recovered for B = 100 G fields, even when S/N is only 1000. 29 We have followed here the customary procedure of adding equal rms noise to all four Stokes parameters. However, as stressed by Del Toro Iniesta and Martínez , smaller noise should be added to Stokes I, which is always much better measured. This is so because almost all polarimeters are more efficient for Stokes I. In the optimum polarimeter case where Stokes Q, U , and V have the same polarimetric efficiency, that of Stokes I is higher by a factor of √ 3.  Figure 29. Bottom panels: same as the upper ones but only for those points where the polarization signal is higher than a threshold (see text for details). Black lines correspond to the input and red lines to the output fromà-la ME inversions.
Let us now consider a distribution of magnetic field strengths according to the probability density function (PDF) obtained by Orozco Suárez et al. (2007b) from Hinode observations. With these field strengths and an isotropic inclination distribution such as that for Fig. 29, we have synthesized 10000 Stokes profiles to which white noise of rms amplitude of σ = 10 −3 · I c was added. A-la ME inversions with SIR have been carried out. Both the inputs (black lines) and the results (red lines) are plotted in the upper panels of Figure 30. Fields weaker than 20 G are slightly overestimated, but above 60 or 70 G the strength PDF is very nicely recovered. The inclination PDF shows an excess of horizontal fields in detriment of the more vertical ones. The same PDFs are shown in the lower panels for a selection of pixels where the maximum polarization signal (max{|V |, (Q 2 + U 2 )}) is greater than 4σ. As one can clearly see, fields weaker than 10 G and a good portion of horizontal fields almost disappear. The inversions react very well. The underestimation of small inclinations and the excess of large ones can be attributed to insufficient S/N. When the experiments are repeated with higher signal-to-noise ratios, the PDFs agree accordingly better.

Conclusions
The inversion of the radiative transfer equation has been presented as a topological problem that maps the space of observables, the Stokes parameters, onto the space of the object physical quantities. The dependences of such a mapping on the definition of the two spaces implies a number of assumptions that are explicitly or implicitly made by any inference technique, regardless of it being called an inversion or not. Such assumptions determine to a great extent the uncertainties in the astronomical inferences, which depend on both the measurement errors and the analysis technique.
In the observational space, one has to select the parameters to be measured and the level of noise with which such measurements are carried out. Signals (measured parameters) are useful insofar they vary after a modification in the object physical quantities. For the variation to be detectable it should be higher than the noise. If the signal does not change above noise levels after a perturbation in the physical quantities, then it is useless and must be discarded. In the object physical space, the number of quantities and their assumed stratification with depth in the atmosphere are the key variables. If a physical quantity at a given depth in the atmosphere produces no measurable effect on the Stokes spectrum, then this quantity should not be looked for through the inversion process. The number of these physical quantities should not exceed the number of observables.
The mapping between the two spaces is nothing but radiative transfer. Depending on the spectral line and the way we measure it (that is, the number of wavelength samples, the width of the spectral PSF, etc.) the transfer can be studied through the full non-LTE problem, the LTE approximation or, rather, through further simplifications such as the Milne-Eddington approximation, the weak field approximation, etc. Strictly speaking, no available inversion technique deals with the full non-LTE problem and the only non-LTE code, NICOLE, relies on several approximations such as the fixed departure coefficient approximation or the field-free approximation in order to make the numerical problem tractable.
We have provided arguments in favor of proceeding through a step-by-step approach in which the complexity of the problem increases sequentially until convergence has been reached. In this sense we strongly recommend initializing inversions with classical estimates of B and v LOS as provided by the center of gravity technique, and estimates of γ and ϕ as provided by the weak field approximation. The criterion for convergence has to be established in terms of noise: if the (rms) difference between observed and synthetic Stokes profiles is less than the typical noise of observations, increasing complexity in the object physical description adds no information. In this regard, we have given both conceptual and technical arguments for the MISMA hypothesis and inversion technique to be abandoned. At the other extreme of the "complexity spectrum", the weak field approximation must only be used with much care and mainly for very broad chromospheric lines. Nothing in the transfer equation indicates that Stokes V is proportional to B cos γ. Only some matrix elements of K are. After integration through the atmosphere, the proportionality is most probably lost. Moreover, the information provided by the other Stokes parameters helps in disentangling the magnetic field strength from the inclination. In particular, Stokes I very soon departs from the zero field conditions that are strictly necessary for the weak field approximation to apply.
The step-by-step approach we suggest (and which is indeed implemented in the SIR inversion code) is to be preferred for two reasons: on the one hand, the atmospheric stratification of physical quantities can be described by a Taylor-expansion-like method where it is assumed to be first constant, then linear, later quadratic, and so on; on the other hand, the Stokes profiles (I in line depression), as functions of the wavelength, belong to IL 2 and, hence, can be expanded in terms of orthonormal bases such as those provided by the Hermite functions or those built for PCA techniques.
The various algorithms used for the inversion problem have been reviewed. They include the most widespread Levenberg-Marquardt algorithm, database search inversions, artificial neural networks, genetic algorithms, and Bayesian inferences. The most promising techniques for the near future, namely those which include spatial degradation by the telescope, are also discussed. Among these we find the so-called spatially-coupled inversions by van Noort (2012), the regularized deconvolution inversions by Ruiz Cobo and Asensio Ramos (2013), and the sparse inversions by Asensio . Suggestions for improving their performance and reliability are also given.
This paper ends with a discussion on a pair of topics that might seem controversial. The first one is a description of how the current implementation of SIR deals with a step-by-step approach. The sequential improvement on the fits is explicitly shown. The second topic discusses the reliability of weak field retrievals. The idea of a theoretical inability of Zeeman-sensitive spectral lines in the visible for inferring weak fields accurately is refuted. Instead, the root of the problem is shown to be in the signal-to-noise ratio of the observations. If noise is suitably low, then radiative transfer provides the necessary tools for accurate retrievals. Of course, uncertainties will be proportionally larger than when signals are bigger (i.e., stronger fields), but there are no reasons for not trusting the inversion results.

A Appendix. Optimum practical initialization
The center of gravity technique was introduced by Semel (1967, see also Rees and Semel 1979). The longitudinal component of the magnetic field (B LOS = B cos γ) and the line-of-sight (LOS) velocity are provided in this technique by the semi-difference and the semi-sum of the centers of gravity of S + ≡ I + V and S − ≡ I − V . Specifically, if λ + and λ − stand for such centers of gravity, then, assuming a unity magnetic filling factor, and where β B = 1/C, β v = c/λ 0 , and C = 4.67 · 10 −13 λ 2 0 g eff , with λ 0 the rest, central wavelength of the line inÅ; c stands for the speed of light. In practice, λ + and λ − are calculated through According to Landi Degl'Innocenti and Landolfi (2004), in the weak field regime, the azimuth of the magnetic field can be approximated by and the circular (Stokes V ) and linear (L ≡ Q 2 + U 2 ) polarizations can be approximated respectively by Eq. (24) for all wavelengths and by L ≃ 3 4 ∆λ 2 BḠ sin 2 γ 1 ∆λ for the line wings, where ∆λ ≡ λ − λ 0 , that is, the distance to the line center. Both g eff andḠ depend on the quantum numbers of the two atomic levels involved in the transition. If we write Eq. (18) as with g s = g u + g l , g d = g u − g l , thenḠ can be written asḠ = g 2 eff − δ, with with s ≡ [j u (j u + 1) + j l (j l + 1)] and d ≡ [j u (j u + 1) − j l (j l + 1)].
Now, from Eqs. (62) and (24) The two last equations finally give tan 2 γ = 4 L g 2 eff ∆λ 3Ḡ C B LOS since B LOS has been independently obtained trough Eq. (58). Therefore, using only one or the average of some wavelength samples in the line wings, we obtain rough estimates for the magnetic inclination angle through Eq. (69) and for the magnetic azimuth through Equation (61). Having γ and B LOS provides an estimate for B as well. Note that the strict validity of the equations is unimportant because our only aim is to find approximate guesses for the inversion code to start. 70