On Identifying and Mitigating Bias in Inferred Measurements for Solar Vector Magnetic-Field Data

The problem of bias, meaning over- or under-estimation, of the component perpendicular to the line-of-sight [\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$B_{\perp }$\end{document}B⊥] in vector magnetic-field maps is discussed. Previous works on this topic have illustrated that the problem exists; here we perform novel investigations to quantify the bias, fully understand its source(s), and provide mitigation strategies. First, we develop quantitative metrics to measure the \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$B_{\perp }$\end{document}B⊥ bias and quantify the effect in both local (physical) and native image-plane components. Second, we test and evaluate different options available to inversions and different data sources, to systematically characterize the impacts of these choices, including explicitly accounting for the magnetic fill fraction [\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$f\!\!f$\end{document}ff]. Third, we deploy a simple model to test how noise and different models of the bias may manifest. From these three investigations we find that while the bias is dominantly present in under-resolved structures, it is also present in strong-field, pixel-filling structures. Noise in the spectropolarimetric data can exacerbate the problem, but it is not the primary cause of the bias. We show that fitting \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$f\!\!f$\end{document}ff explicitly provides significant mitigation, but that other considerations such as the choice of \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\chi ^{2}$\end{document}χ2-weights and optimization algorithms can impact the results as well. Finally, we demonstrate a straightforward “quick fix” that can be applied post facto but prior to solving the \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$180^{\circ}$\end{document}180∘ ambiguity in \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$B_{\perp }$\end{document}B⊥, and which may be useful when global-scale structures are, e.g., used for model boundary input. The conclusions of this work support the deployment of inversion codes that explicitly fit \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$f\!\!f$\end{document}ff or, as with the new SyntHIA neural-net, that are trained on data that did so.


Introduction
It is a challenging problem to infer the magnetic-field strength and direction in the solar photosphere, as it threads a dynamic plasma (see del Toro Iniesta and Ruiz Cobo, 2016, and references therein). The assumptions of a Milne-Eddington (ME) atmosphere provide a good estimate of the average strength and direction across the layers where magnetically sensitive photospheric spectral lines are formed (Westendorp Plaza et al., 1998), especially in structures where a single pixel-filling magnetic component is present or at least dominant. Under-resolved multiple atmospheres (whether all magnetized or not) contributing light to Extended author information available on the last page of the article the resolution element result in polarimetric signals that are intensity-weighted averages of the contributing atmospheres (Sanchez Almeida, 1997; Leka and Barnes, 2012), which rarely resemble expected ME Stokes spectra (even without complications from unresolved velocity components or gradients within the line-forming region).
The treatment of instrumental scattered light and the approach used to estimate relative contributions of magnetized vs. unmagnetized incoming light (the magnetic "fill fraction" [ff ] or percentage of a pixel filled by magnetic field) will influence the inferred nature of magnetic structures, especially (but not solely) unresolved structures (Lites et al., 1996;Socas Navarro, 2004;LaBonte, 2004;del Toro Iniesta, Orozco Suárez, and Bellot Rubio, 2010;Orozco Suárez and Katsukawa, 2012;Leka and Barnes, 2012;Bommier, 2016;Sainz Dalda, 2017). The particulars of how inversion techniques are invoked by which to infer the magnetic field from the spectropolarimetric data, even "standard" Milne-Eddington inversions, will influence the results. The particulars can include the number and which spectral lines are used, plus mundane-seeming choices of optimization algorithms, stopping conditions, and any optimization weighting applied to the χ 2 -calculation (see, e.g., Centeno et al., 2014;Sainz Dalda, 2017, for a discussion).
As has been introduced at length by Pevtsov et al. (2021) and Liu et al. (2022), with vector polarimetry and inversions covering the full visible solar disk from, e.g., the Vector Spec-troMagnetograph (VSM: Keller and The SOLIS Team, 2001;Henney, Keller, and Harvey, 2006) of the Synoptic Optical Long-term Investigations of the Sun facility (nso.edu/telescopes/nisp/solis/) and now the Helioseismic and Magnetic Image (HMI: Scherrer et al., 2012;Schou et al., 2012;Centeno et al., 2014;Hoeksema et al., 2014) onboard the Solar Dynamics Observatory (SDO: Pesnell, Thompson, and Chamberlin, 2012), there is the capability of estimating the vector components of the photospheric magnetic field over large areas of the Sun. Unfortunately, as demonstrated recently (Rudenko and Dmitrienko, 2018;Pevtsov et al., 2021;Sun et al., 2021;Liu et al., 2022), large-scale magnetic structures as observed by these facilities can present behavior that is physically unexpected. Specifically, a preferred direction in polar fields or a "flip" in the direction of the zonal-directed (East/West) horizontal component of magnetic structures is inferred upon crossing the central meridian. Given that the opposite signs of this bias are present in data from different facilities (Pevtsov et al., 2021;Sun et al., 2021), the problem is well-established as originating from instrumentation, data preparation, and/or inversion, rather than being solar in origin. We explore these options further, below.
What has not been emphasized yet is the impact on physical interpretation. The overestimation of the transverse component B ⊥ contributes to the inferred physical components [B h x , B h y , B h z ], or heliographic / local components of the field vector, in a non-linear way, according to the inherent underlying structure as well the viewing angle [θ ] (i.e. the structure's location on the solar disk).
It has been proposed that the most susceptible features are those that are unresolved. As hypothesized by Pevtsov et al. (2021) and demonstrated by Liu et al. (2022), including the magnetic fill fraction [ff ] explicitly as part of a Milne-Eddington solution changes the inferred field vector, such that the bias may be mitigated substantially. This is not a new point (Ronan, Mickey, and Orrall, 1987;Lites and Skumanich, 1990;LaBonte, 2004;Bommier et al., 2007;del Toro Iniesta, Orozco Suárez, and Bellot Rubio, 2010;Sainz Dalda, 2017). Forcing the magnetic fill fraction ff = 1.0 for unresolved structures is likely the dominant source of this bias; as shown by Liu et al. (2022), correcting this assumption by invoking a new version of the Very Fast Inversion of Stokes Vector Milne-Eddington approach (Borrero et al., 2011;Centeno et al., 2014;Griñón-Marín et al., 2021) can mitigate the magnitude of the bias. However, as also shown by Pevtsov et al. (2021), the magnitude of the photon noise can contribute. The imaging spatial integrity, field of view, and acquisition cadence are all drivers of instrument design, the outcomes of which are as different as the questions that they are optimized to answer, cf. SDO/HMI vs. Hinode/SpectroPolarimeter (SP: Lites et al., 2013), and we use both SDO/HMI and Hinode/SP to investigate not just fill-fraction fitting, but noise, instrument, data specifics, and inversion implementation.
Most importantly, we present quantitative metrics by which to evaluate the extent and direction of the bias 1 and the contributing factors, and finally demonstrate one "quick and dirty" correction that is shown to mitigate (although not necessarily completely correct) the bias in any data for which certain structures are observed and good coverage in observing angle is available.
We begin the methodology section (Section 2) with a description of the data, target acquisition, and analysis methods and evaluation metrics, plus a description of a simple model used in the investigation. This is followed by a summary of results (Section 3) and finally a demonstration of a rudimentary correction approach (Section 4) for when the option of re-inverting the data is not available.

Methodology
In this section we present the algorithms for selecting features, the observational targets, and analysis methodology.
The overall issue for SDO/HMI pipeline output is that the inferred component in the plane-of-the-sky [B ⊥ or B trans ] is stronger than it "should be" compared to the line-of-sight component [B or B los ], most apparently for areas with unresolved magnetic structure, i.e. ff < 1.0. For data from the VSM, according to Pevtsov et al. (2021), B ⊥ it is weaker than it "should be". We refer to any such imbalance as the bias in B ⊥ but also refer to the imageplane (as returned from the inversion) inclination γ i ∈ [0 • , 180 • ] where 90 • is in the planeof-the-sky. In some cases, the "polarity" of B is inconsequential or adds confusion, in which case, we limit the inclination, referred to as |γ i | ∈ [0 • , 90 • ]. Throughout, while we may discuss ff as distinct from the field magnitude |B|, it is their product that is evaluated.

Target Features
Solar plage is generally agreed to be comprised of kilo-Gauss concentrations of predominantly radially directed field (along the direction of gravity), generally not strong enough to form a continuum depression (but having magnetic and thermal impacts sufficient to produce continuum-intensity enhancements when viewed at an angle), the small size of these concentrations means that they are rarely resolved with today's instruments. Plage areas are not inherently "weak field" (cf. the "magnetic knots" in Rudenko and Dmitrienko (2018)). The inability to resolve the structures leads to a canonical area-averaged (pixel-averaged) fluxdensity estimate for plage of 1 − few × 100 Mx cm −2 , depending on instrument specifics. Plage is a wide-spread source for "global-scale" solar magnetic-field structure, and it contributes strongly to synoptic (or synchronic) full-Sun magnetic maps.
The overall approach that we take relies on the analysis of structures on the Sun where the magnetic inclination is known, at least on a statistical basis: plage regions and the darkest umbrae of very stable sunspots. These two targets, when carefully selected, should be dominated by radially directed field (for SDO/HMI spatial resolution, spectral sampling, and sensitivity). It is not required that the orientation be only radial, just that it be dominated by radially directed field.
In cases where full-disk data are not available, there is an additional assumption invoked, that these structures would display minimal large-scale variation over the course of days, again on a statistical basis (for appropriately stable sunspots).
Different between these two structures are the magnetic fill fraction, with the darkest umbrae of sunspots presenting field-filled pixels (ff = 1.0), whereas plage should present un-resolved bundles of field within field-free (or field-less) plasma (ff < 1.0). Both plage and sunspots generally present polarization signals well above the noise. The two types of features appropriate for this analysis are selected thus: A: The very central, darkest portions of sunspot umbrae in spots that are large and not evolving noticeably. To select these areas, A 1) Determine that the size, complexity, morphology, and evolution of the target sunspot are large, nonexistent, round, and steady, respectively. A 2) Create four masks: • Mask 1, Continuum intensity: select the darkest points within sunspot umbrae. Specifically, for Hinode/SP data: an I c /median(I c ) limited-FOV image is smoothed with a 4-pixel boxcar, and those pixels < 0.4 are selected; for SDO/HMI: using I * , the normalized continuum intensity from the hmi.Ic_no-limbdark_720s data series (see Hoeksema et al., 2014, for a description of the different data series available from the SDO/HMI pipeline products), choose pixels with I * < 0.2. The resulting mask is then dilated by a 2 × 2 box. • Mask 2, Field Strength: ff × B > 200 Mx cm −2 ; tests showed that this level consistently retains plage areas but does not include noise across orbital variations in SDO/HMI data. This mask is then eroded by a 2 × 2 box to remove single-pixel detections. • Mask 3, Inclination: An image of the "local" or heliographic (or "physical") |γ h | is created, smoothed using a 4-pixel boxcar, and pixels with |γ h | < 30 • are identified; this mask is then subjected to first being eroded and then dilated, both using a 2 × 2 box. • Mask 4, Data Quality: for SDO/HMI, mask to include only pixels with conf = 0.0 from the confid_map segment, hmi.ME_720s_fd10 series, that indicate no failures in data quality or inversion (convergence, etc.). For Hinode/SP: only those pixels with total polarization P > 0.4%; those below are used for computing the non-magnetic spectral profiles. For all: within μ = cos(θ ) > 0.35 (within ≈ 70 • from disk center).
A 3) The four masks are added together and only pixels where all conditions are satisfied are selected. B: Plage areas are found either in the vicinity of a sunspot, or as active-region remnants that can cover many degrees of the disk. Arguably, any individual plage element evolves, but statistically speaking their distribution is not expected to differ due the hemisphere in which they are located. To select these areas, B 1) Again, construct a series of independent masks: • Mask 1, Continuum Intensity: where I c /median(I c ) > 0.9 (Hinode/SP) or I * > 0.9 (SDO/HMI). • Mask 2, Field Strength: as per Mask 2 above, but additionally grown with a 2 × 2 box to ensure coverage of plage concentrations. • Mask 3, Inclination: same as Mask 3, above.
• Mask 5, extended sunspot / super-penumbral exclusion: using continuum intensity, select all sunspot points by I c /median(I c ) < 0.9 (Hinode/SP) or I * < 0.9 (SDO/HMI). The result is eroded using a 2 × 2 box to remove single-point detections, then grown (dilated) with a large r = 25-pixel circular template to extend spot areas (including detected pores) beyond the horizontal-field superpenumbra.
B 2) Masks 1 -4 are added together and only pixels where all conditions are satisfied are selected. B 3) Mask 5 provides a negative-Boolean requirement, to remove not-dark but strong fields that can confuse the analysis.
The final masks included an additional identification of the polarity of B h z and the location on the disk (latitude, longitude). Creating the plage masks without Mask 3, meaning without a filter on the local-component inclination angle, results in 5 • difference in, e.g., the mean and medians of the inclination-angle distributions, by including non-radial points. As such, the interpretation of the bias may be systematically incorrect by a small amount, but avoiding having to resolve the 180 • ambiguity may be advantageous in some situations.

Hinode/SpectroPolarimeter
We chose scans from the Hinode/SpectroPolarimeter (Hinode/SP: Kosugi et al., 2007;Tsuneta et al., 2008;Ichimoto et al., 2008;Lites et al., 2013) that follow NOAA AR 12457 ( Table 1). The target includes a small sunspot that does not fit our criteria for spot analysis, but does include a well-developed and minimally evolving plage area ( Figure 1). Standard Level-1, the calibrated spectra, and Level-2, the MERLIN Milne-Eddington inversion results, were retrieved; Level 2.1 data were not used here, as we wanted to extend the disambiguation and ensure that it was consistent across inversion experiments (see Section 2.2.3).

Solar Dynamics Observatory/Helioseismic and Magnetic Imager
Data from the Solar Dynamics Observatory/Helioseismic and Magnetic Imager (SDO/HMI: Pesnell, Thompson, and Chamberlin, 2012;Scherrer et al., 2012;Schou et al., 2012;Centeno et al., 2014;Hoeksema et al., 2014) provide the known problematic data in this case, plus both time-series and full-disk data for testing inversion options and other mitigating procedures. To match the Hinode/SP data of AR 12457, we selected SDO/HMI segments close to the mid-times of the Hinode/SP scans (see Table 1) and extracted coincident FOV patches from the full-disk data (not as defined by the SDO/HMI Active Region Patch [HARP] 6124's bounding boxes). The pipeline data included series hmi.ME_720s_fd10 and hmi.B_720s, plus hmi.S_720s for some tests of inversions, and hmi.Ic_nolimbdark_720s for continuumfeature identification.
Two additional time-periods were identified for study, targeting the presence of large, round, stable sunspots and well-distributed plage: 04 December 2010 -12 December 2010 and 13 May 2016 -24 May 2016 (Table 1). Not all analysis methods were available for the earlier time-period, but it was valuable to confirm behavior with a second large sunspot and additional plage areas. Custom-sized boxes were defined and extracted at the solar synodic rotation rate, to follow the targeted structures across the solar disk. In addition to the magnetic-field-related segments, we utilized the confidence and the limb-darkenedcorrected continuum intensity (as mentioned above). Context images for the targets selected from SDO/HMI time-series data are shown in Figures 2, 3, 4, and 5.

Inversion Options
A number of different inversion options were evaluated, comprising two different groups of tests.
The first group used the Milne-Eddington inversion code developed for use with the NCAR/High Altitude Observatory Advanced Stokes Polarimeter (labeled ASP-ME; Skumanich and Lites, 1987;Lites and Skumanich, 1990;Leka, 1997) on Hinode/SP Level-1 data to systematically evaluate the impact of different implementation options. The options investigated include (see summary in Table 2):  imply different sensitivity to B but not to ff , so multiple lines provide additional constraints. iii) Explicitly fitting for ff vs. explicitly setting ff = 1.0. iv) Weights assigned to S = [I, Q, U, V ]: since the magnitudes of the polarization signals in [V ] and [Q, U ] are of order dI/dλ and d 2 I/dλ 2 , respectively, and photon noise scales accordingly, in order to insure that the I does not dominate the χ 2 in the evaluation functions, weights [w I , w Q = w U , w V ] are usually a parameter supplied for the χ 2calculation. We quote here w eff S = w S /σ S Care must be taken when comparing codes and their parameter settings, as some request w S while some refer to w 2 S .  The second group compares different inversion codes that may differ in a variety of ways. The different codes are tested on one or both of SDO/HMI hmi.S_720s polarization images or Hinode/SP Level-1 spectra, as listed in Table 3. The different systems tested include: • The pipeline Hinode/SP Level-2 output from the MERLIN Milne-Eddington code (heritage from ASP-ME). • The pipeline output from the Very Fast Inversion of Stokes Vectors (VFISV: Borrero et al., 2011;Centeno et al., 2014) from the hmi.ME_720s_fd10 series.  Higgins et al. (2021Higgins et al. ( , 2022. • The UNNOFit code (Bommier et al., 2007) was applied only to the AR 12457 targets, but with both Hinode/SP Level-1 data and SDO/HMI hmi.S_720s data as input. UNNOFit introduced the magnetic filling fraction as a free parameter of the Milne-Eddington in- version. All free parameters describing the non-magnetic part of the atmosphere are set to equal those of the magnetic part, except for the magnetic-field vector itself. This procedure was recently implemented in a new VFISV inversion (see next point), with a slight difference in the series of the eight other parameters, where UNNOFit also determines the Voigt parameter a and eliminates the source function at the photosphere base by normalization. UNNOFit was the first code that explicitly allowed a variable atmosphere for the non-magnetic component, although it is set to be identical to the atmosphere of the magnetic component. However, Bommier et al. (2007) also showed that nine parameters can exceed the information content of the spectra as needed for a successful inversion, and that only the pixel-averaged magnetic-field strength ff × B is finally determined. • A new version of the Very Fast Inversion of Stokes Vector (VFISV_ABGM: Borrero et al., 2011;Griñón-Marín et al., 2021) was developed for implementation with SDO/HMI data to explicitly include a fit for the fill fraction. Similar in the approach to UNNOFit (see above), it should be noted that the Voigt parameter is fixed at a = 0.5 and the source function at the base of the photosphere is a fitted parameter. The results from this inversion have been shown to mitigate the bias (Griñón-Marín et al., 2021;Liu et al., 2022). One difference between the VFISV_ABGM data used here and that presented by Griñón-Marín et al. (2021) and Liu et al. (2022) is that we used no polarization threshold for the inversion, whereas a threshold of 0.25% was used in those cited works. We removed the polarization threshold in order to ensure no discontinuity near weak-signal plage areas.
No additional scattered-light correction was performed. • The SyntHIA approach is not really an inversion per se; it is a Convolutional Neural Network that has been trained on the Hinode/SP Level-2 output with SDO/HMI hmi.S_720s polarization images as input (Higgins et al., 2021(Higgins et al., , 2022. The full-disk results provide field and fill-fraction separately, with an overall fidelity closer to Hinode/SP pipeline output than SDO/HMI pipeline output. We additionally test the question of polarization noise. As briefly demonstrated by Pevtsov et al. (2021), noise in the linear polarization [Q, U ] signals contributes to the bias. We conduct tests comparing inversions using input spectra from the SDO/HMI hmi.S_720s series to those using input spectra from the hmi.S_5760s series, which are 96-minute integrated Stokes spectra, and demonstrably lower in random (photon) noise. Not all possible permutations were executed, but a sufficient number with a sufficient range of options so as to quantitatively judge the impact of the different approaches on the Figure 6 Distributions of heliographic or local inclination angle γ h (black) and field strength (gray) for the "plage-like" distributions in the model. γ = 0 is radially directed (opposite gravity). These 1000-point samples were then placed across a variety of viewing angles.
bias. The tests are summarized in Tables 2, 3; please note the method labels will be used in later discussion and figures.

Disambiguation
The data from all inversions of both Hinode/SP and SDO/HMI data are inherently 180 • ambiguous in the plane-of-the-sky component. All inverted data that were compared using the local components were disambiguated using the "minimum-energy" method (ME0: Metcalf, 1994;Leka, Barnes, and Crouch, 2009;Hoeksema et al., 2014, available at www.nwra.com/AMBIG) with a cooling schedule that generally matched both SDO/HMI and the Hinode/SP Level2.1 data products: tfactr = 0.98, neq = 100. The primary difference in how ME0 was called relative to the SDO/HMI pipeline is that lower [athresh, bthres] = [95, 100] were used to capture more near-plage areas, and the spherical option was deployed for the larger-FOV SDO/HMI data. The local components were then computed using planar or spherical geometry, accordingly.

Model Data
To complement the analysis of the observational data, model data were constructed with which to test the hypotheses and the mitigation strategies. On a latitude / longitude grid ranging in both directions ±80 • in 2.5 • increments, 1000-point samples were generated to mimic expected distributions of plage: The radial component is a normally distributed random sample centered at 1000 Mx cm −2 with a standard deviation of 250 Mx cm −2 , the horizontal component was a normal distribution, standard deviation 200 Mx cm −2 ; the azimuthal angle was a uniform distribution across 2π ( Figure 6). Only B h z > 0 positive-polarity "plage" was considered here.
From this grid, coordinate transforms re-create the image-plane components |B|, γ i , α i or alternatively B i x , B i y , B i z , the last being B and B ⊥ = B i x 2 + B i y 2 . Noise was optionally added at "low", "medium", and "high" levels: a normal-distribution random sample with σ = [5, 10, 50] Mx cm −2 for B was paired with a normal distribution with σ = [100, 200, 300] Mx cm −2 for B ⊥ , although the absolute value was added (since the effective photon noise for total linear polarization is positive-definite, see discussion by Pevtsov et al. (2021)). The noise in azimuthal angle was a normal distribution with standard    deviation approximately [2, 4, 6] • . In an attempt to mimic the noise maps of SDO/HMI (see Hoeksema et al., 2014, their Figure 7), we tested adding noise at the levels listed above but scaled as 1.0/ √ μ such that the noise increases near the limb, but this made little difference in the final outcomes. No attempt was made to model the impact of unresolved field or incorrect fill fraction (although see Leka and Barnes, 2012).

Analysis and Metrics
Simply detecting a "sign flip" in a re-binned image of a local heliographic field component when a plage area transits the central meridian is a useful initial diagnostic (Pevtsov et al., 2021;Liu et al., 2022). However, it is difficult to quantitatively evaluate the bias in this manner (cf. Rudenko and Dmitrienko, 2018), or determine its full impact.
We develop quantitative metrics based on the assumed physical characteristics of solar structures in order to evaluate the bias. In some cases, temporal sampling is mapped to spatial sampling, again relying on the physical and statistical qualities of plage in that the distributions of the underlying fields are not expected to evolve significantly over the course of a few days.

Distributions of B h x , B h y , B h z with Central Meridian Distance and Observing Angle
Histograms of the heliographic B h x are presented separately for pixels with ±B h z , similar to Figure 4 in Pevtsov et al. (2021). Without the presence of filament channels, nearby large active regions, or other phenomena that could introduce a physical direction preference, the assumption is that the histograms should overlap: The distribution of B h x or B h y should not differ according to either the sign of B h z or viewing angle. Tracking the plage in NOAA AR 12457 over six days, we see the very different behavior between the Hinode/SP pipeline output (Figure 7) and the SDO/HMI pipeline output (Figure 8), with consistent <B h x > and <B h y > regardless of polarity and viewing angle for the former, and a switch of <B h x > between east/west hemisphere for the latter, as well as a consistent offset in <B h y >. In situations such as NOAA AR 12457 where there is a sufficient sample of plage with both polarities, the behavior is summarized by metrics describing the differences in the distributions of target magnetic components. Specifically, metrics should target the behavior of the B h x and of B h y most notably, plus B h z when it provides information. The absolute separation is the absolute difference of the medians of two distributions of the target B h x , B h y , or B h z , the two distributions being whether the underlying field has B h z > 0 or B h z < 0; the mean absolute separation is taken over the six samples. This is accompanied by the standard deviation of the signed separation, which is larger for a switch in sign and smaller if, for example, the distributions are separated but do not change significantly between samples. We additionally consider the median absolute deviation (MAD; med(|X −X|), whereX is the median (most probable) value for the target distribution) andX is its median over the six samples, and the maximum absolute deviation MaxAD, which is the maximum difference between the median of the target distribution (separately for B h z > 0 or B h z < 0) from its expected value: 0.0 for B h x and B h y and the mean B h z over the six samples, as we hypothesize little evolution.
This analysis is also applied to the sub-area targeted patches of SDO/HMI time-series data. In Figures 9 and 10 we summarize the behavior of the target distributions as a function of the mid-point longitude for all three vector components. Quantitative summaries take the same form as for the NOAA AR 12457 analysis: mean absolute separation, standard deviation of the signed separation, MAD, and MaxAD.

Distributions of |γ i | with Observing Angle
Ideally, there should be metrics available from the image-plane components themselves, without requiring the disambiguation step. If we describe the bias as an over-(or under-) estimate in the B ⊥ component, then this will manifest in the image-plane inclination angle γ i . It is in this context that focusing on plage and the radially oriented locations within stable sunspots becomes useful, because we can assume that for radially directed fields, B r = B /μ, where μ = cos(θ ) the cosine of the observing angle (Svalgaard, Duvall, and Scherrer, 1978;Wang and Sheeley, 1992). In other words, for radially directed fields without bias, γ i = μ. First, we examine the heliographic (local or physical) inclination angle of the target plage points in the NOAA AR 12457 data and confirm that the distributions are the same between the positive and negative regions, change minimally as a function of east/west location, and, while not exactly radial, are not particularly inclined (Figures 11, 12, top panels).
The distribution of |γ i |, however, distinctly changes shape as a function of observing angle, and it differs between, e.g., the SDO/HMI and Hinode/SP pipeline output (Figures 11, 12, lower panels). The skews of the distributions show exactly opposite behavior with east/west location between the two pipeline data outputs.
Figures 11, 12 (top panels) confirm that there is a distribution of orientations within these structures, of course, but they are dominated by radially directed field. Hence we focus on the expected value or most probable value E(|γ i |), and whether it tracks the observing angle μ = cos(θ ) (Figure 13). The degree to which E(|γ i |) vs. θ points lie off the x = y line provides information on the bias, including whether there is an under-or over-estimation of B ⊥ . This key diagnostic can be performed on limited-FOV time-series data for (statistically) unchanging structures, but can also be used for full-disk data without requiring any timeseries, provided the required structures are present.
In Figure 13, we see that the MERLIN E(|γ i |) vs. θ results align with x = y nearly perfectly, while the three examples from the SDO/HMI pipeline all show significant deviations, although with different functional forms. The differences between the two plage-targeted examples could be due to different observing epochs (2010 vs. 2016), or subtle changes in the tracked structures due to evolution or field-of-view. The deviation from x = y for the sunspot confirms the presence of the bias in structures that have strong, pixel-filled signal, presumably with ff = 1.0. The different form of E(|γ i |) vs. θ for the spot target compared to plage targets may be due to a combination of signal/noise ratio, signal saturation, scattered light treatment, plus the different impact of bias according to the treatment of ff .

Figure 11
Histograms of plage-identified areas of NOAA AR 12457 for the Hinode/SP MERLIN Level-2 pipeline-data output, prepared as described in the text. Left to Right: results from the six scans (Table 1); central meridian crossing is at approximately 25 November 2015 00:00, between the third and fourth scans. Red/Black indicate histograms of pixels that have positive/negative B h z ; shown here is the distribution of γ h (top) and γ i (bottom).

Figure 12
Histograms of plage-identified areas of NOAA AR 12457, and the same field-of-view as in Figure 11, but from SDO/HMI pipeline data, prepared as described in the text. Left to Right: results from the six scans (Table 1) The primary metric then to evaluate the distributions of |γ i | for plage as a function of μ = cos(θ ) is the Gini coefficient (G) or Receiver (Relative) Operating Characteristic (ROC) Skill Score (ROCSS: Jolliffe and Stephenson, 2012;Leka, Barnes, and Wagner, 2018) G = 2 × AUC − 1, where the AUC metric is the integrated area under the curve. Provided sufficient coverage is available in μ, G essentially measures a signed area departure from the x = y line. Other metrics such as the maximum absolute deviation could also be used here, but G is intuitive, for G > 0 indicates over-estimation of B ⊥ , whereas G < 0 indicates an underestimation. To compute G, we normalize the ranges to [0, 1] and extend the sampled ranges to those ends for all data, in order to cursorily account for different sampling in μ = cos(θ ) between datasets and tests. The choice of using angles in degrees with a normalization by 90 simply provides an intuitive presentation and AUC calculation; the same information is available using μ and cos(E(|γ i |)), similarly both normalized to [0, 1].

Results
There are a large number of tests that were performed; here we summarize the results for targeted questions, first the experiments with model data, then observational.

The Magnitude and Impact of the Problem
The most obvious impact of this bias manifests in the sign-flip of B h x as extended patches of unresolved structures rotate east/west across the solar disk, as described here and in previous articles (Pevtsov et al., 2021;Liu et al., 2022). The magnitude is demonstrated here quantitatively (Figures 8, 9, 10): up to a few × 100 Mx cm −2 signal of bias is present, or almost 30% of the total pixel-averaged field magnitude for the plage regions.
It was surmised in those articles that there should be a bias in the north/south distribution of B h y as well. This effect is shown quantitatively here (same figures). The B h y -component behaves in an opposite way for plage points with B r > 0 vs. B r < 0 when located north vs. south of the Equator. The effect is somewhat subtle due to the limited north/south extent available for analysis, but it is clear.
It was also briefly mentioned in previous articles that there may be an impact on the inferred radial component B h z or B r (the component most widely used for global modeling presently), and this is also demonstrated here. It is a smaller effect still, but can be seen as subtle peaks in the most probable pixel-averaged flux away from disk center, and a dip in the magnitude near disk center.
In summary, because this is a problem in the image plane, it impacts all vector components in physical space, including B r (B h z ). It will also lead to incorrect or biased estimates of any derived quantity such as the vertical current, the force-free parameter α, etc. The bias is strongest in unresolved structures, and as such will constitute significant portions of the active regions as well as the large-scale plage regions.
It was stated in the earlier articles that there was no bias in the strong-field or ff ≈ 1.0 areas. We find this is not actually the case. We show that in the central portions of two large, round, stable sunspots, there is a non-zero most-probable inclination. The mostprobable magnetic-field vector changes direction such that B h x changes (or nearly changes) sign as the spots transit the central meridian, and displays a non-zero difference in the most probable value of B h y as well. The variations in B h z are smaller than the orbital-velocity related variations and cannot be confirmed here.
The bias in ff ≈ 1.0 areas may be caused ultimately by an unresolved field or the detailed handling of scattered instrumental light (LaBonte, 2004). It indeed fails to cause a full sign reversal of the components, that is true. However, we disagree with Liu et al. (2022) that "This bias does not occur in strong-field regions in sunspots." The signature of the bias is present, but only apparent with a quantitative examination of the field-component distributions.

Results from Model Data Experiments
The simple toy model of plage-like distribution of field across a range of observing angles is used to test three possible contributing bias distributions: i) a constant magnitude of bias added to B ⊥ , ii) a contribution to B ⊥ that is a function of the total field, and iii) a contribution to B ⊥ that is a function of observing angle, mimicking that which would be expected with systematically higher fill fraction derived in regions away from disk center (hence the bias introduced by an imposed ff = 1.0 will be greater toward disk center). The goal here is to reproduce some of the quantitative characteristics observed such as the changing skew of γ i -distributions with observing angle and the behavior of the Gini coefficient with bias and its sign, and to understand the source of the varying degree of curvature of γ i as a function of observing angle (Figure 13). In most cases, both a positive and a negative bias are added, meaning bias such as to produce an over-estimate and an under-estimate of B ⊥ , respectively.

Figure 13
The most probable image-plane absolute inclination E(|γ i |) (y-axis) for plage areas in rings or bins of μ = 0.025 as a function of the central value for that ring (x-axis); both axes are in degrees but normalized by 90 • . Shown are Hinode/SP MERLIN Level-2.0 data for NOAA AR 12457 for all days sampled (black * ), and three targets from the SDO/HMI VFISV pipeline: full-disk data for 18 May 2016 (red ♦), Plage 3 (blue ), and Spot 2 (green ). Dashed line: x = y. The SDO/HMI full-disk data are sampled for 24 hours at 96 minute cadence (15 samples) and show as well the OBS_VR-related variation Schuck et al., 2016).
The distributions of |γ i | with observing angle show a general trend, as is seen in the Hinode/SP data (see Figure 11) of a changing shape with observing angle (Figure 14, Top). The changes are indeed influenced by the degree of photon noise present.
Tracking this change in skew as a function of the experiments performed regarding noise and bias is shown in Figure 14, Bottom. There is a distinct form of the change in skew of the |γ i |-distribution as a function of viewing angle, without any additional noise. The addition of photon noise and bias that originates according to the three experiments presents departures from the no-noise case. However, the "optimal" curve (the no-noise no-bias case) is not straightforward to describe: it is not a simple function of observing angle. The curves with added noise and bias depart from the optimal, but in sometimes subtle ways that do not provide unique diagnostic signatures. As such, we elect to not use the skew of the |γ i |distributions as a quantitative test of bias in the observational data.
The behaviors of the "observed" E(|γ i |) with viewing angle as per experiments with bias are shown in Figure 15 (Panels a -c) and for photon noise only (Panel d). The "optimal" situation of no photon noise and no bias is the x = y line. Similar to above, the addition of bias produces deviations from the ideal case, but for the bias cases (Panels a -c), the only unique signature is the spread at large viewing angle and a slight difference in shape. To summarize the performance when both bias and noise are included, we present Figure 16, where we focus on the Gini coefficient G as the bias levels are varied. The black "no noise" cases reflect the curves in Figure 15 a -c, the other curves are as labeled. The impact of including both noise and bias is to increase the departure from G = 0 overall and bring even negative bias toward G > 0. The latter effect is caused by effectively a "canceling out" effect between the bias and the noise. Given that the noise levels in SDO/HMI data are roughly between the "Mid Noise" and "High Noise" cases, we can possibly use these plots to rule out . Red: Experiment 1, a constant magnitude of bias at ±200 Mx cm −2 is added to B ⊥ ; Light Blue: Experiment 2, a bias that is ±25% of the total field strength; Green: Experiment 3, a bias of ±200 Mx cm −2 × μ. some extreme cases, but the curves and behaviors are similar enough to probably preclude determining a functional form of the bias.
We find that the impacts of over-or under-estimating B ⊥ do not always manifest symmetrically about the x = y line for the E(|γ i |) vs. μ relation. Furthermore, while the shapes of the curves deviate slightly from linear as both photon noise and bias are added, we are unable with this toy model to replicate the curvature observed in the SDO/HMI pipeline output ( Figure 13), even when testing a bias whose behavior is a function of observing angle, as expected by errors in fill fraction. The E(|γ i |) vs. μ behavior of Spot 2 is less curved, and closer to the curves produced by the toy model with moderate levels of positive bias. The difference in the E(|γ i |) vs. μ behavior (curved vs. closer to linear) between Spot 2 and the plage points in Figure 13 is thus likely due to unresolved structures and hence non-unity ff in the latter, and how that manifests with observing angle. Further model development to investigate the impact of unresolved structure is beyond the scope of this article (although see the approach by Leka and Barnes, 2012).

Results from Testing Mitigation Approaches
As discussed in Section 2.2.3, there are other aspects of interpreting specropolarimetric signals that can lead to differences in inferred flux densities. The results of our tests are sum-   Table 2). guity in the B ⊥ component, the same method was applied throughout, addressing one point made by Sainz Dalda (2017). G can be positive or negative, all other metrics are positive; all metrics used here tend to zero for best performance (less bias). For all metrics evaluated on time-series data, there may be evolution or field-of-view departures that result in non-zero metrics, but all of the methods are evaluated on consistent data, so comparisons between methods as grouped here are valid.
From the analysis of the AR 12457 data we find that: • Fitting multiple lines vs. only fitting one line in some experiments provided marginal improvement (cf. A1w1 vs. Aw1, A1w2 vs. Aw2) but which line was used and the data treatment also influence the magnitude of the bias (cf. the two UNNOFit results). • MERLIN and ASP-ME default, or A-DEF, results are almost indistinguishable across the metrics, as expected given the heritage of the codes. • The bias is not just a matter of the quality of the input spectropolarimetric data. Inversions of AR 12457 using SDO/HMI input data that account for ff (SyntHIA, VFISV_ABGM, UNNOFit) were by many metrics almost as good as the MERLIN and ASP-ME (default configuration and others) results using Hinode/SP input spectral data. • Optimization method can matter: generally, genetic-algorithm minimization performed slightly better than minimization by least-squares, (cf. Aw1 vs. ALSw1, A-DEF vs. ALS, Aw1 vs. ALSw1), but this is not a strong result in these tests. • Weighting differences can influence the results; equal weighting (Aw3) performed the worst (cf. Aw3 vs. Aw1, Aw2, and Aw4). • Explicitly fitting for the fill fraction provides by far the most significant impact to reduce the bias, (cf. ABGM vs. PIPE, A-DEF vs. ANF, ALS vs. ANFLS).
From the analysis of the SDO/HMI time-series data (of both plage and spots), we find that: • The bias clearly manifests in all three local components of the field, confirming that bias in B ⊥ contaminates the determination of the true magnetic vector. When the impact of the B ⊥ bias is high (cf. PIPE_720s_Pl1 vs. ABGM_720s_Pl1 results), it is generally high for all metrics across all three B h x -, B h y -, B h z -components. • The bias can manifest in strong / pixel-filling regions as well as unresolved features.
Focusing on *_Sp1 and *_Sp2, while we may expect some evolutionary changes with disk passage, the metrics improve with inversions that fit for ff (cf. ABGM_720s_Sp1 vs. PIPE_720s_Sp1). • Reducing the photon noise (cf. *720s vs. *5760s) provides a small, but not significant mitigation. Confirming the results of Pevtsov et al. (2021), random photon noise in the data is not the primary source of B ⊥ bias. • Explicitly accounting for ff = 1.0 mitigates the bias, whether through a traditional inversion approach (cf. ABGM_720s_* vs. PIPE_720s_*) or by a neural net trained on inversion output that itself explicitly accounted for ff = 1.0 (cf. SYNTHIA_720s_* vs. PIPE_720s_*).

Post-Facto Quick Fix
Using the assumption that plage is statistically dominated by a radially directed field and the fill fraction ff is not explicitly set or is otherwise already multiplied-through in the equations below, then we use the most probable image-plane inclination P(|γ i |) = θ , θ being the observing angle (Svalgaard, Duvall, and Scherrer, 1978;Wang and Sheeley, 1992;Leka, Barnes, and Wagner, 2017), as the basis for evaluation and, now, for correction. This "quickfix" approach is straightforward (see also Rudenko and Dmitrienko, 2018), but should be Figure 19 Gini coefficient of the distribution of |γ i | as f (μ), where the accumulated six maps were combined and then points binned according to μ. Gini coefficients within ±0.05 should show minimal bias.

Figure 20
Boxes: the mean absolute separation of the most-probable values for the SDO/HMI target patches and inversion options as indicated (x-axis labels are introduced in Tables 2, 3, plus the hmi.S_5760s-series integration option), computed for their disk transit; error bars: the standard deviation of the signed separation, which will be larger if it changes sign, for example. Note that the sunspots are unipolar, which will influence the results.
implemented when concerned with the large-scale averages or contributions across wellmeasured (sufficient polarization) pixels. The resulting values are not claimed to be "correct" any more than any inversion results are, just less influenced by the bias. The approach is as follows for under-resolved plage areas: i) Identify plage structures sampled across a range of observing angles μ = cos(θ ) (with full-disk data or with a temporal sequence of statistically consistent structures. ii) Examine the distributions of E(|γ i |) as f (μ = cos(θ )) and determine a simple functional form for the systematic angle difference |γ i | = E(|γ i |, μ) − μ in appropriate units. iii) Assuming that B should not change, and working initially with the absolute inclination |γ i | and assigning a new inclination |γ i | = |γ i | − |γ i |, we find  (2) where the γ i is the functional form of the deviation of the most probable image-frame inclination from the expected inclination as a function of observing angle, and each pixel is thus "corrected" according to field strength, inclination angle, this function, and the observing location. This formulation can of course be presented in a number of ways, but the important aspects are that i) if |γ i | → 0 that |B| is recovered, and ii) the results do not become infinite for, e.g., |γ i | ≈ 0 (if it performs badly near |γ i | ≈ π/2. that will statistically be less of an issue). Once the total field strength |B| is adjusted in this way, we work backwards for all points to find their new inclination angle and new components: where note that in the last two steps, we now use and recover the full [0, π] range of γ i . Again, B remains the same.
As a demonstration, we find the coefficients of a second-order polynomial fit for |γ i | to μ from full-disk SDO/HMI data for April 2010, June 2010, and August 2010. A thirdorder fit the sampled regions well, but extrapolated poorly for disk-center corrections. We then apply the above to the Plage 3 time-series extraction (Figures 3, 9). A first try resulted in an over-correction, diagnosed using a most-probable |γ i |(μ) plot similar to Figure 13, but a simple 25% reduction in the coefficient magnitudes produced a narrow distribution centered well on the x = y line. The disambiguation was then performed and the same diagnostics are used to evaluate this post-facto approach. The relevant metrics, cf. the results for PIPE_720s_Pl3 in Figures 20 -22, are listed in the caption for Figure 23 where we show the new time-series plots to compare to Figure 9.

Concluding Remarks
The interpretation of spectropolarimetric signals in terms of the magnitude, orientation, behavior, and location of magnetic fields in the solar atmosphere is simply not straightforward. How to approach the problem depends severely on the scientific question being posed. With regards to interpreting the large-scale vector field of under-resolved features, it is now wellestablished that how the spectropolarimetric signals are handled can contribute to a systematic bias in the component perpendicular to the line-of-sight: B ⊥ .
Previous works on this topic have illustrated that the problem exists and touched on some compute-intensive approaches to mitigation. In this work, we perform four novel investigations and point to potentially less-arduous mitigation approaches.
First, we develop quantitative metrics to measure the B ⊥ bias. At least one, the Ginicoefficient of the departure from expected image-plane inclination angles in plage features, is applicable without performing the 180 • disambiguation, given sufficient sampling of appropriate structures across observing angle. These quantitative metrics are used to point out that the bias in SDO/HMI data is not limited to the unresolved plage features; some bias is seen in the strong-field unity-fill-fraction pixels of sunspots. The quantitative metrics are then available to evaluate and compare data.
Second, we systematically compared the results of different inversion implementation options and targets (different spectral lines used), evaluating the results using common observational targets and the quantitative metrics developed above. We find that the most impactful implementation choice is to include ff as an independent parameter in the optimization, or having trained a neural net on such data. Applying inversions that explicitly fit ff mitigates bias in plage and in strong-field pixel-filled sunspot centers. That being said, which optimization scheme is used, the weighting used to calculate χ 2 , and the use of multiple spectral lines also impact the outcomes. We also tested a direct comparison of two very different data sources (Hinode/SP vs. SDO/HMI for NOAA AR 12457), using both pipeline and custom inversions, and found that data source per se did not account for the bias.
Third, we construct a simple "toy" model that is appropriate to test certain observed features of the B ⊥ bias. Experiments were performed that added bias with different functional forms as well as varying the level of photon noise in the data. We show that noise is not the primary contributor to the of the B ⊥ bias. We could not, however, reconstruct a distinct non-linear aspect visible in the SDO/HMI |γ i |(μ) behavior. The source and contribution function to the bias are more complex than our experiments. Further refinement to the models to determine the source of this non-linear shape is beyond the scope of this article.
Fourth and finally, we demonstrate a straightforward "quick fix" that can be applied for analysis of global plage structures on a statistical basis and can be applied prior to performing the disambiguation step. This post-facto algorithm (see also Rudenko and Dmitrienko, 2018), should not take the place of a more robust inversion, and should probably not be used to interpret any individual pixel's results quantitatively. However, it can be used to produce vector-field data for which the bias is adequately removed so as to produce more reasonable global B φ , B θ , B r maps.
We show that there are in fact viable options for more robust full-disk inversions of SDO/HMI data: UNNOFit (Bommier et al., 2007), the new VFISV_ABGM (Griñón-Marín et al., 2021), and SyntHIA (Higgins et al., 2022). The first two are more traditional implementations of Milne-Eddington codes that explicitly include fill fraction in the optimization, the latter is a neural-net trained on Hinode/SP Level-2.0 output and SDO/HMI hmi.s_720s [I , Q, U , V ] input. Given the vast differential in computing resources required, the latter may be a more readily available solution, especially for large datasets.
to the corresponding author. Data from VFISV_ABGM are in series su_abgm.ME_720s_fd10_forKD2 available through jsoc2.stanford.edu/ajax/lookdata.html, which requires interested parties to contact the SDO/HMI team for access. Data from SyntHIA can be made available upon reasonable request, see also Higgins et al. (2022) and sources therein; K.D. Leka, E.L. Wagner, and R.E.L. Higgins are actively involved in a project to establish the SyntHIA as a publicly accessible series through JSOC. The UNNOFit inversion code is available at lesia.obspm.fr/perso/veronique-bommier.

Declarations
Compliance with Ethical Standards The authors certify that there are no conflicting interests to disclose. The authors have no relevant financial or non-financial interests to disclose; all funding sources are acknowledged. K. D. Leka is on this journal's Advisory Board, which is not an editorial position, hence there is no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.