1 Introduction

Decay constants of the octet of light pseudoscalar mesons have a deep connection to the spontaneous breaking of chiral symmetry. Within the standard framework of \(SU(3)_L\times SU(3)_R\) chiral perturbation theory (\(\chi \)PT) [1], the effective theory of quantum chromodynamics at low energies, the decay constants are directly connected to the renormalization and diagonalization of the kinetic part of the Lagrangian.

Starting from the effective generating functional

$$\begin{aligned} e^{i Z_{{\rm eff}}[v,a,s,p\,]}\ &=\ \int {\mathcal {D}}U\ e^{i\int {\rm d}^4x\ {\mathcal {L}}_{{\rm eff}}[U,v,a,s,p\,]}, \end{aligned}$$
(1)
$$\begin{aligned} U(x)\ &=\ {\rm exp}{\left( \frac{i}{F_0}\,\phi ^a(x)\lambda ^a \right) }, \end{aligned}$$
(2)

where \(\phi ^a(x)\) are the pseudo-Goldstone boson fields collected in a matrix field U(x), one can obtain the connected n-point Green functions \(G^{a_1 \dots a_n}_{P_1 \dots P_n}(p_1,\dots p_n)\) as on-shell residues of the Fourier transformed functional derivatives of \(Z_{{\rm eff}}[v,a,s,p\,]\) with respect to the axial vector sources \(a_i\). \(P_1 \dots P_n\) are the pseudo-Goldstone bosons in the in- and out-states with momenta \(p_1,\dots p_n\). One then finds the following relation between the Green functions and the elements of the scattering matrix \(A_{P_1 \dots P_n}(p_1,\dots p_n)\) [2]

$$\begin{aligned} G^{a_1 \dots a_n}_{P_1 \dots P_n}(p_1,\dots p_n) = F^{a_1}_{P_1} \dots F^{a_n}_{P_n}A_{P_1 \dots P_n}(p_1,\dots p_n). \end{aligned}$$
(3)

\(F^{a_i}_{P_i}\) are the generalized decay constants which might in general include mixing terms. They correspond to the renormalization of the external legs of Feynman diagrams. Due to the mixing, it is necessary to distinguish between the on-shell particles in the in- and out-states \(P_1 \dots P_n\) and the fields \(\phi _{a_i}\), coupled to the axial-vector currents.

The leading order (LO) effective Lagrangian has the form

$$\begin{aligned} {\mathcal {L}}_{{\rm eff}}^{(2)}\ =\ \frac{F^2_0}{4} {\rm Tr}[D_{\mu }U D^{\mu }U^+ + (U^+ \chi + \chi ^+ U)]. \end{aligned}$$
(4)

Here

$$\begin{aligned} \chi =2B_0\,{\rm diag}(m_u,m_d,m_s), \end{aligned}$$
(5)

in the case when the scalar external sources are taken to be the quark masses. There is no mixing in the kinetic part of the effective Lagrangian at leading order, and thus it is straightforward to see that all the decay constants are equal to the low-energy constant \(F_0\), the fundamental order parameter of the broken chiral symmetry.

At the next-to-leading order (NLO), taking here the part containing the low-energy coupling constants (LECs) \(L_1\dots L_{10}\)

$$\begin{aligned} {\mathcal {L}}_{{\rm eff}}^{(4)}(L_1\dots L_{10})&= L_1\,{\rm Tr}[D_{\mu }U^+ D^{\mu }U]^2\\&\quad + L_2\,{\rm Tr}[D_{\mu }U^+ D_{\nu }U]\,{\rm Tr}[D^{\mu }U^+ D^{\nu }U]\\&\quad + L_3\,{\rm Tr}[D_{\mu }U^+ D^{\mu }U D_{\nu }U^+ D^{\nu }U]\\&\quad + L_4\,{\rm Tr}[D_{\mu }U^+ D^{\mu }U]\,{\rm Tr}[\chi ^+ U + \chi \,U^+]\\&\quad + L_5\,{\rm Tr}[D_{\mu }U^+ D^{\mu }U(\chi ^+ U + U^+\chi )]\\&\quad + L_6\,{\rm Tr}[\chi ^+ U + \chi \,U^+]^2\\&\quad + L_7\,{\rm Tr}[\chi ^+ U - \chi \,U^+]^2\\&\quad + L_8\,{\rm Tr}[\chi ^+ U\chi ^+ U + \chi \,U^+\chi \,U^+]\\&\quad + iL_9\,{\rm Tr}[F^{\mu \nu }_R D_{\mu }U D_{\nu }U^+ + F^{\mu \nu }_L D_{\mu }U^+ D_{\nu }U]\\&\quad + L_{10}\,{\rm Tr}[U^+ F^{\mu \nu }_R U F^L_{\mu \nu }], \end{aligned}$$
(6)

\(\pi ^0\)–\(\eta \) mixing occurs in the kinetic part of the effective action as an isospin breaking effect, proportional to the difference of the light quark masses \(m_d-m_u\). If the \(\pi ^0\)–\(\eta \) mixing is neglected, the only two terms contributing to the renormalization of the kinetic part are the ones proportional to the LECs \(L_4\) and \(L_5\).

As can be seen, the sector of decay constants in the isospin limit can be viewed as the simplest self-contained subsystem of the theory, only involving two LECs at the leading order (\(F_0\), \(B_0\)) and two at the next-to-leading order (\(L_4\), \(L_5\)). Quite intriguingly, none of these four constants is known with very high certainty. As will be shown in more detail in what follows, at leading order a significant suppression of the order parameters, compared to the two-flavour values, is still possible or even probable, given the recent results from phenomenology [3] and lattice QCD [4]. At next-to-leading order, the constant \(L_4^r\) is expected to be small due to its suppression in the limit of a large number of colours [5], but whether this is indeed the case is still unknown [3]. Depending on the above, the value of \(L_5^r\) can also vary widely [3, 6].

The motivation of our work is to investigate this segment of \(\chi \)PT by Bayesian statistical methods, the question being how much can be learned about the low-energy coupling constants just by restricting ourselves to this sector. We do not neglect the higher orders but treat them as a source of statistical uncertainty, thus avoiding the large number of LECs appearing at next-to-next-to-leading order (NNLO) [7]. The framework of ‘resummed’ \(\chi \)PT [8] is very well suited for such an approach.

Naturally, such a task might have been accomplished long ago, if the values of all the decay constants were known with sufficient precision. While that has indeed been the case for the pions and the kaons, the value of the decay constant of the \(\eta \) meson was considered to be very uncertain due to its strong mixing with \(\eta '\). In fact, in \(\chi \)PT calculations the \(\eta \) decay constant has usually been treated using its chiral expansion, not as an independent observable [2, 9]. Our crucial input is thus a recent calculation of the \(\eta \)–\(\eta '\) sector in lattice QCD by the RQCD collaboration [10], which allows us to derive the SU(3) decay constant \(F_\eta \) with some confidence.

This work is a continuation of our initial inquiries [11, 12], with the major new ingredients being the updated input for \(F_\eta \) from lattice QCD and Bayesian statistical analysis, implemented in a numerical way. It can also be noted that the \(\eta \) decay constant has not been used as input for the purpose of extraction of the low-energy parameters of the SU(3) \(\chi \)PT until now.

The paper is organized in the following way—Sect. 2 provides a concise summary of our theoretical framework, while Sect. 3 introduces the sector of decay constants and connected phenomenology in a more detailed way. Our implementation of the Bayesian statistical analysis is outlined in Sect. 4, while the employed assumptions are discussed in Sect. 5. Section 6 then presents the results of the paper, which are subsequently summarized in Sect. 7.

2 Resummed \(\chi \)PT

We use an approach to chiral perturbation theory, dubbed ‘resummed’ \(\chi \)PT [2, 8, 13], which was proposed as a way to accommodate the possibility of an irregular convergence of the chiral expansion. Such a scenario might occur if some of the leading order LECs (\(F_0\) or \(B_0\)) were suppressed to a sufficient degree, so that the leading order was not dominant in the chiral expansion. In such a case, the chiral series should be handled carefully, as unexpectedly large higher orders might result from reordering of the expansion. In our case, we will assume a large range of possible values of the leading order constants, so various scenarios are naturally possible.

Let us summarize the procedure in a few points:

  • We use the standard \(\chi \)PT Lagrangian (4, 6), based on the usual power counting \(m_{q}\sim O(p^{2})\) [14].

  • Expansions of quantities related linearly to Green functions of QCD currents are trusted (“safe observables”). We assume that for these expansions the NNLO and higher-order terms are reasonably small, though not necessarily negligible. Leading order terms are not required to be dominant.

  • The expansions are expressed explicitly to next-to-leading order; all higher-order contributions are summed into higher-order remainders. Thus for an observable A the ‘resummed’ chiral expansion has the form

    $$\begin{aligned} A = A^{(LO)}+A^{(NLO)}+A\,\delta _A, \qquad \delta _A \ll 1 \end{aligned}$$
    (7)
  • These higher-order remainders will not be neglected, but estimated and treated as sources of error. In general, they might have a non-trivial analytical structure, though this is not the case for the decay constants. All higher-order LECs are effectively contained in the remainders, the large number of NNLO constants is thus traded off for a relatively smaller number of remainders.
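
As a short worked step (added here for illustration, not taken from the cited references), writing the relative remainder in (7) as \(\delta _A\) and noting that it multiplies the full observable, the relation can be solved for A:

$$\begin{aligned} A = \frac{A^{(LO)}+A^{(NLO)}}{1-\delta _A}. \end{aligned}$$

A remainder of \(\delta _A = 0.1\) thus rescales the sum of the first two orders by about 11%.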

3 Decay constants

Decay constants of the light pseudoscalar meson nonet, consisting of the pions, kaons, \(\eta \) and \(\eta '\), can be introduced in terms of the QCD axial-vector currents

$$\begin{aligned} i p_{\mu } F_P^a = \langle \,0\,|\,A_{\mu }^a(0)\,|\,P,p\,\rangle , \end{aligned}$$
(8)

where \(A_{\mu }^a={\bar{q}}\gamma _\mu \gamma _5 \lambda ^a q\). The pion and kaon decay constants take a straightforward form in the isospin limit and their values are very well established from either experimental data or lattice QCD calculations [4, 15]. In contrast, the \(\eta \) and \(\eta '\) decay constants were not very well known until quite recently, due to significant mixing. A lot of theoretical and phenomenological work has thus been devoted to the \(\eta \)–\(\eta '\) sector, see, e.g., [16,17,18,19,20,21,22,23,24]. This list, far from exhaustive, includes investigations of the sector in the \(U(3)_L\times U(3)_R\) large \(N_c\) framework as well as phenomenological studies, aiming to extract the values of the decay constants and related mixing angles from experimental inputs. The results of the phenomenological studies span quite a range of values, some of which are not compatible with others (see [10] for a detailed overview).

Masses of the \(\eta \) and \(\eta '\) mesons in a scheme with a single mixing angle were obtained in lattice QCD simulations around a decade ago [25,26,27]. However, until recently, to our knowledge only the ETM collaboration [28] has attempted to calculate the full sector of mixing parameters, which they did in the quark flavour basis. Finally, as already mentioned, a comprehensive study of the sector, which goes down to the physical pion mass, has now been published by the RQCD collaboration [10].

The SU(3) decay constant \(F_\eta \), which is the point of our interest, is defined identically to \(F_\eta ^8\) in (8) and can therefore be related to the mixing parameters in the U(3) octet-singlet basis

$$\begin{aligned} F_\eta = F_{\eta }^8 = F_8\cos \vartheta _8. \end{aligned}$$
(9)

For the purpose of this work, our main input will be the recent lattice QCD determination of \(F_\eta ^8\) by the RQCD collaboration [10]

$$\begin{aligned} F_\eta ^8 = (1.123 \pm 0.035)F_\pi \quad {\rm (RQCD21)}. \end{aligned}$$
(10)

For comparison, we will also use two model dependent results from phenomenology

$$\begin{aligned} F_\eta ^8&= (1.18 \pm 0.02)F_\pi \quad {\rm (EGMS15)} \end{aligned}$$
(11)
$$\begin{aligned} F_\eta ^8&= (1.38 \pm 0.05)F_\pi \quad {\rm (EF05).} \end{aligned}$$
(12)

Here, EGMS15 [22] is a more recent determination which is representative of lower values of this observable, better compatible with (10). On the other hand, EF05 [19] lies on the opposite end of the spectrum and is an example of a very high value of \(F_\eta ^8\). Reported uncertainties are quite low in both cases and thus these results are essentially incompatible with each other.
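
To make the statement about (in)compatibility more tangible, the following minimal sketch computes the pairwise differences between the three inputs (10)–(12) in units of their combined uncertainty. Treating the quoted errors as independent and Gaussian is an assumption made here purely for illustration; the function name `pull` is likewise hypothetical.

```python
import math

# Values of F_eta^8 / F_pi quoted in (10)-(12): (central value, uncertainty).
INPUTS = {
    "RQCD21": (1.123, 0.035),
    "EGMS15": (1.18, 0.02),
    "EF05": (1.38, 0.05),
}


def pull(a, b):
    """Difference of two determinations in units of the combined uncertainty.

    Assumes independent Gaussian errors (an illustrative simplification).
    """
    (va, sa), (vb, sb) = INPUTS[a], INPUTS[b]
    return abs(va - vb) / math.hypot(sa, sb)


for a, b in [("RQCD21", "EGMS15"), ("RQCD21", "EF05"), ("EGMS15", "EF05")]:
    print(f"{a} vs {b}: {pull(a, b):.1f} sigma")
```

Under these assumptions EGMS15 lies within about one and a half standard deviations of RQCD21, while EF05 is separated from both by more than three.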

In the framework of \(SU(3)_L\times SU(3)_R\) chiral perturbation theory [1], the chiral expansion of the decay constants of the pseudoscalar meson octet in the isospin limit can be written in the following way [11]

$$\begin{aligned} F_{\pi }^2&= F_0^2( 1-4\mu _{\pi }-2\mu _K) \nonumber \\&\quad + \, 8 m_\pi ^2 \left( L_4^r(r+2)+L_5^r \right) + F_{\pi }^2\delta _{F_{\pi }} \end{aligned}$$
(13)
$$\begin{aligned} F_K^2&= F_0^2\left( 1-\frac{3}{2}\mu _{\pi }-3\mu _K-\frac{3}{2}\mu _{\eta }\right) \nonumber \\ &\quad\, +\, 8 m_\pi ^2 \left( L_4^r(r+2)+\frac{1}{2}L_5^r(r+1)\right) +F_K^2\delta _{F_K} \end{aligned}$$
(14)
$$\begin{aligned} F_{\eta }^2&= F_0^2(1-6\mu _K)\,\nonumber \\&\quad\, +\, 8 m_\pi ^2 \left( L_4^r(r+2)+\frac{1}{3}L_5^r(2r+1)\right) +F_{\eta }^2\delta _{F_{\eta }}. \end{aligned}$$
(15)

This form is obtained directly from the generating functional of two-point Green functions, following the logic of the ‘resummed’ approach to \(\chi \)PT [8]. A strict form of the chiral expansion is used, where the original parameters of the Lagrangian are retained, thus avoiding any reordering of the series. The terms \(F_P^2\delta _{F_P}\) are the sums of all higher orders, the higher-order remainders, which are not neglected. These effectively contain all low-energy coupling constants at higher orders. It should also be noted that in this case the remainders are real constants with no analytical structure and no scale dependence.

Chiral logarithms are denoted as

$$\begin{aligned} \mu _P = \frac{m_{{\rm P}}^2}{32\pi ^2F_0^2} \ln \left( \frac{m_{{\rm P}}^2}{\mu ^2}\right) , \end{aligned}$$
(16)

where \(m_{{\rm P}}\) are the pseudoscalar masses at leading order. In particular

$$\begin{aligned} m_\pi ^2&= 2B_0{\hat{m}},\nonumber \\ m_K^2&= B_0{\hat{m}}(r+1),\nonumber \\ m_\eta ^2&= \frac{2}{3}B_0{\hat{m}}(2r+1), \end{aligned}$$
(17)

with

$$\begin{aligned} {\hat{m}} = \frac{m_u+m_d}{2},\quad r = \frac{m_s}{{\hat{m}}}. \end{aligned}$$
(18)

As can be seen from (13–15), the chiral expansions of the decay constants up to next-to-leading order do indeed depend only on the two leading-order and two next-to-leading order LECs: \(F_0\), \(B_0\) and \(L_4^r\), \(L_5^r\), respectively. The sector of decay constants can thus be considered as a simple, self-contained system, which can be investigated on its own.
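
The expansions (13)–(15), together with the definitions (16)–(18), can be evaluated numerically as in the following minimal sketch. The function name, argument names and the choice to pass the product \(B_0{\hat{m}}\) as a single input are assumptions made for illustration; since the remainder terms multiply \(F_P^2\) itself, each equation is solved for \(F_P^2\) by dividing by \(1-\delta _{F_P}\).

```python
import math


def decay_constants_sq(F0_sq, B0_mhat, r, L4, L5, mu=770.0,
                       d_pi=0.0, d_K=0.0, d_eta=0.0):
    """Evaluate (13)-(15); returns (F_pi^2, F_K^2, F_eta^2) in MeV^2.

    F0_sq   : F_0^2 in MeV^2
    B0_mhat : B_0 * m_hat in MeV^2, so that m_pi^2 = 2 * B0_mhat (Eq. (17))
    r       : quark mass ratio m_s / m_hat (Eq. (18))
    L4, L5  : renormalized LECs L_4^r, L_5^r at the scale mu (MeV)
    d_*     : higher-order remainders delta_{F_P}
    """
    # Leading-order pseudoscalar masses squared, Eq. (17)
    m2 = {
        "pi": 2.0 * B0_mhat,
        "K": B0_mhat * (r + 1.0),
        "eta": 2.0 / 3.0 * B0_mhat * (2.0 * r + 1.0),
    }
    # Chiral logarithms, Eq. (16)
    mu_P = {P: m2[P] / (32.0 * math.pi ** 2 * F0_sq) * math.log(m2[P] / mu ** 2)
            for P in m2}
    pref = 8.0 * m2["pi"]

    # LO + NLO parts of the right-hand sides of (13)-(15)
    rhs_pi = (F0_sq * (1.0 - 4.0 * mu_P["pi"] - 2.0 * mu_P["K"])
              + pref * (L4 * (r + 2.0) + L5))
    rhs_K = (F0_sq * (1.0 - 1.5 * mu_P["pi"] - 3.0 * mu_P["K"] - 1.5 * mu_P["eta"])
             + pref * (L4 * (r + 2.0) + 0.5 * L5 * (r + 1.0)))
    rhs_eta = (F0_sq * (1.0 - 6.0 * mu_P["K"])
               + pref * (L4 * (r + 2.0) + L5 * (2.0 * r + 1.0) / 3.0))

    # F_P^2 = rhs + F_P^2 * delta  =>  F_P^2 = rhs / (1 - delta)
    return rhs_pi / (1.0 - d_pi), rhs_K / (1.0 - d_K), rhs_eta / (1.0 - d_eta)
```

With vanishing remainders, the combination \(3F_\eta ^2-4F_K^2+F_\pi ^2\) returned by this function does not change when \(L_4^r\) and \(L_5^r\) are varied, since the NLO terms cancel algebraically in it.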

For convenience, we introduce a reparametrization of the chiral order parameters \(F_0\) and \(B_0\)

$$\begin{aligned} \begin{aligned} X&= \frac{2\,{\hat{m}}F_0^2 B_0}{F_{\pi }^2M_{\pi }^2}\equiv \frac{2\,{\hat{m}}\Sigma _0}{F_{\pi }^2M_{\pi }^2},\\ Z&= \frac{F_0^2}{F_\pi ^2},\\ Y&= \frac{X}{Z} = \frac{2B_0{\hat{m}}}{M_\pi ^2} = \frac{m_\pi ^2}{M_\pi ^2}, \end{aligned} \end{aligned}$$
(19)

where \(\Sigma _0\) is the three-flavour chiral condensate and \(M_\pi \) is the physical pion mass. Such a reparametrization is convenient as the parameters X and Z are restricted to the range (0, 1). Furthermore, the so-called paramagnetic inequality [29] bounds them from above by the corresponding two-flavour quantities:

$$\begin{aligned} \begin{aligned} Z&\equiv Z(3)< Z(2),\\ X&\equiv X(3) < X(2), \end{aligned} \end{aligned}$$
(20)

where the two-flavour parameters are defined analogously to the three-flavour ones in (19).

The standard approach to the chiral perturbation series usually assumes values of X and Z reasonably close to one, with the leading order dominating the expansion. On the other hand, \(Z=0\) would correspond to a restoration of chiral symmetry, while \(X=0\) would correspond to a scenario with a vanishing chiral condensate, which also implies \(Y=0\).

The most recent NNLO standard \(\chi \)PT fit [3] provides two different sets for the NLO LECs. It is based on a large number of inputs, including \(\pi K\) and \(\pi \pi \) scattering lengths, \(K_{l4}\) form factors and pion scalar and vector form factors. It also uses the ratio \(F_K/F_\pi \) (but not \(F_\eta \)). Overall, it uses 16 input observables to fit 8+34 NLO and NNLO parameters. The main fit (BE14) fixes \(L_4^r\) by hand, in order to ensure the expected suppression in the large \(N_c\) limit [5]. FF14 (free fit) releases this constraint. Their results for \(L_4^r\) and \(L_5^r\) are (at \(\mu =770\) MeV):

$$\begin{aligned} \begin{aligned} 10^3L_4^r&\equiv 0.3,\\ 10^3L_5^r&= 1.01\pm 0.06 \qquad {\rm (BE14)}, \end{aligned} \end{aligned}$$
(21)

and

$$\begin{aligned} \begin{aligned} 10^3L_4^r&= 0.76\pm 0.18,\\ 10^3L_5^r&= 0.50\pm 0.07 \qquad {\rm (FF14).} \end{aligned} \end{aligned}$$
(22)

As can be seen, the obtained values are quite different. The difference is less pronounced for the LO LECs:

$$\begin{aligned} \begin{aligned} F_0&= 71\ {\rm MeV},\\ Y&= m_\pi ^2/M_\pi ^2 = 1.055 \qquad {\rm (BE14)} \end{aligned} \end{aligned}$$
(23)

and

$$\begin{aligned} \begin{aligned} F_0&= 64\ {\rm MeV},\\ Y&= m_\pi ^2/M_\pi ^2 = 0.937 \qquad {\rm (FF14).} \end{aligned} \end{aligned}$$
(24)

For comparison, quite different values were obtained in [30] by constructing SU(3) amplitudes in a large \(N_c\) framework. \(F_0\) was found to be very large (\(88.1\pm 4.1\) MeV) and \(L_4^r\) compatible with zero (\((-0.05\pm 0.22)\times 10^{-3}\)).

The Flavour Lattice Averaging Group [4, 6] cites several lattice QCD determinations of \(L_4^r\) and \(L_5^r\). The last report [4] highlights the results by HPQCD [31]:

$$\begin{aligned} \begin{aligned} 10^3L_4^r&= 0.09\pm 0.34,\\ 10^3L_5^r&= 1.19\pm 0.25 \qquad {\rm (HPQCD\ 13A)} \end{aligned} \end{aligned}$$
(25)

and MILC [32]

$$\begin{aligned} \begin{aligned} 10^3L_4^r&= -0.02\pm 0.56,\\ 10^3L_5^r&= 0.95\pm 0.41 \qquad {\rm (MILC\ 10).} \end{aligned} \end{aligned}$$
(26)

The leading-order LECs have also been recently calculated on the lattice by the \(\chi \)QCD collaboration [33]. Though the results have not been fully published yet, the work has been cited by the Flavour Lattice Averaging Group [4] with a favourable rating. While there are several other older determinations, for example by the MILC collaboration [32, 34, 35] or based on RBC/UKQCD [36], \(\chi \)QCD provides the first highly-rated calculation of these parameters in more than a decade, as far as we are aware. The results were quoted by FLAG in the following form:

$$\begin{aligned} \begin{aligned}&F_0 = 67.8(1.2)(3.2)\ {\rm MeV},\\&\Sigma _0^{1/3} = (F_0^2 B_0)^{1/3} = 232.6(0.9)(2.7)\ {\rm MeV}. \end{aligned} \end{aligned}$$
(27)

We will use these values as alternative inputs for the leading-order LECs.
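
For orientation, the central values of \(F_0\) quoted in (23), (24) and (27) can be translated into the parameter Z of (19) with a few lines of arithmetic. The sketch below uses the physical pion decay constant \(F_\pi = 92.32\) MeV, quoted later in (42); uncertainties are ignored and the dictionary labels are only illustrative.

```python
F_PI = 92.32  # MeV, physical pion decay constant, Eq. (42)

# Central values of F_0 in MeV from (23), (24) and (27)
F0_VALUES = {"BE14": 71.0, "FF14": 64.0, "chiQCD": 67.8}


def Z_from_F0(F0):
    """Z = F_0^2 / F_pi^2, Eq. (19)."""
    return (F0 / F_PI) ** 2


for label, F0 in F0_VALUES.items():
    print(f"{label}: F0 = {F0} MeV -> Z = {Z_from_F0(F0):.2f}")
```

All three central values give Z between roughly 0.5 and 0.6, well below one.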

The purpose of this work is twofold. First, we will show that the ‘resummed’ \(\chi \)PT framework leads to a simple, but robust prediction for \(F_\eta \). Then we will use the values of \(F_\eta ^8\) (10–12) as an input and use Bayesian statistical inference to obtain constraints on the higher order remainders \(\delta _{F_K},\delta _{F_{\eta }}\) and the NLO LECs \(L_4^r\) and \(L_5^r\). We will compare these results with the two versions of the fit [3] (BE14 and FF14) and lattice QCD values (HPQCD 13A [31] and MILC 10 [32]) and thus check the compatibility of the various values of \(F_\eta \) and NLO LECs.

4 Bayesian statistical analysis

We use a statistical approach based on Bayes’ theorem [8, 37]

$$\begin{aligned} P(X_i|{\rm data}) = \frac{P({\rm data}|X_i)P(X_i)}{\int {\rm d}X_i\,P({\rm data}|X_i)P(X_i)}, \end{aligned}$$
(28)

where \(P(X_i|{\rm data})\) is the probability density function (PDF) of an explored set of theoretical parameters \(X_i\) having a specific value given some experimental data.

In the case of independent experimental inputs, \(P({\rm data}|X_i)\) is the known probability density of obtaining the observed values of the observables \(O_k\) in a set of experiments with uncertainties \(\sigma _k\) under the assumption that the true values of \(X_i\) are known, typically given as a normal distribution

$$\begin{aligned} P({\rm data}|X_i) = \prod _k \frac{1}{\sigma _k\sqrt{2\pi }}\, {\rm exp}\left[ -\frac{\left( O_k^{{\rm exp}}-O_k^{{\rm th}}(X_i)\right) ^2}{2\sigma _k^2}\right] . \end{aligned}$$
(29)

\(P(X_i)\) in (28) are prior probability distributions of \(X_i\). We use them to implement theoretical assumptions, available experimental information and uncertainties connected with our parameters.

Traditionally, the prior has been understood as a degree of subjective belief. However, in our view, one does not necessarily need to ‘believe’ in the validity of the prior in a scientific context. We think the prior can be more appropriately interpreted as the quantification of available information and, beyond that, of the assumptions entering the analysis. Naturally, predictions might depend on the assumptions used. The Bayesian formalism allows us to straightforwardly implement a variety of assumptions and explore their consequences, which we consider to be an important feature of this approach.

In our case, we have three observables in the form of the three decay constants \(F_\pi \), \(F_K\), \(F_\eta \). We will consider the ratios of quark masses as known and thus we are left with the following free theoretical parameters:

  • leading order: Z, Y

  • next-to-leading order: \(L_4^r\), \(L_5^r\)

  • higher orders: \(\delta _{F_\pi }\), \(\delta _{F_K}\), \(\delta _{F_\eta }\).

As discussed in the next section, we will use several assumptions about these parameters, which will determine the prior distributions.

Our implementation of the Bayesian statistical analysis is numerical. It consists of two steps—first we numerically generate a large ensemble of theoretical predictions \(O^{{\rm th}}_k(X_i)\) for the decay constants, depending on the free parameters, and then we calculate the probability density functions (28), effectively using Monte Carlo integration.
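
The two-step procedure can be illustrated by a generic likelihood-weighting sketch for a single parameter and a single observable. This is a toy version written under simplifying assumptions, not the code used for the analysis; the names `gaussian_likelihood` and `posterior_moments` are hypothetical. Parameter values are drawn from the prior, each is weighted by the Gaussian likelihood (29), and the sum of weights plays the role of the normalization integral in (28).

```python
import math
import random


def gaussian_likelihood(obs, sigma, pred):
    """Single-observable factor of the normal likelihood (29)."""
    z = (obs - pred) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def posterior_moments(sample_prior, predict, obs, sigma, n, rng):
    """Posterior mean and standard deviation of a scalar parameter.

    sample_prior(rng) draws one parameter value from the prior,
    predict(x) returns the theoretical prediction for the observable.
    """
    sw = swx = swx2 = 0.0
    for _ in range(n):
        x = sample_prior(rng)                            # step 1: prior ensemble
        w = gaussian_likelihood(obs, sigma, predict(x))  # step 2: likelihood weight
        sw += w
        swx += w * x
        swx2 += w * x * x
    mean = swx / sw
    var = max(swx2 / sw - mean * mean, 0.0)
    return mean, math.sqrt(var)


# Toy check: standard normal prior, identity prediction, observation 1 +- 1.
# The exact posterior is then normal with mean 0.5 and variance 0.5.
rng = random.Random(0)
mean, std = posterior_moments(lambda g: g.gauss(0.0, 1.0), lambda x: x,
                              obs=1.0, sigma=1.0, n=100_000, rng=rng)
print(f"posterior: {mean:.2f} +- {std:.2f}")
```

The toy case is chosen because its posterior is known in closed form, which gives a simple check on the weighting.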

5 Assumptions

For the LO LECs \(F_0\) and \(B_0\), we use theoretical constraints similar to those in [37], which define our priors for these parameters. Their approximate range then is

$$\begin{aligned}&0< Y < Y_{{\rm max}} \simeq 2.5, \end{aligned}$$
(30)
$$\begin{aligned}&0< Z < Z(2) = 0.86 \pm 0.01, \end{aligned}$$
(31)
$$\begin{aligned}&0< X < X(2) = 0.89 \pm 0.01, \end{aligned}$$
(32)

where the explicit form of \(Y_{{\rm max}}\), derived in [8], is

$$\begin{aligned} Y_{{\rm max}} = \frac{8F_K^2 M_K^2 \left( \delta _{M_K}-1\right) - 2F_\pi ^2 M_\pi ^2 (r+1)^2 \left( \delta _{M_\pi }-1\right) }{M_\pi ^2 (r+1)\left( 2F_K^2\left( \delta _{F_K}-1\right) - F_\pi ^2 (r+1)\left( \delta _{F_\pi }-1\right) \right) }. \end{aligned}$$
(33)

Here \(\delta _{M_\pi }\) and \(\delta _{M_K}\) are higher order remainders for the chiral expansions of the pseudoscalar masses, which we treat analogously to the remainders of the decay constants [see (40) below].

In order to calculate the prior distributions, we consider X and Z as the primary variables:

$$\begin{aligned} \begin{aligned}&P(Y,Z|{\rm data})~{\rm d}Y{\rm d}Z \\&\quad \equiv P(X,Z|{\rm data})|_{X\rightarrow ZY}~{\rm d}Y Z{\rm d}Z \\&\quad = P(X,Z|{\rm data})~{\rm d}X {\rm d}Z. \end{aligned} \end{aligned}$$
(34)

One might naturally ask why Y and Z are not used as the primary variables, as these are the free parameters in our case. That would mean using uniform distributions in the range \(0< Y < Y_{{\rm max}}\) and \(0< Z < 1\) as the a priori assumption. However, this leads to a probability distribution for \(X=ZY\) which rises quickly towards zero. Quite clearly, a very small chiral condensate is not a reasonable initial expectation. On the other hand, starting with uniform distributions for \(0< X < 1\), \(0< Z < 1\) and adding \(Y < Y_{{\rm max}}\) ensures a relatively flat prior for X and a vanishing distribution for Z at \(Z=0\). We find that reasonable, as it excludes the scenario with unbroken chiral symmetry and thus a world without the pseudo-Goldstone bosons. Then by including the paramagnetic inequality (20), we obtain the set (30–32). These assumptions lead to probability distributions for the priors depicted in Fig. 1.
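
The difference between the two choices of primary variables can be explored with a small rejection-sampling sketch. Several simplifications are assumed here: \(Y_{{\rm max}}\) is fixed to its approximate value 2.5 instead of being computed from (33), and the two-flavour bounds Z(2) and X(2) are fixed to their central values without uncertainties. The function names are hypothetical.

```python
import random

Y_MAX = 2.5  # approximate bound from (30); Eq. (33) is not evaluated here
Z2 = 0.86    # central value of Z(2), Eq. (31)
X2 = 0.89    # central value of X(2), Eq. (32)


def sample_prior_XZ(rng):
    """Uniform X and Z, kept only if the constraints (30)-(32) hold."""
    while True:
        X = rng.uniform(0.0, 1.0)
        Z = rng.uniform(0.0, 1.0)
        # X < Y_MAX * Z is the condition Y = X / Z < Y_MAX without a division
        if X < X2 and Z < Z2 and X < Y_MAX * Z:
            return X, Z, X / Z


def sample_prior_YZ(rng):
    """Alternative discussed in the text: uniform Y and Z, with X = Z * Y."""
    Y = rng.uniform(0.0, Y_MAX)
    Z = rng.uniform(0.0, 1.0)
    return Z * Y, Z, Y


def frac_small_X(samples, threshold=0.01):
    """Fraction of samples with X below the threshold."""
    return sum(1 for X, _, _ in samples if X < threshold) / len(samples)


rng = random.Random(0)
xz = [sample_prior_XZ(rng) for _ in range(50_000)]
yz = [sample_prior_YZ(rng) for _ in range(50_000)]
print("fraction with X < 0.01, uniform (X, Z):", frac_small_X(xz))
print("fraction with X < 0.01, uniform (Y, Z):", frac_small_X(yz))
```

In this simplified setting the uniform (Y, Z) choice places a visibly larger share of the prior at very small X, in line with the argument above.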

Fig. 1 Prior distributions based on (30–34)

For the purpose of obtaining constraints on the NLO LECs \(L_4^r\) and \(L_5^r\), we will use the determination of Y from \(\eta \rightarrow 3\pi \) decays [37] as an additional assumption

$$\begin{aligned} Y\ =\ 1.44 \pm 0.32\qquad \qquad (\eta \rightarrow 3\pi ). \end{aligned}$$
(35)

This value was obtained from a Bayesian analysis of the \(\eta \rightarrow 3\pi \) decay widths of the two decay channels and the Dalitz parameter a in the charged channel. The tendency towards \(Y>1\) is ultimately tied to the very large overall experimental decay rate compared to the simple estimate at leading order, given that the isospin violating parameter R is now known with a fairly good precision from lattice QCD [4]. However, while this value is higher than the results of the fits BE14 and FF14 (23–24), it is compatible with them and does provide us with a reasonable uncertainty range. As can be seen in Fig. 2, this input effectively excludes very low values of Y, which correspond to a significantly suppressed chiral condensate. In other words, such a scenario can be understood to be excluded by the phenomenology of the \(\eta \rightarrow 3\pi \) decays.

As an alternative, we will also use the recent result by the \(\chi \)QCD Collaboration (27), expressed in the form:

$$\begin{aligned} \begin{aligned} Y&= 0.95 \pm 0.10,\\ Z&= 0.54 \pm 0.05 \qquad {(\chi {\rm QCD21})}. \end{aligned} \end{aligned}$$
(36)

In comparison with our main input (35), \(\chi \)QCD21 also excludes high values of Y and fixes Z in a narrow range at a fairly low value. The prior distributions can be seen in Fig. 3.

Fig. 2 Prior distributions based on (30–34, 35)

Fig. 3 Prior distributions based on (30–34, 36) (\(\chi \)QCD21)

The three sets of priors introduced above should be understood as proceeding from more conservative to more restricted. The set (30–34) is based on very general considerations rooted in QCD. The additional inputs (35) or alternatively (36) then supplement an assumption about the value of the leading order LECs. Here, Eq. (35) is the more conservative one, effectively only excluding very low values of Y, based on results from \(\eta \rightarrow 3\pi \) decays. It is compatible with all values quoted in Sect. 3. On the other hand, Eq. (36) uses very specific values from a recent lattice QCD study, which might be in tension with some other determinations. Our main results will be based on the more conservative assumptions of the first two sets of priors, while the third one will be used to explore the consequences of assuming particular values of the chiral order parameters, which are not yet firmly established.

Also, it should be noted that while the PDFs depicted in Figs. 1, 2 and 3 illustrate the form of the priors for the parameters, they are not the actual inputs. The priors are the probabilistic conditions (30–36), from which the PDFs follow, but the PDFs do not capture the full information encoded in the multidimensional conditions (30–36).

Furthermore, while Bayes’ theorem (28) separates the probability distributions \(P({\rm data}|X_i)\), which depend on the theoretical predictions \(O^{{\rm th}}_k(X_i)\), and the priors \(P(X_i)\), it is quite desirable to implement the priors on the level of theoretical predictions, thus effectively incorporating the priors \(P(X_i)\) into \(P({\rm data}|X_i)\), so that one can examine the theoretical predictions including a realistic set of assumptions. Thus we implement the relations (30–34), i.e. our default prior, when numerically generating the theoretical predictions \(O^{{\rm th}}_k(X_i)\).

As for the NLO LECs \(L_4^r\) and \(L_5^r\), where our goal is to extract constraints, we limit them to the range (at \(\mu =770\) MeV)

$$\begin{aligned} 10^{3} L_5^r&\in (0,2), \end{aligned}$$
(37)
$$\begin{aligned} 10^{3} L_4^r&\in (-0.5, 2), \end{aligned}$$
(38)

which we implement as a uniform distribution. The choice of the uniform distribution signifies the lack of preference for any particular value in the allowed range, which we think is appropriate given the range of values for these parameters available in the literature. Hence, we do not use any particular value as our prior and the results are therefore not directly dependent on previous analyses.

The allowed range was chosen so as to cover the values of all recent determinations we are aware of, including NNLO \(\chi \)PT [3] and lattice QCD [4, 6]. The upper bound is high enough to contain the PDFs of our main results. The distribution is cut off in particular cases where a lower bound for \(L_5^r\) is obtained (see later), but in such instances we find it reasonable to stick with a more conservative result.

We find no credible reason to assume \(L_5^r\) smaller than zero. Our basic assumption is that \(L_5^r\) is positive, which is consistent with available determinations, which are all clearly larger than zero.

The situation is more subtle concerning \(L_4^r\). As commented above, this constant is suppressed in the large \(N_c\) limit and some results indeed find its value close to zero or even slightly negative [4, 30]. From the paramagnetic inequalities (20) it follows that there is a critical value of \(L_4^r\) [8, 38]. For the value of r we use [see (41) below] we find:

$$\begin{aligned} {L_4^r}^{({\rm crit})} = -0.50\times 10^{-3}. \end{aligned}$$
(39)

Hence we use a lower bound \(L_4^r>-0.5\times 10^{-3}\).

We estimate the higher-order remainders statistically, based on general arguments about the convergence of the chiral series [8]. Our initial ansatz is

$$\begin{aligned} \delta _{F_P} = 0.0\pm 0.1. \end{aligned}$$
(40)

We implement this by normal distributions, therefore the remainders are limited only statistically, not by any upper bound. However, our initial analysis will provide us with constraints on the higher order remainders obtained from the data for the decay constants. Subsequently, we will reuse these constraints as priors for the determination of NLO LECs \(L_4^r\) and \(L_5^r\), thus effectively shifting the initial ansatz (40).

We use the lattice QCD average [4] for the value of the strange-to-light quark mass ratio r

$$\begin{aligned} r= 27.23 \pm 0.10. \end{aligned}$$
(41)

Finally, the inputs for the pion and kaon decay constants are [15]

$$\begin{aligned} \begin{aligned} F_\pi&= 92.32 \pm 0.09\ {\rm MeV},\\ F_K&= 110.10 \pm 0.21\ {\rm MeV}. \end{aligned} \end{aligned}$$
(42)

We use inputs from PDG [15] for the masses of the particles as well, with the experimental uncertainties being negligible compared to other sources of error.

6 Results

6.1 Prediction for \({F_\eta }\)

We will employ several ways of dealing with the system of Eqs. (13–15). At the first stage, it is possible to eliminate \(F_0\), \(L_4^r\) and \(L_5^r\) by simple algebraic manipulations and thus we obtain a single equation

$$\begin{aligned} F_{\eta }^2&= \frac{1}{3}\left[4F_K^2-F_{\pi }^2 + \frac{M_{\pi }^2 Y}{16\pi ^2} \left( \ln \frac{m_{\pi }^2}{m_K^2}\,+\,(2r+1)\ln \frac{m_{\eta }^2}{m_K^2}\right) \right.\\ &\left.\quad\, +\,3F_\eta ^2\delta _{F_{\eta }} - 4F_K^2\delta _{F_K} + F_\pi ^2\delta _{F_{\pi }}\right].\end{aligned}$$
(43)

The equation depends, beyond the remainders \(\delta _{F_P}\), only on a single parameter Y and the dependence is very weak, as already noted in [8, 11]. A histogram of \(10^6\) numerically generated theoretical predictions is depicted in Fig. 4, where the default assumptions (30–34) (illustrated in Fig. 1) and (40–42) were used. A Gaussian fit leads to a value

$$\begin{aligned} F_\eta = 117.5\pm 9.4\ {\rm MeV} = (1.28\pm 0.10)F_\pi . \end{aligned}$$
(44)

Fig. 4 Theoretical prediction for \(F_\eta \) (\(10^6\) points). Gaussian fit overlaid

This is an improved prediction over [11] and lies in between the values of EGMS15 (11) and EF05 (12), discussed above, while still being compatible with RQCD21 (10).
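
The weak dependence of (43) on Y can be checked with a direct evaluation. The sketch below solves (43) for \(F_\eta \), using the inputs (41) and (42) at their central values; the value of the physical pion mass \(M_\pi = 135\) MeV is an assumption made here (the text only states that PDG masses are used), and the LO mass ratios inside the logarithms follow from (17).

```python
import math

F_PI, F_K = 92.32, 110.10  # MeV, central values from Eq. (42)
R = 27.23                  # m_s / m_hat, central value from Eq. (41)
M_PI = 135.0               # MeV; assumed value of the physical pion mass


def F_eta(Y, d_pi=0.0, d_K=0.0, d_eta=0.0):
    """F_eta in MeV from Eq. (43), solved for F_eta."""
    # Ratios of LO masses from Eq. (17): m_pi^2/m_K^2 and m_eta^2/m_K^2
    log_pi_K = math.log(2.0 / (R + 1.0))
    log_eta_K = math.log(2.0 * (2.0 * R + 1.0) / (3.0 * (R + 1.0)))
    loops = (M_PI ** 2 * Y / (16.0 * math.pi ** 2)
             * (log_pi_K + (2.0 * R + 1.0) * log_eta_K))
    # Move the F_eta^2 * delta term to the left-hand side of (43)
    rhs = (4.0 * F_K ** 2 * (1.0 - d_K) - F_PI ** 2 * (1.0 - d_pi) + loops) / 3.0
    return math.sqrt(rhs / (1.0 - d_eta))


for Y in (0.0, 1.0, 2.5):
    print(f"Y = {Y}: F_eta = {F_eta(Y):.1f} MeV")
```

With vanishing remainders and \(Y=1\) this gives a value close to the central value in (44), and scanning Y over the whole prior range (30) moves the result by only about 5 MeV, which is smaller than the spread induced by the remainders.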

As noted, this result depends only very weakly on the value of Y and thus the choice of the prior. E.g., adding the most restrictive assumption (36), illustrated in Fig. 3, leads to an almost identical prediction

$$\begin{aligned} F_\eta = 117.7\pm 9.3\ {\rm MeV} \qquad {(\chi {\rm QCD21}).} \end{aligned}$$
(45)

6.2 Higher order remainders

Next, given the weak dependence of (43) on Y, we can use RQCD21 (10), EGMS15 (11) and EF05 (12) as alternative inputs for \(F_\eta \) and employ the Bayesian statistical approach to extract information about the remainders. A contour plot with confidence levels can be found in Fig. 5, which leads to

$$\begin{aligned}&\delta _{F_K} = 0.10 \pm 0.07,\nonumber \\&\delta _{F_\eta } = -0.08 \pm 0.08,\nonumber \\&\rho = 0.71 \hspace{2cm} {\rm (RQCD21)}, \end{aligned}$$
(46)
$$\begin{aligned}&\delta _{F_K} = 0.07 \pm 0.06,\nonumber \\&\delta _{F_\eta } = -0.06 \pm 0.08, \nonumber \\&\rho = 0.85 \hspace{2cm} {\rm (EGMS15)}, \end{aligned}$$
(47)
$$\begin{aligned}&\delta _{F_K} = -0.06 \pm 0.08,\nonumber \\&\delta _{F_\eta } = 0.05 \pm 0.08,\nonumber \\&\rho = 0.64 \hspace{2cm} {\rm (EF05),} \end{aligned}$$
(48)

where \(\rho \) is the correlation coefficient. These values are compatible with the prior assumption (40). We can also compare these results with the NNLO contributions for \(F_K\) obtained in [3]

$$\begin{aligned} F_K/F_\pi&= 1 + 0.176 + 0.023\quad {\rm (BE14)}, \end{aligned}$$
(49)
$$\begin{aligned} F_K/F_\pi&= 1 + 0.121 + 0.077\quad {\rm (FF14)}. \end{aligned}$$
(50)

As can be seen, both are positive, while EF05 (48) implies a negative remainder \(\delta _{F_K}\). It should be noted, however, that the work [3] uses a different form of the chiral expansion and thus this can only be taken as an indication that lower values of \(F_\eta \) might be better compatible with the fits BE14/FF14.
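For orientation, the two expansions (49) and (50) can simply be summed. This is plain arithmetic on the numbers quoted above and makes no statement about how the NNLO terms map onto the remainders defined here:

$$\begin{aligned} 1 + 0.176 + 0.023&= 1.199\quad {\rm (BE14)},\\ 1 + 0.121 + 0.077&= 1.198\quad {\rm (FF14)}. \end{aligned}$$

Both fits thus arrive at practically the same ratio \(F_K/F_\pi \), but distribute it differently between the NLO and NNLO terms, with a positive NNLO contribution in both cases.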

Fig. 5

Constraints on higher order remainders from (43), alternative inputs for \(F_\eta \)

In the following, we will use the relation (43) as an additional constraint, thus effectively implementing the results (46)-(48). In this way we obtain improved priors for the higher order remainders, compared to the initial ansatz (40).
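One common way to realise such a Bayesian update numerically is to reweight samples drawn from the prior by the likelihood of the data. The sketch below assumes a Gaussian likelihood for a single measured observable; the function name and interface are hypothetical, and the numerical procedure actually used for (46)-(48) may differ.

```python
import numpy as np


def posterior_summary(samples, prediction, data_mean, data_sigma):
    """Weighted mean, standard deviation and correlation matrix.

    samples    : array of shape (n, k) with n prior draws of k parameters,
                 e.g. the columns (delta_FK, delta_Feta)
    prediction : array of shape (n,), the observable predicted by each draw
    data_mean, data_sigma : measured value and its uncertainty
    """
    samples = np.asarray(samples, dtype=float)
    prediction = np.asarray(prediction, dtype=float)

    # Gaussian likelihood used as importance weight (an assumption).
    w = np.exp(-0.5 * ((prediction - data_mean) / data_sigma) ** 2)
    w /= w.sum()

    mean = w @ samples
    centered = samples - mean
    cov = (centered * w[:, None]).T @ centered
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)
    return mean, std, corr
```

The off-diagonal element of the returned correlation matrix plays the role of the coefficient \(\rho \) quoted above.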

6.3 Extraction of \({L_5^r}\)

As a second step, we can algebraically eliminate \(F_0\) and \(L_4^r\) by using Eq. (13), which leads to a system of two equations for \(F_K\) and \(F_\eta \), now depending on Y, \(L_5^r(\mu )\) and the remainders \(\delta _{F_P}\). We numerically generated \(10^8\) theoretical predictions for the kaon and eta decay constants (at \(\mu =770\) MeV), shown in Fig. 6, in comparison with the data (10) and (42). Once again, the default priors (30)-(34) have been implemented here, along with the assumptions (40)-(42).

Fig. 6

Theoretical predictions for \(F_K\) and \(F_\eta \) with 1\(\sigma \) CL (shaded + dashed) and 2\(\sigma \) CL (dotted) contours depicted. Horizontal lines: data from [10, 15]

Our first task is to verify the general compatibility of our ensemble of theoretical predictions with the data. As can be seen in Fig. 6, this is clearly the case, as the predictions are compatible with the data in the whole range of values. The data for \(F_K\) are slightly more compatible with higher values of \(L_5^r\), while the low value of \(F_\eta \) (RQCD21) slightly prefers lower values of \(L_5^r\). However, without additional information no values of \(L_5^r\) can be excluded at statistically significant levels. For this reason, we need to employ an additional assumption about the values of the LO LECs, either (35) or (36).

In the first case, using the priors based on the additional input \(Y=1.44\pm 0.32\) (35) and the relation (43), depicted in Figs. 2 and 5, we obtain the following constraints on \(L_5^r\) by employing the Bayesian analysis. Figure 7 shows our main result, the probability density function for \(L_5^r\) using RQCD21 (10), in comparison with only using \(F_K\) as an input. Quite clearly, incorporating \(F_\eta \) into the analysis has a strong influence.

Fig. 7

PDFs for \(L_5^r\) from \(F_K\) and \(F_\eta \) for \(F_\eta = (1.123 \pm 0.035)F_\pi \) (RQCD21)

By approximating with a normal distribution, or alternatively putting a 2\(\sigma \) CL bound, we find for all the alternative inputs for \(F_\eta \)

$$\begin{aligned} L_5^r&= (0.66\pm 0.37) \times 10^{-3} \qquad {\rm (RQCD21)},\nonumber \\ L_5^r&<1.34 \times 10^{-3}\ {{\rm at}\ 2\sigma \ {\rm CL}}, \end{aligned}$$
(51)
$$\begin{aligned} L_5^r&= (0.86\pm 0.39) \times 10^{-3} \qquad {\rm (EGMS15)},\nonumber \\ L_5^r&<1.60 \times 10^{-3}\ {{\rm at}\ 2\sigma \ {\rm CL}}, \end{aligned}$$
(52)
$$\begin{aligned} L_5^r&= (1.43\pm 0.35) \times 10^{-3} \qquad {\rm (EF05)},\nonumber \\ L_5^r&>0.78 \times 10^{-3}\ {{\rm at}\ 2\sigma \ {\rm CL}}. \end{aligned}$$
(53)

In the case of RQCD21 and EGMS15, the obtained values of \(L_5^r\) are compatible with both fits BE14/FF14 (21, 22) and lattice QCD calculations (25, 26). However, for a high value of \(F_\eta \) from EF05 (12), we obtain a lower bound for \(L_5^r\), which is incompatible with the value from the fit FF14 (22), \(L_5^r=(0.5\pm 0.07) \times 10^{-3}\).
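A one-sided bound of this kind can be read off a sampled posterior as a weighted quantile. The following is a minimal sketch; the function name is hypothetical, and the default probability content of 0.9545 for a "2\(\sigma \)" level is an assumption, since the exact convention is not restated here.

```python
import numpy as np


def upper_bound(values, weights, level=0.9545):
    """Smallest sampled value whose weighted CDF reaches `level`.

    values  : sampled parameter values, e.g. draws of L_5^r
    weights : non-negative posterior weights of the samples
    level   : probability content of the bound (assumed convention)
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)

    order = np.argsort(values)
    cdf = np.cumsum(weights[order]) / weights.sum()

    # First index at which the cumulative weight reaches the level.
    idx = min(int(np.searchsorted(cdf, level)), len(values) - 1)
    return values[order][idx]
```

A lower bound follows in the same way by calling the function with `1 - level`.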

Alternatively, using the lattice QCD input for the LO LECs (36), depicted in Fig. 3 (\(\chi \)QCD21), we obtain

$$\begin{aligned} L_5^r&= (0.68\pm 0.42) \times 10^{-3} \qquad {\rm (RQCD21,}\chi {\rm QCD21)},\nonumber \\ L_5^r&<1.48 \times 10^{-3}\ {{\rm at}\ 2\sigma \ {\rm CL}}. \end{aligned}$$
(54)
Fig. 8

PDFs for \(L_5^r\) from \(F_K\) and \(F_\eta \) (RQCD21), using (36) (\(\chi \)QCD21)

The probability distributions are shown in Fig. 8. As can be seen, the difference from the previous case is not significant. It might seem surprising that dramatically restricting Y to a narrower range (compare Fig. 3 with Fig. 2) actually leads to a slightly larger uncertainty, but that is a result of a weaker dependence of \(F_\eta \) on \(L_5^r\) at smaller values of Y. In other words, a larger value of Y is correlated more strongly with smaller values of \(L_5^r\).

6.4 Extraction of \({L_4^r}\)

As the last option, we will try to extract information on \(L_4^r\). We will essentially repeat the procedure from the last subsection, but in this case, we will use Eq. (14) to eliminate \(L_5^r\), which gives us a system of two equations for \(F_\pi \) and \(F_\eta \), the free variables being Z, Y, \(L_4^r(\mu )\) and the remainders \(\delta _{F_P}\). Once again, we numerically generated \(10^8\) theoretical predictions for \(F_\pi \) and \(F_\eta \), shown in Fig. 9, using the default priors (30)-(34) [along with (40)-(42)]. As can be seen, while the dependence on \(L_4^r\) is markedly different for the two decay constants, our theoretical predictions are compatible with the data in the whole range of values. As in the previous case, we need to employ additional information, i.e. the priors (35) or (36).

Fig. 9

Theoretical predictions for \(F_\pi \) and \(F_\eta \) with 1\(\sigma \) CL (shaded + dashed) and 2\(\sigma \) CL (dotted) contours depicted. Horizontal lines: data from [10, 15]

First, using the more conservative choice of priors for the statistical analysis based on (35) (Figs. 2 and 5), we obtained the following probability density functions for \(L_4^r\). Figure 10 depicts the full result using RQCD21 (10) and also the distribution given solely by \(F_\pi \).

Fig. 10

PDFs for \(L_4^r\) from \(F_\pi \) and \(F_\eta \) for \(F_\eta = (1.123 \pm 0.035)F_\pi \) (RQCD21)

For all the inputs for \(F_\eta \) we get (at \(\mu =770\) MeV)

$$\begin{aligned} L_4^r&= (0.39\pm 0.36) \times 10^{-3} \qquad (F_\pi \ {\rm only)}, \end{aligned}$$
(55)
$$\begin{aligned} L_4^r&= (0.44\pm 0.37) \times 10^{-3} \qquad {\rm (RQCD21)}, \end{aligned}$$
(56)
$$\begin{aligned} L_4^r&= (0.42\pm 0.36) \times 10^{-3} \qquad {\rm (EGMS15)}, \end{aligned}$$
(57)
$$\begin{aligned} L_4^r&= (0.36\pm 0.35) \times 10^{-3} \qquad {\rm (EF05)}. \end{aligned}$$
(58)

Interestingly, in this case the strongest constraint is generated by the chiral expansion of \(F_\pi \) (13) and adding \(F_\eta \) into the analysis does not make a marked change. As can be seen, varying the input for \(F_\eta \) does not have a significant impact and all results are compatible with both the fits BE14/FF14 (21, 22) and lattice QCD calculations (25, 26).

Using the alternative priors for the leading order LECs from \(\chi \)QCD21 (36) (Fig. 3), we obtain

$$\begin{aligned} L_4^r&= (0.38\pm 0.25) \times 10^{-3} \qquad (F_\pi \ {{\rm only},\chi {\rm QCD21})}, \end{aligned}$$
(59)
$$\begin{aligned} L_4^r&= (0.46\pm 0.24) \times 10^{-3} \qquad {{\rm (RQCD21},\chi {\rm QCD21)}}. \end{aligned}$$
(60)

The PDFs can be found in Fig. 11. In contrast to \(L_5^r\), in this case the more restricted range for the LO LECs leads to somewhat smaller error bars.

One might naturally ask whether the obtained results for \(L_4^r\) and \(L_5^r\) also signify an update on the priors of Z and Y and thus could shed some light on the pattern of chiral symmetry breaking at the leading order. We have investigated this possibility, but the updated PDFs (not shown here) are not significantly different from the priors. This is not surprising, given that the results for the NLO LECs do not strongly depend on the alternative choice of the priors [(35) or (36)], as can be seen from comparing (51) versus (54) and (56) versus (60). The largest effect comes from excluding very low values of Y, which both cases do. In other words, the rest of the uncertainties, mainly coming from the higher-order remainders, are large enough that the chiral order parameters cannot be constrained purely from inputs for the decay constants without additional information about the higher orders.

Finally, it is also possible to illustrate the correlation naturally expected between \(L_4^r\) and Z from (13)-(15). When setting Z by hand and restricting Y from below, we have obtained the following limits:

$$\begin{aligned} L_4^r&<0.38 \times 10^{-3}\ {{\rm at}\ 2\sigma \ {\rm CL}} \nonumber \\&\hspace{1.5cm} {{\rm (RQCD21, Y>0.8, Z=0.8)}}, \end{aligned}$$
(61)
$$\begin{aligned} L_4^r&<0.88 \times 10^{-3}\ {{\rm at}\ 2\sigma \ {\rm CL}} \nonumber \\&\hspace{1.5cm} {{\rm (RQCD21, Y>0.8, Z=0.5).}} \end{aligned}$$
(62)

The first scenario essentially corresponds to the limits provided by the \(\eta \rightarrow 3\pi \) decays [37], discussed above. As can be seen from (61), such a high value of Z (\(F_0\approx 82\) MeV) restricts \(L_4^r\) much more strongly and would in fact be incompatible with a high value \(L_4^r = (0.76\pm 0.18) \times 10^{-3}\), obtained by the fit FF14 (22). While this result might not be surprising, we are able to demonstrate it quantitatively, taking all the uncertainties into account.
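The correspondence between Z and \(F_0\) in the two scenarios can be made explicit. The following worked numbers assume the usual definition \(Z = F_0^2/F_\pi ^2\) and an illustrative input \(F_\pi \approx 92.2\) MeV, neither of which is restated in this section:

$$\begin{aligned} F_0&= \sqrt{Z}\,F_\pi ,\\ Z=0.8:\quad F_0&\approx 0.894\times 92.2\ {\rm MeV} \approx 82\ {\rm MeV},\\ Z=0.5:\quad F_0&\approx 0.707\times 92.2\ {\rm MeV} \approx 65\ {\rm MeV}. \end{aligned}$$

The first value reproduces the \(F_0\approx 82\) MeV quoted for the scenario (61), while the second corresponds to the looser bound (62).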

Fig. 11

PDFs for \(L_4^r\) from \(F_\pi \) and \(F_\eta \) (RQCD21), using (36) (\(\chi \)QCD21)

7 Summary

We have investigated the sector of decay constants of the octet of light pseudoscalar mesons in the framework of ‘resummed’ chiral perturbation theory. Our theoretical prediction for the SU(3) decay constant of the \(\eta \) meson is

$$\begin{aligned} F_\eta = 118.4\pm 9.4\ {\rm MeV} = (1.28\pm 0.10)F_\pi , \end{aligned}$$
(63)

which is compatible with recent determinations [10, 19, 22].

Utilizing these determinations as inputs for \(F_\eta \), we have applied Bayesian statistical inference to extract the values of next-to-leading order low-energy constants \(L_4^r\), \(L_5^r\) and higher-order remainders \(\delta _{F_K}\) and \(\delta _{F_\eta }\). \(L_5^r\) was assumed to be positive, while \(L_4^r>{L_4^r}^{(crit)}=-0.5\times 10^{-3}\).

By using the most recent lattice QCD data from the RQCD Collaboration [10], which provided us with the best estimate \(F_\eta = (1.123 \pm 0.035)F_\pi \), we have obtained our main result (at \(\mu =770\) MeV):

$$\begin{aligned}&L_4^r = (0.44\pm 0.37) \times 10^{-3} \qquad {\rm (RQCD21)},\nonumber \\&L_5^r = (0.66\pm 0.37) \times 10^{-3} \qquad {\rm (RQCD21)}, \nonumber \\&L_5^r <1.34 \times 10^{-3}\ {{\rm at}\ 2\sigma \ {\rm CL}}. \end{aligned}$$
(64)
$$\begin{aligned}&\delta _{F_K} = 0.10 \pm 0.07, \nonumber \\&\delta _{F_\eta } = -0.08 \pm 0.08 \qquad \qquad {\rm (RQCD21)},\nonumber \\&\rho = 0.71. \end{aligned}$$
(65)

These results have used conservative estimates for the priors of the low-energy constants at the leading order (\(F_0\) and \(B_0\)).

Alternatively, we have used a recent computation of the leading order LECs \(F_0\) and \(B_0\) by the \(\chi \)QCD Collaboration [33] as an additional input. Though the work has not been fully published yet, it has been cited by the Flavour Lattice Averaging Group [4]. In this case, we have obtained:

$$\begin{aligned} L_4^r&= (0.46\pm 0.24) \times 10^{-3} \qquad {{\rm (RQCD21},\chi {\rm QCD21)}}, \end{aligned}$$
(66)
$$\begin{aligned} L_5^r&= (0.68\pm 0.42) \times 10^{-3} \qquad {{\rm (RQCD21},\chi {\rm QCD21)}}, \nonumber \\ L_5^r&<1.48 \times 10^{-3}\ {{\rm at}\ 2\sigma \ {\rm CL}}. \end{aligned}$$
(67)

As our main conclusion, all these values are compatible within uncertainties with the most recent standard \(\chi \)PT fits BE14 and FF14 [3], as well as the lattice QCD computations cited by the FLAG review [4]. So quite clearly, while we independently confirm the generic range of values available in the literature, an additional source of information needs to be found in order to pin down the values of the low energy constants more precisely.

However, when testing inputs for \(F_\eta \) from phenomenology, we have found some tension if a high value of \(F_\eta = (1.38 \pm 0.05)F_\pi \) (EF05) [19] was assumed. This led to a negative sign of the remainder \(\delta _{F_K}\), while both fits BE14 and FF14 in [3] have positive NNLO contributions for \(F_K\). In addition, such a high value of \(F_\eta \) produced a lower bound \(L_5^r>0.78 \times 10^{-3}\) (at 2\(\sigma \) CL), which is incompatible with the value from the fit FF14 (\(L_5^r=(0.5\pm 0.07) \times 10^{-3}\)).