1 Introduction

The cosmological constant has been one of the long-standing issues in physics and cosmology since it was introduced by Einstein in 1917 [1] for a static universe. However, Hubble discovered in 1929 [2] that the universe is expanding. Then Einstein abandoned the cosmological constant as the “biggest blunder” of his life [3]. From 1929 until the early 1990s, most physicists and cosmologists assumed the cosmological constant to be zero. Since the vacuum energy is equivalent to the cosmological constant [4, 5], an exactly zero cosmological constant requires that the bare cosmological constant should be exactly canceled by the vacuum energy. This is a difficult problem [6,7,8,9,10] (sometimes it is called the (old) cosmological constant problem in the literature). In 1998, the accelerated expansion of the universe was discovered [11, 12], and since then dark energy has been one of the most active fields in cosmology [13,14,15,16,17,18,19,20,21,22,23]. So, the cosmological constant was revived again, since the simplest candidate of dark energy is a tiny positive cosmological constant. However, it is difficult to understand why the observable cosmological constant is about 120 orders of magnitude smaller than its natural expectation of the vacuum energy [6,7,8,9,10, 13,14,15,16,17,18]. Now, the (new) cosmological constant problem becomes the question why the non-zero cosmological constant is so tiny. It means that a fine-tuning is necessary when the bare cosmological constant is canceled by the vacuum energy [6,7,8,9,10, 13,14,15,16,17,18]. In fact, the cosmological constant is still an important topic in physics and cosmology by now.

It is commonly believed that the cosmological constant problem can only be solved ultimately in a unified theory of quantum gravity and the standard model of electroweak and strong interactions, which is still absent so far. Nevertheless, many attempts have been made in the literature. One of the interesting ideas is the so-called axiomatic approach to the cosmological constant [24]. In this approach, the cosmological constant is derived from four axioms, but the underlying physical origin (say, the theory of quantum gravity) is still unknown. It is proposed in close analogy to the Khinchin axioms in information theory. The well-known Khinchin axioms can uniquely derive the Shannon entropy, on which the entire mechanism of statistical mechanics is based (see the textbook e.g. [25]).

The Khinchin axioms in information theory [26] describe the most desirable properties an information measure I should have. Axiom K1 “fundamentality”: an information measure I only depends on the probabilities \(p_i\) (the fundamental quantities) of the events under consideration and nothing else. Axiom K2 “boundedness”: there is a lower bound for the information measure I. Axiom K3 “simplicity”: the information measure I should take the simplest description. Axiom K4 “invariance”: there is a suitable scale transformation in the space of probabilities and information measures that leaves the physical contents invariant. These four Khinchin axioms look very natural and simple. However, from such natural and simple axioms, one can uniquely fix the functional form of the Shannon information which is extremely important for the statistical mechanics [25].

Inspired by the successful Khinchin axiomatic approach to the Shannon entropy in information theory, Beck [24] proposed four axioms in close analogy to the Khinchin axioms. Axiom B1 “fundamentality”: the cosmological constant \(\Lambda \) only depends on fundamental constants of nature. Axiom B2 “boundedness”: the cosmological constant is bounded from below, \(\Lambda >0\). Axiom B3 “simplicity”: the cosmological constant \(\Lambda \) is given by the simplest possible formula consistent with the other axioms. Axiom B4 “invariance”: the cosmological constant \(\Lambda \) formed with potentially different values of fundamental parameters leaves the large-scale physics of the universe scale invariant. These four axioms look also very natural and simple. From these natural and simple axioms, Beck [24] derived the explicit form of the cosmological constant,

$$\begin{aligned} \Lambda = \frac{G^2}{\hbar ^4}\left( \frac{m_e}{\alpha }\right) ^6, \end{aligned}$$
(1)

where \(\alpha \) is the fine structure constant, G is the gravitational constant, \(\hbar \) is the reduced Planck constant, and \(m_e\) is the electron mass. Accordingly, the (observable) vacuum energy density is given by [24]

$$\begin{aligned} \rho _\Lambda =\frac{c^4}{8\pi G}\,\Lambda =\frac{G}{8\pi } \frac{c^4}{\hbar ^4}\left( \frac{m_e}{\alpha }\right) ^6, \end{aligned}$$
(2)

where c is the speed of light. Numerically, this formula yields \(\rho _\Lambda \simeq 4.0961\,\mathrm{GeV/m^3}\), which can pass the current observational constraints with flying colors. We refer to [24] for the detailed derivations.

In fact, Beck [24] is not the first and the only one who derived the cosmological constant given in Eq. (1). It was independently derived from other arguments in the literature. Using the generalized Buchdahl identity, Boehmer and Harko [27] argued that the existence of a non-negative \(\Lambda \) imposes a lower bound on the mass M and density \(\rho \) for general relativistic objects with radius \(\mathcal R\),

$$\begin{aligned} 2GM\ge \frac{\Lambda c^2}{6}\mathcal{R}^3, \quad \rho =\frac{3M}{4\pi \mathcal{R}^3}\ge \frac{\Lambda c^2}{16\pi G}. \end{aligned}$$
(3)

On the other hand, Wesson [28] argued that the mass is quantized according to the rule \(m=(n\hbar /c)\sqrt{\Lambda /3}\), and the minimum mass corresponding to the ground state \(n=1\) is given by

$$\begin{aligned} m_P=\frac{\hbar }{c}\sqrt{\frac{\Lambda }{3}}, \end{aligned}$$
(4)

which is indeed a very small mass. Boehmer and Harko [27, 29] proposed to identify the minimum mass in Eq. (3) with the one in Eq. (4), and found that the radius corresponding to \(m_P\) is given by

$$\begin{aligned} \mathcal{R}_P=48^{1/6} \left( \frac{\hbar G}{c^3}\right) ^{1/3} \Lambda ^{-1/6} \simeq 1.9\,\ell _{pl}^{2/3}\Lambda ^{-1/6}, \end{aligned}$$
(5)

where \(\ell _{pl}\) is the Planck length. Noting the radius \(\mathcal{R}_P\) is of the same order of magnitude as the classical radius of the electron \(r_e=e^2/(m_e c^2)\) (where e is the electron charge), Boehmer and Harko [29] further proposed to formally equate \(\mathcal{R}_P\) with \(r_e\) while the term of \(\mathcal{O}(1)\) is neglected, and then they found that the cosmological constant is given by [29]

$$\begin{aligned} \Lambda =\frac{\ell _{pl}^4}{r_e^6}= \frac{\hbar ^2 G^2 m_e^6 c^6}{e^{12}}=\frac{G^2}{\hbar ^4} \left( \frac{m_e}{\alpha }\right) ^6, \end{aligned}$$
(6)

in which we have used the definition \(\alpha =e^2/(\hbar c)\). Clearly, the same result given in Eq. (1) has been independently derived from completely different arguments.

The third independent approach to derive the cosmological constant \(\Lambda \) in Eq. (1) is the well-known Eddington–Dirac large number hypothesis [30,31,32,33]. Nottale in 1993 [34] (see also [24]) has written down a large number hypothesis connecting cosmological parameters with standard model parameters,

$$\begin{aligned} \alpha \frac{m_{pl}}{m_e}=\left( \frac{\Lambda ^{-1/2}}{\ell _{pl}}\right) ^{1/3}, \end{aligned}$$
(7)

where \(m_{pl}\) is the Planck mass. It is easy to check that Eq. (7) is equivalent to Eqs. (1) and (6) in fact.

We note that the cosmological constant \(\Lambda \) given in Eq. (1) is related to the fine structure constant \(\alpha \) according to \(\Lambda \propto \alpha ^{-6}\). This is interesting for us. As is well known, in the same year 1998 when the accelerated cosmic expansion was discovered, the evidence for cosmological evolution of the fine structure constant \(\alpha \) has also been found [35,36,37]. Using the absorption systems in the spectra of distant quasars, Webb et al. [35] found the first evidence for the time variation of \(\alpha \), namely \(\Delta \alpha /\alpha \equiv (\alpha -\alpha _0)/\alpha _0=(-1.1\pm 0.4)\times 10^{-5}\) over the redshift range \(0.5<z<1.6\), where \(\alpha _0\) is the present value of \(\alpha \). Three years later, they improved the evidence to \(4\sigma \), namely \(\Delta \alpha /\alpha =(-0.72 \pm 0.18)\times 10^{-5}\) over the redshift range \(0.5<z<3.5\) [36, 37]. The fine structure constant \(\alpha \) was smaller in the past, and it is not a true constant in fact. Nowadays, a time-varying \(\alpha \) has been extensively discussed in the community. There are many works on this topic in the literature [38,39,40,41, 112,113,114]. If the cosmological constant \(\Lambda \) given in Eq. (1) is right, it should also be time-varying, because \(\Lambda \propto \alpha ^{-6}\). In the literature (e.g. [42,43,44]), there exist some \(\Lambda (t)\) models already. However, most of them are written by hand, e.g. \(\Lambda \propto H^2\), \(\Lambda \propto \ddot{a}/a\), \(\Lambda \propto R_{sc}\,\), \(\Lambda \propto \rho _m\), where H is the Hubble parameter, a is the scale factor, \(R_{sc}\) is the scalar curvature, \(\rho _m\) is the density of matter. Different from the \(\Lambda (t)\) models purely written by hand, the time-varying \(\Lambda \propto \alpha ^{-6}\) given in Eq. (1) is well motivated, as is shown above.

In the present work, we are interested to study the varying cosmological constant \(\Lambda \propto \alpha ^{-6}\). In Sect. 2, we try to develop a suitable framework to model the varying cosmological constant. In Sect. 3, we consider the observational constraints on the varying \(\Lambda \) models. In Sect. 4, the possible connection between the varying cosmological constant and warm dark matter is discussed. In Sect. 5, some brief concluding remarks are given.

2 Varying cosmological constant and fine structure constant

Here, we try to develop a suitable framework to model the varying cosmological constant \(\Lambda \) given in Eq. (1). For convenience, we instead use the vacuum energy density \(\rho _\Lambda \) given in Eq. (2), which is equivalent to \(\Lambda \) in fact. Throughout this work, we use the terms “cosmological constant” and “vacuum energy” interchangeably. If the cosmological constant is varying, we have \(\dot{\rho }_\Lambda =-Q\not =0\), where a dot denotes the derivative with respect to cosmic time t. To preserve the total energy conservation equation \(\dot{\rho }_{\mathrm{tot}}+3H(\rho _{\mathrm{tot}}+p_{\mathrm{tot}})=0\), a coupling between the vacuum energy and the pressureless matter is necessary, and hence \(\dot{\rho }_m+3H\rho _m=Q\not =0\), where \(\rho _m\) is the density of pressureless matter, \(\rho _{\mathrm{tot}}=\rho _\Lambda +\rho _m\), and \(p_{\mathrm{tot}}\) is the total pressure. Note that the equation-of-state parameter (EoS) of the cosmological constant \(w_\Lambda =-1\), and the EoS of the pressureless matter \(w_m=0\). Throughout this work, we assume that only the fine structure “constant” \(\alpha \) is varying, and all the other fundamental constants \(\hbar \), G, c, \(m_e\) are true constants, i.e. they do not vary indeed. Since \(\alpha =e^2/(\hbar c)\), this means that only the electron charge e is varying. Therefore, we have \(\rho _\Lambda \propto \Lambda \propto \alpha ^{-6}\). It is easy to see that

$$\begin{aligned} \frac{\dot{\rho }_\Lambda }{\rho _\Lambda }=-6\frac{\dot{\alpha }}{\alpha }, \end{aligned}$$
(8)

and then the total energy conservation equation can be preserved according to

$$\begin{aligned}&\dot{\rho }_\Lambda =-Q=-6\frac{\dot{\alpha }}{\alpha } \rho _\Lambda , \end{aligned}$$
(9)
$$\begin{aligned}&\dot{\rho }_m+3H\rho _m=Q=6\frac{\dot{\alpha }}{\alpha } \rho _\Lambda . \end{aligned}$$
(10)

The coupling term \(Q=6\rho _\Lambda \dot{\alpha }/\alpha \not =0\) if the fine structure “constant” \(\alpha \) is varying. In this work, we consider a spatially flat Friedmann–Robertson–Walker (FRW) universe containing only the vacuum energy and the pressureless matter. \(H\equiv \dot{a}/a\) is the Hubble parameter, and \(a=(1+z)^{-1}\) is the scale factor (we have set \(a_0=1\); the subscript “0” indicates the present value of corresponding quantity; z is the redshift). In this way, we can turn the varying cosmological constant model into an interacting vacuum energy model. The vacuum energy interacts with the pressureless matter by exchanging energy between them.

Due to the interaction Q, the evolutions of \(\rho _m\) and \(\rho _\Lambda \) should deviate from the ones without interaction, namely \(\rho _m\propto a^{-3}\) and \(\rho _\Lambda =\mathrm{const}\). If the coupling term Q is given, one can derive the evolutions of \(\rho _m\) and \(\rho _\Lambda \). However, the logic can be reversed. If the deviated evolutions of \(\rho _m\) and/or \(\rho _\Lambda \) are given, we can find the corresponding interaction Q from Eqs. (9), (10), and then the evolution of \(\alpha \). Inspired by e.g. [42, 45,46,47,48], we consider two different types of models to characterize the deviated evolutions of \(\rho _m\) and/or \(\rho _\Lambda \) in the following two subsections, respectively.

2.1 Type I models

Inspired by e.g. [46,47,48], the type I models are characterized by

$$\begin{aligned} \frac{\rho _\Lambda }{\rho _m}=f(a), \end{aligned}$$
(11)

where f(a) can be any function of the scale factor a. If \(f(a)\propto a^3\), it corresponds to \(\Lambda \)CDM model whose \(\rho _\Lambda =\mathrm{const}.\) and \(\rho _m\propto a^{-3}\). From Eq. (11) and Friedmann equation \(H^2=8\pi G(\rho _\Lambda +\rho _m)/3\), we have

$$\begin{aligned} \Omega _\Lambda = \frac{f}{1+f}, \quad \Omega _m = \frac{1}{1+f}, \end{aligned}$$
(12)

where \(\Omega _i\equiv 8\pi G\rho _i/(3H^2)\) (\(i=\Lambda ,\,m\)) are the fractional energy densities of the vacuum energy and matter, respectively. Substituting \(\rho _\Lambda =\rho _m f(a)\) into Eq. (9) and using \(\dot{\rho }_m\) from Eq. (10), we find

$$\begin{aligned} Q=-H\rho _m\Omega _\Lambda \left( a\frac{f^\prime }{f}-3\right) =-H\rho _\Lambda \Omega _m\left( a\frac{f^\prime }{f}-3\right) ,\nonumber \\ \end{aligned}$$
(13)

where a prime denotes the derivative with respect to a. From Eqs. (12) and (13), we obtain

$$\begin{aligned} \frac{\dot{\alpha }}{\alpha }=\frac{Q}{6\rho _\Lambda }= & {} -\frac{H}{6(1+f)}\left( a\frac{f^\prime }{f}-3\right) \nonumber \\= & {} -\frac{EH_0}{6(1+f)}\left( a\frac{f^\prime }{f}-3\right) , \end{aligned}$$
(14)

where \(E\equiv H/H_0\). If \(f\propto a^3\), it is easy to see that \(\dot{\alpha }=0\), namely \(\alpha =\mathrm{const}.\) On the other hand, one can recast the total energy conservation equation \(\dot{\rho }_{\mathrm{tot}}+3H\rho _{\mathrm{tot}}(1+w_{\mathrm{tot}})=0\) as

$$\begin{aligned} \frac{\mathrm{d}\ln \rho _{\mathrm{tot}}}{\mathrm{d}\ln a} = -3\left( 1-\Omega _\Lambda \right) \,, \end{aligned}$$
(15)

in which we have used \(w_{\mathrm{tot}}=p_{\mathrm{tot}}/\rho _{\mathrm{tot}}=\Omega _\Lambda w_\Lambda +\Omega _m w_m\) and \(w_\Lambda =-1\), \(w_m=0\). We can integrate Eq. (15) to get

$$\begin{aligned} \rho _{\mathrm{tot}}= & {} \exp \left( \int _{\mathrm{const}.}^a\left( -3+3\Omega _\Lambda \right) \frac{\mathrm{d}\tilde{a}}{\tilde{a}}\right) \nonumber \\= & {} \rho _{\mathrm{tot},0\,} \exp \left( \int _1^a\left( -3+3\Omega _\Lambda \right) \frac{\mathrm{\mathrm{d}}\tilde{a}}{\tilde{a}}\right) , \end{aligned}$$
(16)

where \(\mathrm{const}.\) is an integration constant. Using Eqs. (12), (16) and \(H^2=8\pi G\rho _{\mathrm{tot}}/3\), we find

$$\begin{aligned} E^2\equiv \frac{H^2}{H_0^2}=a^{-3}\exp \left( \int _1^a \frac{3f}{1+f}\frac{\mathrm{d}\tilde{a}}{\tilde{a}}\right) . \end{aligned}$$
(17)

Noting \(\rho _\Lambda \propto \alpha ^{-6}\) and \(\rho _\Lambda / \rho _{\Lambda 0}=\Omega _\Lambda E^2/\Omega _{\Lambda 0}= \Omega _\Lambda E^2/(1-\Omega _{m0})\), we have

$$\begin{aligned} \frac{\Delta \alpha }{\alpha }\equiv \frac{\alpha -\alpha _0}{\alpha _0}= \left( \frac{\Omega _\Lambda E^2}{1-\Omega _{m0}}\right) ^{-1/6}-1, \end{aligned}$$
(18)

where \(\Omega _\Lambda \) and \(E^2\) are given in Eqs. (12) and (17), respectively. It is easy to check that if \(f\propto a^3\), we obtain \(\Delta \alpha /\alpha =0\), namely \(\alpha =\mathrm{const}.\) In summary, if the function f(a) is given, one can get the cosmic expansion history from Eq. (17), and the variation of the fine structure “constant” \(\alpha \) from Eqs. (18) or (14). Finally, it is worth noting that by definition (11), we obtain

$$\begin{aligned} f_0=f(a=1)=\frac{\rho _{\Lambda 0}}{\rho _{m0}} =\frac{1}{\Omega _{m0}}-1, \end{aligned}$$
(19)

which is useful to fix one of the model parameters in f(a).

2.2 Type II models

Inspired by e.g. [42, 45, 48], the type II models are characterized by

$$\begin{aligned} \rho _m=\rho _{m0}\, a^{-3+\epsilon (a)}, \end{aligned}$$
(20)

where \(\epsilon (a)\) can be any function of the scale factor a. Obviously, \(\epsilon (a)\equiv 0\) corresponds to \(\Lambda \)CDM model whose \(\rho _m=\rho _{m0}\, a^{-3}\). Substituting Eq. (20) into Eq. (10), we find that the corresponding interaction term is given by

$$\begin{aligned} Q=H\rho _m\left[ \,\epsilon (a)+a\epsilon ^\prime (a)\ln a\,\right] . \end{aligned}$$
(21)

Substituting Eqs. (20) and (21) into Eq. (9), we obtain

$$\begin{aligned} \frac{\mathrm {d}\rho _\Lambda }{\mathrm {d}a}=-\rho _{m0}\, a^{-4+\epsilon (a)} \left[ \,\epsilon (a)+a\epsilon ^\prime (a)\ln a\,\right] , \end{aligned}$$
(22)

which can be integrated to get

$$\begin{aligned} \rho _\Lambda = \rho _{m0}\,\eta (a)+\rho _{\Lambda 0}, \end{aligned}$$
(23)

where

$$\begin{aligned} \eta (a)\equiv \int _a^1 \tilde{a}^{-4+\epsilon (\tilde{a})} \left[ \,\epsilon (\tilde{a})+\tilde{a}\epsilon ^\prime (\tilde{a})\ln \tilde{a}\,\right] \mathrm{d}\tilde{a}. \end{aligned}$$
(24)

Substituting Eqs. (20) and (23) into Friedmann equation \(H^2=8\pi G(\rho _\Lambda +\rho _m)/3\), we find

$$\begin{aligned} E^2\equiv \frac{H^2}{H_0^2}=\Omega _{m0}\left[ \, a^{-3+\epsilon (a)}+ \eta (a)\,\right] +\left( 1-\Omega _{m0}\right) . \end{aligned}$$
(25)

Using Eqs. (20), (21), and (23), we have

$$\begin{aligned} \frac{\dot{\alpha }}{\alpha }=\frac{Q}{6\rho _\Lambda }= & {} \frac{EH_0}{6} \frac{\Omega _{m0}\, a^{-3+\epsilon (a)}}{1+\Omega _{m0}\left[ \, \eta (a)-1\,\right] }\nonumber \\&\times \left[ \,\epsilon (a)+a\epsilon ^\prime (a)\ln a\, \right] , \end{aligned}$$
(26)

where \(\eta (a)\) and E are given in Eqs. (24) and (25), respectively. On the other hand, using Eq. (23) and noting \(\rho _\Lambda \propto \alpha ^{-6}\), we obtain

$$\begin{aligned} \frac{\Delta \alpha }{\alpha }\equiv \frac{\alpha -\alpha _0}{\alpha _0}= \left[ \,1+\frac{\Omega _{m0}\,\eta (a)}{1-\Omega _{m0}}\, \right] ^{-1/6}-1, \end{aligned}$$
(27)

where \(\eta (a)\) is given in Eq. (24). So, if the function \(\epsilon (a)\) is given, one can get the cosmic expansion history from Eq. (25), and the variation of the fine structure “constant” \(\alpha \) from Eqs. (27) or (26). In particular, if \(\epsilon (a)\equiv 0\), it is easy to check that \(\dot{\alpha }=0\) and \(\Delta \alpha /\alpha =0\), namely \(\alpha =\mathrm{const}.\)

3 Observational constraints on the models

3.1 Observational data

In the literature, there are two kinds of observational data concerning the variation of the fine structure “constant”, namely the data of \(\Delta \alpha /\alpha \) and the data of \(\dot{\alpha }/\alpha \). To the best of our knowledge, most of the observational data are given in terms of \(\Delta \alpha /\alpha \), and only a few of the observational data are given in terms of \(\dot{\alpha }/\alpha \). On the other hand, comparing \(\dot{\alpha }/\alpha \) in Eqs. (14) and (26) with \(\Delta \alpha /\alpha \) in Eqs. (18) and (27), it is easy to see that the number of free parameters in \(\dot{\alpha }/\alpha \) is always more than the one in \(\Delta \alpha /\alpha \), namely one more free parameter \(H_0\) is required in \(\dot{\alpha }/\alpha \) while \(\Delta \alpha /\alpha \) need not. Due to the above two reasons, we only consider the observational data of \(\Delta \alpha /\alpha \) in the present work.

Here, we consider the observational \(\Delta \alpha /\alpha \) dataset given in [49], which consists of 293 \(\Delta \alpha /\alpha \) data from the absorption systems in the spectra of distant quasars. This sample includes 154 quasar absorption systems from the Very Large Telescope (VLT) in Chile, and 141 quasar absorption systems from the Keck Observatory in Hawaii. The full numerical data of these 295 quasar absorption systems are available in [50] or [51, 52]. According to [49] and the instructions of [50,51,52], two outliers (J194454+770552 at \(z_{\mathrm{abs}}=2.8433\), and J000448-415728 at \(z_{\mathrm{abs}}=1.5419\)) should be removed. Therefore, there are 293 usable data in the final dataset, over the absorption redshift range \(0.2223\le z_{\mathrm{abs}}\le 4.1798\). Note that all these 293 \(\Delta \alpha /\alpha \) data are of \(\mathcal{O}(10^{-5})\). The \(\chi ^2\) from these 293 \(\Delta \alpha /\alpha \) data is given by

$$\begin{aligned} \chi ^2_\alpha =\sum \limits _i \frac{\left[ \, (\Delta \alpha /\alpha )_{\mathrm{th},i}- (\Delta \alpha /\alpha )_{\mathrm{obs},i}\,\right] ^2}{\sigma _i^2}, \end{aligned}$$
(28)

where \(\sigma _i^2=\sigma ^2_{\mathrm{stat},i}+\sigma ^2_{\mathrm{rand},i}\) (see Sect. 3.5.3 of [49] and the instructions of [50,51,52] for the technical details of \(\sigma _\mathrm{rand}\) and the error budget). In fact, we have tested our two types of models with these 293 \(\Delta \alpha /\alpha \) data, and found that these \(\Delta \alpha /\alpha \) data can tightly constrain the model parameters in f(a) or \(\epsilon (a)\), but the constraints on the model parameter \(\Omega _{m0}\) are too loose. Therefore, other cosmological observations, for instance, type Ia supernovae (SNIa), cosmic microwave background (CMB), and baryon acoustic oscillation (BAO), are required to properly constrain the model parameter \(\Omega _{m0}\).

We further consider the Union2.1 SNIa dataset [53, 54] consisting of 580 data points, which are given in terms of the distance modulus \(\mu _\mathrm{obs}(z_i)\). The theoretical distance modulus is defined by

$$\begin{aligned} \mu _\mathrm{th}(z_i)\equiv 5\log _{10} D_L(z_i)+\tilde{\mu }_0, \end{aligned}$$
(29)

where \(\tilde{\mu }_0\equiv 42.3841-5\log _{10}h\) (h is the Hubble constant \(H_0\) in units of \(100\,\mathrm{km/s/Mpc}\)), and

$$\begin{aligned} D_L(z)=(1+z)\int _0^z\frac{\mathrm{d}\tilde{z}}{E(\tilde{z};\mathbf{p})}, \end{aligned}$$
(30)

in which \(E\equiv H/H_0\), and \(\mathbf{p}\) denotes the model parameters. The \(\chi ^2\) from 580 Union2.1 SNIa is given by

$$\begin{aligned} \chi ^2_\mu (\mathbf{p})=\sum \limits _i\frac{\left[ \, \mu _\mathrm{obs}(z_i)-\mu _\mathrm{th}(z_i)\,\right] ^2}{\sigma _{\mu _\mathrm{obs}}^2(z_i)}. \end{aligned}$$
(31)

The parameter \(\tilde{\mu }_0\) (equivalent to \(H_0\)) is a nuisance parameter, but it is independent of the data points. One can perform a uniform marginalization over \(\tilde{\mu }_0\). However, there is an alternative way. Following [55,56,57], the minimization can be made by expanding \(\chi ^2_{\mu }\) in Eq. (31) with respect to \(\tilde{\mu }_0\) as

$$\begin{aligned} \chi ^2_{\mu }(\mathbf{p})=\tilde{A}-2\tilde{\mu }_0\tilde{B} +\tilde{\mu }_0^2\tilde{C}, \end{aligned}$$
(32)

where

$$\begin{aligned}&\displaystyle \tilde{A}(\mathbf{p})=\sum \limits _i\frac{\left[ \, \mu _\mathrm{obs}(z_i)-\mu _\mathrm{th}(z_i;\tilde{\mu }_0=0,\mathbf{p})\, \right] ^2}{\sigma _{\mu _\mathrm{obs}}^2(z_i)}, \\&\displaystyle \tilde{B}(\mathbf{p})=\sum \limits _i\frac{\mu _\mathrm{obs}(z_i) -\mu _\mathrm{th}(z_i;\tilde{\mu }_0=0,\mathbf{p})}{\sigma _{\mu _\mathrm{obs}}^2(z_i)}, \\&\tilde{C}=\sum \limits _i\frac{1}{\sigma _{\mu _\mathrm{obs}}^2(z_i)}. \end{aligned}$$

Equation (32) has a minimum for \(\tilde{\mu }_0=\tilde{B}/\tilde{C}\) at

$$\begin{aligned} \tilde{\chi }^2_{\mu }(\mathbf{p})= \tilde{A}(\mathbf{p})-\frac{\tilde{B}(\mathbf{p})^2}{\tilde{C}}. \end{aligned}$$
(33)

Since \(\chi ^2_{\mu ,\,\mathrm{min}}=\tilde{\chi }^2_{\mu ,\,\mathrm{min}}\) (up to a constant), we can instead minimize \(\tilde{\chi }^2_{\mu }\), which is independent of \(\tilde{\mu }_0\).

Since using the full data of CMB and BAO to perform a global fitting consumes a large amount of computation time and power, we instead use the shift parameter R from the observation of CMB, and the distance parameter A from the measurement of BAO, which are model-independent and contain the main information of the observations of CMB and BAO [58, 59], respectively. The shift parameter R of CMB is defined by [58,59,60]

$$\begin{aligned} R\equiv \Omega _{m0}^{1/2}\int _0^{z_*} \frac{\mathrm{d}\tilde{z}}{E(\tilde{z})}, \end{aligned}$$
(34)

where the redshift of recombination \(z_*\) is determined to be 1089.90 by the Planck 2015 data [61]. On the other hand, the Planck 2015 data have also determined the observed value of shift parameter \(R_{\mathrm{obs}}\) to be \(1.7382\pm 0.0088\) [62]. The \(\chi ^2\) from CMB is given by \(\chi ^2_R=(R-R_{\mathrm{obs}})^2/\sigma _R^2\). The distance parameter A of the measurement of the BAO peak in the distribution of SDSS luminous red galaxies [63] is given by

$$\begin{aligned} A\equiv \Omega _{m0}^{1/2}\,E(z_b)^{-1/3}\left[ \frac{1}{z_b} \int _0^{z_b}\frac{\mathrm{d}\tilde{z}}{E(\tilde{z})}\right] ^{2/3}, \end{aligned}$$
(35)

where \(z_b=0.35\). In [63], the value of A has been determined to be \(0.469\,(n_s/0.98)^{-0.35}\pm 0.017\). Here the scalar spectral index \(n_s\) is taken to be 0.9741 by the Planck 2015 data [62]. The \(\chi ^2\) from BAO is given by \(\chi ^2_A=(A-A_{\mathrm{obs}})^2/\sigma _A^2\).

The total \(\chi ^2\) from the combined \(\Delta \alpha /\alpha \), SNIa, CMB and BAO data is given by

$$\begin{aligned} \chi ^2=\chi ^2_\alpha +\tilde{\chi }^2_\mu +\chi ^2_R+\chi ^2_A. \end{aligned}$$
(36)

The best-fit model parameters are determined by minimizing the total \(\chi ^2\). As in [55,56,57, 64,65,66,67,68,69,70], the \(68.3\%\) confidence level is determined by \(\Delta \chi ^2\equiv \chi ^2-\chi ^2_{\mathrm{min}}\le 1.0\), 2.3, 3.53, 4.72 for \(n_p=1\), 2, 3, 4, respectively, where \(n_p\) is the number of free model parameters. Similarly, the \(95.4\%\) confidence level is determined by \(\Delta \chi ^2\equiv \chi ^2-\chi ^2_{\mathrm{min}}\le 4.0\), 6.18, 8.02, 9.72 for \(n_p=1\), 2, 3, 4, respectively.

Fig. 1
figure 1

The 68.3% and 95.4% confidence level contours in the \(\Omega _{m0}\)\(\mu \) plane for the type I model characterized by \(f(a)=f_0\,a^\xi \) given in Eq. (37). The best-fit parameters are also indicated by the black solid point. Note that \(\mu =\xi -3\) is given in units of \(10^{-5}\). See the text for details

3.2 Observational constraints on type I models

Now, we consider the observational constraints on type I models introduced in Sect. 2.1. At first, we choose the simplest form of the function f(a), namely

$$\begin{aligned} f(a)=f_0\, a^\xi , \end{aligned}$$
(37)

where \(\xi =\mathrm{const}.\) and \(f_0\) is given in Eq. (19). In this case, it is easy to find the explicit formula of \(E^2\) in Eq. (17), namely

$$\begin{aligned} E^2= & {} a^{-3}\left[ \,\Omega _{m0}+\left( 1-\Omega _{m0}\right) a^\xi \,\right] ^{3/\xi }\nonumber \\= & {} (1+z)^3 \left[ \,\Omega _{m0} +\left( 1-\Omega _{m0}\right) (1+z)^{-\xi }\,\right] ^{3/\xi }. \end{aligned}$$
(38)

If \(\xi =3\), it reduces to \(\Lambda \)CDM model and \(\alpha =\mathrm{const}.\) [nb. Eqs. (14) and (18)]. It is natural to expect \(\xi \) will be very close to 3, since all the 293 \(\Delta \alpha /\alpha \) data given in [49,50,51,52] are of \(\mathcal{O}(10^{-5})\). So, it is convenient to introduce \(\mu =\xi -3\), and then we recast \(\xi =\mu +3\). There are two free model parameters \(\Omega _{m0}\) and \(\mu \) (which is equivalent to \(\xi \)). By minimizing the corresponding total \(\chi ^2\) in Eq. (36), we find the best-fit model parameters \(\Omega _{m0}=0.279\) and \(\mu =\xi -3=-1.366\times 10^{-5}\), while \(\chi ^2_{\mathrm{min}}=868.149\) and \(\chi ^2_{\mathrm{min}}/\mathrm{dof}=0.994\). In Fig. 1, we also present the corresponding \(68.3\%\) and \(95.4\%\) confidence level contours in the \(\Omega _{m0}\)\(\mu \) plane. The parameter \(\mu =\xi -3\) is tightly constrained to a narrow range of \(\mathcal{O}(10^{-5})\), thanks to the 293 \(\Delta \alpha /\alpha \) data of \(\mathcal{O}(10^{-5})\). From Fig. 1, we note that \(\mu =\xi -3=0\) (corresponding to \(\Lambda \)CDM model and \(\alpha =\mathrm{const}.\)) deviates from the best fit beyond \(1\sigma \), although it is still consistent with the data in \(2\sigma \) region. Thus, the varying \(\Lambda \) and \(\alpha \) are favored by the observational data.

Next, we can generalize the simplest model in Eq. (37) by allowing \(\xi =\xi (a)\) is not a constant. Similar to the well-known Chevallier–Polarski–Linder (CPL) EoS parameterization \(w=w_0+w_a(1-a)\) [71, 72], the simplest form for \(\xi (a)\) is CPL-like, namely \(\xi (a)=\xi _0+\xi _1(1-a)\). Noting that the Taylor series expansion of any (even unknown) function F(x) is given by \(F(x)=F(x_0)+F_1\,(x-x_0)+(F_2/\,2!)\,(x-x_0)^2+ (F_3/\,3!)\,(x-x_0)^3+\dots \), the CPL-like \(\xi (a)=\xi _0+\xi _1(1-a)\) can be regarded as the Taylor series expansion of \(\xi (a)\) with respect to scale factor a up to first order (linear expansion). Thus, it is well motivated to consider another type I model characterized by

$$\begin{aligned} f(a)=f_0\, a^{\xi (a)}, \quad \mathrm{and} \quad \xi (a)=\xi _0+\xi _1(1-a), \end{aligned}$$
(39)

where \(\xi _0\), \(\xi _1\) are constants, and \(f_0\) is given in Eq. (19). If \(\xi _0=3\) and \(\xi _1=0\), it reduces to \(\Lambda \)CDM model and \(\alpha =\mathrm{const}.\) [nb. Eqs. (14) and (18)]. It is natural to expect \(\xi _0\) will be very close to 3, since all the 293 \(\Delta \alpha /\alpha \) data given in [49,50,51,52] are of \(\mathcal{O}(10^{-5})\). So, it is convenient to introduce \(\mu _0=\xi _0-3\), and then we recast \(\xi _0=\mu _0+3\). There are three free model parameters \(\Omega _{m0}\), \(\xi _1\) and \(\mu _0\) (which is equivalent to \(\xi _0\)). Note that there is no analytical formula for \(E^2\) in this case, but we can get it by using numerical integration in Eq. (17). By minimizing the corresponding total \(\chi ^2\) in Eq. (36), we find the best-fit model parameters \(\Omega _{m0}=0.278\), \(\mu _0=\xi _0-3=-2.650\times 10^{-4}\), and \(\xi _1=3.460\times 10^{-4}\), while \(\chi ^2_{\mathrm{min}}=856.005\) and \(\chi ^2_{\mathrm{min}}/\mathrm{dof}=0.982\). Note that this \(\chi ^2_{\mathrm{min}}\) is significantly smaller than the one of the model with Eq. (37), namely 856.005 vs. 868.149, just at the price of adding only one free model parameter. In Fig. 2, we also present the corresponding \(68.3\%\) and \(95.4\%\) confidence level contours in the \(\mu _0\)\(\xi _1\), \(\Omega _{m0}\)\(\mu _0\), and \(\Omega _{m0}\)\(\xi _1\) planes. Both the parameters \(\mu _0=\xi _0-3\) and \(\xi _1\) are tightly constrained to the narrow ranges of \(\mathcal{O}(10^{-4})\), thanks to the 293 \(\Delta \alpha /\alpha \) data of \(\mathcal{O}(10^{-5})\). From Fig. 2, we note that \(\mu _0=\xi _0-3=0\) and \(\xi _1=0\) (corresponding to \(\Lambda \)CDM model and \(\alpha =\mathrm{const}.\)) deviate from the best fit far beyond \(2\sigma \). Thus, the varying \(\Lambda \) and \(\alpha \) are favored by the observational data.

Fig. 2
figure 2

The 68.3% and 95.4% confidence level contours in the \(\mu _0-\xi _1\), \(\Omega _{m0}\)\(\mu _0\) and \(\Omega _{m0}\)\(\xi _1\) planes for the type I model characterized by \(f(a)=f_0\,a^{\xi _0+\xi _1(1-a)}\) given in Eq. (39). The best-fit parameters are also indicated by the black solid points. Note that \(\mu _0=\xi _0-3\) and \(\xi _1\) are given in units of \(10^{-4}\). See the text for details

3.3 Observational constraints on type II models

Let us turn to the observational constraints on type II models introduced in Sect. 2.2. Obviously, the simplest type II model is given by

$$\begin{aligned} \rho _m=\rho _{m0}\, a^{-3+\epsilon }, \end{aligned}$$
(40)

where \(\epsilon \not =3\) is a constant. If \(\epsilon =0\), it reduces to \(\Lambda \)CDM model and \(\alpha =\mathrm{const}.\) [nb. Eqs. (26) and (27)]. For \(\epsilon (a)=\epsilon \not =3\), we find the analytical formulas for \(\eta (a)\) and \(E^2\) in Eqs. (24) and (25), namely

$$\begin{aligned} \displaystyle \eta= & {} \frac{\epsilon }{\epsilon -3}\left( 1-a^{-3+\epsilon } \right) \nonumber \\= & {} \frac{\epsilon }{\epsilon -3}\left[ \,1-(1+z)^{3-\epsilon }\,\right] , \end{aligned}$$
(41)
$$\begin{aligned} \displaystyle E^2= & {} \frac{\Omega _{m0}}{\epsilon -3}\left( \epsilon -3a^{-3+\epsilon }\right) +\left( 1-\Omega _{m0}\right) \nonumber \\= & {} \frac{\Omega _{m0}}{\epsilon -3}\left[ \,\epsilon -3(1+z)^{3-\epsilon }\,\right] +\left( 1-\Omega _{m0}\right) . \end{aligned}$$
(42)

There are two free model parameters, namely \(\Omega _{m0}\) and \(\epsilon \). By minimizing the corresponding total \(\chi ^2\) in Eq. (36), we find the best-fit model parameters \(\Omega _{m0}=0.279\) and \(\epsilon =0.430\times 10^{-6}\), while \(\chi ^2_{\mathrm{min}}=870.391\) and \(\chi ^2_{\mathrm{min}}/\mathrm{dof}=0.997\). In Fig. 3, we also present the corresponding \(68.3\%\) and \(95.4\%\) confidence level contours in the \(\Omega _{m0}\)\(\epsilon \) plane. The parameter \(\epsilon \) is tightly constrained to a narrow range of \(\mathcal{O}(10^{-6})\), thanks to the 293 \(\Delta \alpha /\alpha \) data of \(\mathcal{O}(10^{-5})\). From Fig. 3, we see that \(\epsilon =0\) (corresponding to \(\Lambda \)CDM model and \(\alpha =\mathrm{const}.\)) is fully consistent with the observational data (in fact it is close to the best fit). So, \(\Lambda \) and \(\alpha \) can be non-varying in the type II model characterized by Eq. (40).

Fig. 3
figure 3

The 68.3% and 95.4% confidence level contours in the \(\Omega _{m0}\)\(\epsilon \) plane for the type II model characterized by \(\epsilon =\mathrm{const}.\) given in Eq. (40). The best-fit parameters are also indicated by the black solid point. Note that \(\epsilon \) is given in units of \(10^{-6}\). See the text for details

Next, we consider another type II model characterized by a CPL-like \(\epsilon (a)\), namely

$$\begin{aligned} \epsilon (a)=\epsilon _0+\epsilon _1(1-a), \end{aligned}$$
(43)

where \(\epsilon _0\) and \(\epsilon _1\) are constants. If \(\epsilon _0=\epsilon _1=0\), it reduces to the \(\Lambda \)CDM model and \(\alpha =\mathrm{const}.\) [nb. Eqs. (26) and (27)]. As mentioned above, this CPL-like \(\epsilon (a)=\epsilon _0+\epsilon _1(1-a)\) can be regarded as the Taylor series expansion of \(\epsilon (a)\) with respect to the scale factor a up to first order (linear expansion), and hence it is well motivated. There are three free model parameters, namely \(\Omega _{m0}\), \(\epsilon _0\), and \(\epsilon _1\). Note that there are no analytical formulas for \(\eta (a)\) and \(E^2\) in this case, but we can get \(\eta (a)\) by using numerical integration in Eq. (24) and then \(E^2\) in Eq. (25) is ready. By minimizing the corresponding total \(\chi ^2\) in Eq. (36), we find the best-fit model parameters \(\Omega _{m0}=0.279\), \(\epsilon _0=6.421\times 10^{-5}\) and \(\epsilon _1=-6.962\times 10^{-5}\), while \(\chi ^2_{\mathrm{min}}=857.605\) and \(\chi ^2_{\mathrm{min}}/\mathrm{dof}=0.983\). Note that this \(\chi ^2_{\mathrm{min}}\) is significantly smaller than the one of the model with Eq. (40), namely 857.605 vs. 870.391, just at the price of adding only one free model parameter. In Fig. 4, we also present the corresponding \(68.3\%\) and \(95.4\%\) confidence level contours in the \(\epsilon _0\)\(\epsilon _1\), \(\Omega _{m0}\)\(\epsilon _0\), and \(\Omega _{m0}\)\(\epsilon _1\) planes. Both the parameters \(\epsilon _0\) and \(\epsilon _1\) are tightly constrained to the narrow ranges of \(\mathcal{O}(10^{-5})\), thanks to the 293 \(\Delta \alpha /\alpha \) data of \(\mathcal{O}(10^{-5})\). From Fig. 4, we note that \(\epsilon _0=\epsilon _1=0\) (corresponding to \(\Lambda \)CDM model and \(\alpha =\mathrm{const}.\)) deviate from the best fit far beyond \(2\sigma \). This indicates that the varying \(\Lambda \) and \(\alpha \) are favored by the observational data.

4 Varying cosmological constant and warm dark matter

In the previous sections, we turned the varying cosmological constant model into an interacting vacuum energy model. The vacuum energy interacts with the pressureless matter by exchanging energy between them. In this section, we would like to view this model from another perspective. As is shown in e.g. [73], an interacting dark energy model can be equivalent to a warm dark matter model without interaction between dark energy and dark matter, while these two different kinds of models can share both the same cosmic expansion history and growth history. To keep things simple, here we only consider the models from the side of expansion history.

Although the cold dark matter (CDM) model is very successful in many fields, it has been seriously challenged recently. We refer to e.g. [74, 75] for the detailed reviews on these challenges. Recently, warm dark matter (WDM) remarkably rose as an alternative of CDM. We refer to e.g. [76,77,78] for several comprehensive reviews. The leading WDM candidate is the keV scale sterile neutrino. In fact, the keV scale WDM is an intermediate case between the eV scale hot dark matter (HDM) and the GeV scale CDM. Unlike CDM which is challenged on the small/galactic scale, it is claimed that WDM can successfully reproduce the astronomical observations over all the scales (from small/galactic to large/cosmological scales) [76,77,78]. One of the key differences between WDM and CDM is their EoS. WDM has a fairly small but non-zero EoS, while the EoS of CDM is exactly zero. In the literature, many attempts have been made to determine the EoS of dark matter (see e.g. [79,80,81,82,83,84,85,86,87,88]), and it is found that the EoS of WDM are of \(\mathcal{O}(10^{-6})\), \(\mathcal{O}(10^{-5})\) or \(\mathcal{O}(10^{-3})\) (depending on the working assumptions and the observational data in use).

Let us come back to the starting point Eqs. (9) and (10), and view them from another perspective. In the form of Eqs. (9) and (10), the vacuum energy (whose EoS is \(w_\Lambda =-1\)) interacts with the cold dark matter (whose EoS is \(w_m=0\)) through an interaction \(Q=6\rho _\Lambda \dot{\alpha }/\alpha \not =0\). Now, we recast them as

$$\begin{aligned}&\dot{\rho }_\Lambda +3H\rho _\Lambda \left( 1+ w_\Lambda ^\mathrm{eff}\right) =0, \nonumber \\&\dot{\rho }_m+3H\rho _m\left( 1+w_m^\mathrm{eff}\right) =0, \end{aligned}$$
(44)

where

$$\begin{aligned} w_{dm}\equiv w_m^\mathrm{eff}=-\frac{Q}{3H\rho _m}, \quad 1+w_{de}\equiv 1+w_\Lambda ^\mathrm{eff} = \frac{Q}{3H \rho _\Lambda }.\nonumber \\ \end{aligned}$$
(45)

In this new form, the varying cosmological constant model becomes a model containing “dark energy” (whose EoS \(w_{de}\not =-1\)) and “warm dark matter” (whose EoS \(w_{dm}\not =0\)), but there is no interaction between them. Since the observational data concerning the time variation of \(\alpha \) are of \(\mathcal{O}(10^{-5})\), it is natural to expect that \(w_{dm}\sim 1+w_{de}\sim \mathcal{O}(10^{-5})\) or smaller.

For type I models introduced in Sect. 2.1, substituting Eq. (13) into Eq. (45), we have

$$\begin{aligned} w_{dm}=\frac{\Omega _\Lambda }{3}\left( a\frac{f^\prime }{f} -3\right) , \quad 1+w_{de}= -\frac{\Omega _m}{3}\left( a\frac{f^\prime }{f}-3\right) ,\nonumber \\ \end{aligned}$$
(46)

where \(\Omega _\Lambda \) and \(\Omega _m\) are given in Eq. (12). Thus, it is easy to find the evolutions of \(w_{dm}\) and \(w_{de}\) if f(a) is given. In the top panels of Fig. 5, we plot \(w_{dm}\) and \(1+w_{de}\) as functions of the scale factor a for the type I models with \(f(a)=f_0\,a^\xi \) in Eq. (37) and \(f(a)=f_0\,a^{\xi _0+\xi _1(1-a)}\) in Eq. (39), while the corresponding best-fit model parameters obtained in Sect. 3.2 are taken. As expected above, they are of order \(10^{-6}\) or \(10^{-5}\). Thus, the effective “warm dark matter” from type I models of the varying cosmological constant \(\Lambda \propto \alpha ^{-6}\) is fully consistent with the observational constraints on WDM (e.g. [79,80,81,82,83,84,85,86,87,88, 109]).

Fig. 4
figure 4

The 68.3% and 95.4% confidence level contours in the \(\epsilon _0\)\(\epsilon _1\), \(\Omega _{m0}\)\(\epsilon _0\), and \(\Omega _{m0}\)\(\epsilon _1\) planes for the type II model characterized by \(\epsilon (a)=\epsilon _0+\epsilon _1(1-a)\) given in Eq. (43). The best-fit parameters are also indicated by the black solid points. Note that \(\epsilon _0\) and \(\epsilon _1\) are given in units of \(10^{-5}\). See the text for details

Fig. 5
figure 5

The effective EoS of “warm dark matter” and “dark energy” given in terms of \(w_{dm}\) (solid lines) and \(1+w_{de}\) (dashed lines), for (top-left panel) the type I model with \(f(a)=f_0\,a^\xi \) in Eq. (37), (top-right panel) the type I model with \(f(a)=f_0\,a^{\xi _0+\xi _1(1-a)}\) in Eq. (39), (bottom-left panel) the type II model with \(\epsilon =\mathrm{const}.\) in Eq. (40), (bottom-right panel) the type II model with \(\epsilon (a)=\epsilon _0+\epsilon _1(1-a)\) in Eq. (43), while the corresponding best-fit model parameters obtained in Sect. 3 are taken. Note that they are given in units of \(10^{-5}\), \(10^{-6}\) or \(10^{-7}\). See the text for details

For type II models introduced in Sect. 2.2, substituting Eqs. (21), (20), and (23) into Eq. (45), we get

$$\begin{aligned}&\displaystyle w_{dm}=-\frac{1}{3}\left[ \,\epsilon (a)+ a\epsilon ^\prime (a)\ln a\,\right] ,\end{aligned}$$
(47)
$$\begin{aligned}&\displaystyle 1+w_{de}=\frac{\Omega _{m0}}{3}\frac{a^{-3+\epsilon (a)}}{1+\Omega _{m0}\left[ \,\eta (a)-1\,\right] }\left[ \,\epsilon (a)+a\epsilon ^\prime (a)\ln a\,\right] ,\nonumber \\ \end{aligned}$$
(48)

where \(\eta (a)\) is given in Eq. (24). Thus, it is easy to find the evolutions of \(w_{dm}\) and \(w_{de}\) if \(\epsilon (a)\) is given. In the case of \(\epsilon (a)=\epsilon =\mathrm{const}.\), there is an explicit formula in Eq. (41) for \(\eta (a)\). However, in the case of \(\epsilon (a)=\epsilon _0+\epsilon _1(1-a)\), we should get \(\eta (a)\) by using numerical integration in Eq. (24). In the bottom panels of Fig. 5, we plot \(w_{dm}\) and \(1+w_{de}\) as functions of the scale factor a for the type II models with \(\epsilon =\mathrm{const}.\) in Eq. (40) and \(\epsilon (a)=\epsilon _0+\epsilon _1(1-a)\) in Eq. (43), while the corresponding best-fit model parameters obtained in Sect. 3.3 are taken. Clearly, they are of order \(10^{-5}\) or \(10^{-7}\). Again, the effective “warm dark matter” from type II models of the varying cosmological constant \(\Lambda \propto \alpha ^{-6}\) is also fully consistent with the observational constraints on WDM (e.g. [79,80,81,82,83,84,85,86,87,88, 109]).

5 Concluding remarks

In this work, we considered the cosmological constant model \(\Lambda \propto \alpha ^{-6}\), which is well motivated from three independent approaches as mentioned in Sect. 1. As is well known, in the passed 18 years, the hint of varying fine structure constant \(\alpha \) was found, and the observational data of varying \(\alpha \) were accumulated. Nowadays, a time-varying \(\alpha \) has been extensively discussed in the community. If \(\Lambda \propto \alpha ^{-6}\) is right, it means that the cosmological constant \(\Lambda \) should also be varying. In this work, we tried to develop a suitable framework to model this varying cosmological constant \(\Lambda \propto \alpha ^{-6}\), in which we view it from an interacting vacuum energy perspective. We proposed two types of models to describe the evolutions of \(\Lambda \) and \(\alpha \). Then we considered the observational constraints on these models, by using the 293 \(\Delta \alpha /\alpha \) data from the absorption systems in the spectra of distant quasars, and the data of SNIa, CMB, and BAO. We found that the model parameters can be tightly constrained to the narrow ranges of \(\mathcal{O}(10^{-5})\) typically, thanks to the 293 \(\Delta \alpha /\alpha \) observational data of \(\mathcal{O}(10^{-5})\). In particular, three of four models considered in this work favor the varying \(\Lambda \) and \(\alpha \), while \(\Lambda \)CDM model and \(\alpha =\mathrm{const}.\) deviate from the best fit beyond \(2\sigma \) or at least \(1\sigma \). On the other hand, we can also view the varying cosmological constant model \(\Lambda \propto \alpha ^{-6}\) from another perspective, namely it can be equivalent to a model containing “dark energy” (whose EoS \(w_{de}\not =-1\)) and “warm dark matter” (whose EoS \(w_{dm}\not =0\)), but there is no interaction between them. We derived the effective EoS of “warm dark matter” and “dark energy”, and found that they are fully consistent with the observational constraints on warm dark matter. In summary, we consider that the varying cosmological constant model \(\Lambda \propto \alpha ^{-6}\) is viable and deserves further studies.

Some remarks are in order. First, although the cosmological constant \(\Lambda \propto \alpha ^{-6}\) is derived from three independent approaches as mentioned in Sect. 1 (see also e.g. [110, 111]), the underlying fundamental theory for it is still unknown. Varying fundamental constants require new physics [38]. It is commonly believed that the cosmological constant problem can only be solved ultimately in a unified theory of quantum gravity and the standard model of electroweak and strong interactions, which is still absent so far. Nevertheless, we consider that the studies on such a cosmological constant \(\Lambda \propto \alpha ^{-6}\) might shed new light on the possible ways to the unknown underlying theory.

Second, we have tightly constrained the model parameters besides \(\Omega _{m0}\), namely \(\xi \), \(\xi _0\), \(\xi _1\), \(\epsilon \), \(\epsilon _0\) and \(\epsilon _1\), to the narrow ranges of \(\mathcal{O}(10^{-5})\) typically, mainly by using the 293 \(\Delta \alpha /\alpha \) observational data from the absorption systems in the spectra of distant quasars. In fact, these parameters were confronted with only the observations of SNIa, CMB, and BAO in e.g. [48], and the corresponding constraints are of \(\mathcal{O}(1)\). So, the significant leap from \(\mathcal{O}(1)\) to \(\mathcal{O}(10^{-5})\) shows the great power of the 293 \(\Delta \alpha /\alpha \) observational data. In fact, these 293 \(\Delta \alpha /\alpha \) data have been used in many issues (see e.g. [89, 90]). We advocate the further uses of these \(\Delta \alpha /\alpha \) observational data in relevant studies.

Third, in addition to the well-known evidence of the time variation in the fine structure constant \(\alpha \), it was claimed that \(\alpha \) is also spatially varying [49, 91]. If \(\Lambda \propto \alpha ^{-6}\) is right, the cosmological constant should be not only time-dependent but also space-dependent. This might bring about new features to this field, and deserve further detailed studies. For example, it is claimed that there exists a preferred direction in the CMB temperature map (known as the “Axis of Evil” in the literature) [92,93,94,95], the distribution of SNIa or gamma-ray bursts [96,97,98,99,100,101,102,103,104], and the quasar optical polarization data [105,106,107,108]. If the cosmological constant \(\Lambda \propto \alpha ^{-6}\) is also space-dependent, it might be responsible for the possible anisotropy in the (accelerated) expansion of the universe. We leave this interesting issue to future work.

Fourth, besides the 293 \(\Delta \alpha /\alpha \) data from the absorption systems in the spectra of distant quasars, there are more \(\Delta \alpha /\alpha \) observational data from other types of observations, for examples, atomic clocks, Oklo natural nuclear reactor, meteorite dating, CMB, big bang nucleosynthesis. We refer to e.g. [38] for a comprehensive review. Thus, it is interesting to consider the constraints from these observational data. Since these data are subtle in some sense [38, 40, 41], we also leave this to future work.

Fifth, let us turn to the varying cosmological constant model itself. In the present work, we considered four particular parameterizations of the functions f(a) and \(\epsilon (a)\). In fact, one can instead consider other parameterizations, for instance, f(a), \(\xi (a)\) or \(\epsilon (a)\) characterized by \(c_0+c_1\ln a\) or \(c_0 a^{c_1}\) [48]. On the other hand, in the present work we assumed that only the fine structure constant \(\alpha \) is varying, and the other fundamental constants G, c, \(\hbar \), \(m_e\) do not vary indeed. However, the varying G, c, \(\hbar \), \(m_e\) models do exist in the literature (see e.g. [38] for a comprehensive review). Since the cosmological constant \(\Lambda \) given in Eqs. (1) or (2) also depends on G, c, \(\hbar \), and \(m_e\), there are diverse variants of the varying cosmological constant model in fact. These variants might bring about new features. Since this is beyond the scope of the present work, we again leave it to future work.

Sixth, it is worth noting that the data analysis based on Eq. (36) can only give a rough indication and cannot be used to infer that any of the models considered in the present work is better than \(\Lambda \)CDM (we thank Prof. Dominik Schwarz for pointing out this issue), because (a) the SNIa light curve fitters assumed \(\Lambda \)CDM. So, if one wants to fit a model other than \(\Lambda \)CDM, the light curve fitting procedure should be redone within the new model. (b) the use of the parameters R and A from CMB and BAO measurements is crude, since they cannot take some complicated effects (e.g. the integrated Sachs–Wolfe effect) into account. (c) Eq. (36) implicitly assumes that \(\Delta \alpha /\alpha \) data, SNIa, CMB, and BAO have equal statistical weight, but this is questionable. In addition, a full Markov Chain Monte Carlo (MCMC) analysis of the CMB data should be used to further test the models considered in this work (we thank Prof. Dominik Schwarz for pointing out this issue), and we leave it to future work.

Finally, it is important to clarify the two different (but equivalent) perspectives on the varying cosmological constant model considered in this work. The first perspective is to regard the varying cosmological constant as a fluid not interacting with dark matter. In fact, this is the case considered in Sect. 4. The conservation equation of this fluid is given by Eq. (44), in the form of \(\dot{\rho }_{de}+3H\rho _{de}\left( 1+w_{de}\right) =0\). In this case, we stress that the EoS of this fluid is not \(w_{de}=-1\). As is clearly shown in e.g. Fig. 5, the EoS of this fluid is time-dependent, rather than constant. Indeed, the EoS of this fluid \(w_{de}\not =-1\) in this case. So, one cannot say \(\rho _{de}=\mathrm{const}.\) We have not assumed \(w_{de}=-1\) in this perspective indeed, and we refer to Sect. 4 for detailed discussions. On the other hand, the second perspective on the varying cosmological constant model is considered in Sects. 2 and 3. In this case, one can regard the varying cosmological constant as a fluid interacting with dark matter (different from the first perspective considered in Sect. 4). Now, the conservation equation of this fluid is given by Eq. (9), in the form of \(\dot{\rho }_\Lambda +3H\rho _\Lambda \left( 1+w_\Lambda \right) =-Q\not =0\). Yes, we assumed the EoS of this fluid \(w_\Lambda =-1\) in the second perspective considered in Sects. 2, 3, and then \(\dot{\rho }_\Lambda =-Q\not =0\). However, due to the non-zero Q, again one cannot say \(\rho _\Lambda =\mathrm{const}.\) In fact, the two perspectives considered in Sect. 4 and Sects. 23 are completely independent. There is no inconsistency in each independent perspective. One should not mix up these two different perspectives considered in this work, otherwise confusion and misunderstanding might arise.