1 Introduction

Fast Radio Bursts (FRBs) are high-energy transient events with millisecond durations, observed at radio frequencies from a few hundred to a few thousand MHz [2,3,4,5,6]. In recent years, several models have been proposed to explain the origin of the bursts, but the physical mechanism responsible for them is still under debate [7]. However, the large observed dispersion measure (DM), above that expected from the Milky Way, suggests an extragalactic or cosmological origin for the FRBs [8]. Since their first discovery by the Parkes Telescope in 2007 [9], more than one hundred FRBs have been detected thanks to new telescopes such as the Canadian Hydrogen Intensity Mapping Experiment (CHIME, [10]).

It is commonly understood that some of their observational properties must be better characterized to explore the full potential of these objects in both astrophysical and cosmological contexts. For instance, due to the spatial variation in the cosmic electron distribution, the density fluctuations in the dispersion measure (DM) need to be better determined [11]. Another limitation is the poor knowledge of the host galaxy contribution of the FRBs (\(DM_{host}\)), which depends on many factors such as the galaxy type, the orientation of the FRB source relative to the host, and the mass of the host galaxy [12]. The redshift evolution of \(DM_{host}\) remains unknown, and previous works have studied different functional forms, such as a simple log-normal form with a median value of 100 pc \(\mathrm {cm^{-3}}\) [13], as well as a normal or log-normal distribution with the median value as a free parameter in the range 20–200 pc \(\mathrm {cm^{-3}}\) [14], among others.

When the origin of the burst is confirmed, the host galaxy can be identified, and the redshift of the event can be measured directly. In this situation, the dispersion measure can be combined with the redshift to obtain the \(DM-z\) relation [15]. From this relation, one can use FRBs to probe the anisotropic distribution of baryonic matter in the Universe [16], to test the weak equivalence principle [17] or to constrain cosmological parameters [18, 19], such as the Hubble constant [20,21,22] and the baryon mass fraction in the intergalactic medium (\(f_{IGM}\)) [23,24,25].

An interesting aspect of \(f_{IGM}\) is its possible variation with redshift. In [26], the authors found \(f_{IGM} \approx 0.82\) at \(z \ge 0.4 \), while in [27] the authors estimated \(f_{IGM} \approx 0.9\) at \(z \ge 1.5\). More recently, in a previous communication [1], we used a cosmological model-independent method to constrain \(f_{IGM}\), assuming both constant and time-dependent parameterizations, and found that the time evolution of \(f_{IGM}\) depends strongly on the DM fluctuations due to the spatial variation in the cosmic electron density. Among all the parameters mentioned previously, here we focus mainly on \(DM_{host}\) and \(f_{IGM}\).

One issue when studying FRBs in cosmology is the identification of the host galaxy: although many events have been observed, only a few FRBs in the literature are well localized and have a corresponding redshift [28]. The current FRB sample is not large enough to perform a robust statistical analysis, but instruments are being built to localize FRBs in the next few years. Among these are the coherent upgrade CRACO system [29] of the Australian Square Kilometre Array Pathfinder (ASKAP), the Canadian Hydrogen Intensity Mapping Experiment (CHIME) outriggers [10] and SKA1-Mid [30]. While ASKAP/CRACO is expected to localize \(\sim 100\) FRBs per year, the number for CHIME/FRB is \(\sim 500\) FRBs per year.

In this context, understanding the constraining power of the upcoming observations through numerical simulations is an important and necessary task. To perform such simulations, however, it is crucial to determine the redshift distribution of the FRBs. As their origin is unknown, it is necessary to combine astrophysical assumptions with numerical simulations to obtain such distributions. The literature has explored distributions based on general aspects, such as the star formation history/rate [31], or on a specific astrophysical origin, such as gamma-ray bursts [32]. For a general analysis of the possible distributions, we refer the reader to [33] and references therein.

Table 1 A list of FRBs with known host galaxies

In this work, we investigate the impact of different FRB redshift distributions and of the number of FRB events on the constraints on \(DM_{host}\) and \(f_{IGM}\) through Monte Carlo simulations. The redshift distributions are defined from different astrophysical and cosmological assumptions, and we also consider the role of DM fluctuations in the \(DM_{host}\) and \(f_{IGM}\) estimates. We obtain the baryon mass fraction in the IGM model-independently, as presented in [1], by combining Monte Carlo simulated FRB data with type Ia supernovae (SNe) observations. Our results clearly show the crucial role of the DM fluctuations in determining the cosmological parameters more precisely from FRB observations.

We organized this paper as follows: Sect. 2 briefly discusses FRBs properties and the main quantities. The data set used and the methodology applied are described in Sect. 3. Our simulations and results are presented in Sects. 4 and 5, respectively. We end the paper in Sect. 6 by presenting our main conclusions.

2 FRB properties

The FRB photons interact with the free electrons in the medium between the host galaxy and the observer on Earth. These interactions disperse the pulse, delaying its arrival time in a frequency-dependent way. The time delay is proportional to DM, which can be written in terms of other components [14, 34]

$$\begin{aligned} DM_{obs} (z) = \sum _{i}{DM_{i}(z)} \; \end{aligned}$$
(1)

where \(i \in \{\mathrm{MW,ISM};\ \mathrm{host};\ \mathrm{IGM};\ \mathrm{MW,halo}\}\) labels the contributions from the Milky Way interstellar medium (ISM), the host galaxy, the intergalactic medium and the Milky Way halo, respectively.

The term \(DM_\mathrm{{MW, ISM}}\) can be obtained using Galactic electron density models from pulsar observations [35,36,37] whereas the halo contribution is not well constrained yet, and therefore, we follow [14] and assume \(DM_{MW,halo} = 50\) pc/cm\(^{3}\). The host galaxy contribution can be written as

$$\begin{aligned} DM_{host}(z) = \frac{DM_{host,0}}{1+z}, \end{aligned}$$
(2)

where the \((1+z)\) factor accounts for the cosmic dilation [15, 38]. The host galaxy contribution in the source frame (\(DM_{host,0}\)) is a poorly known parameter and depends on some factors, such as the type of galaxy and the inclination angle of the host galaxy. Therefore, in our analysis \(DM_{host,0}\) will be treated as a free parameter.

The IGM contribution depends on the redshift and can be written as [15]

$$\begin{aligned} DM_{IGM}(z)=\frac{3c\Omega _{b}H_{0}^{2}}{8\pi Gm_{p}} \int _{0}^{z} \frac{(1+z')f_{IGM}(z')\chi (z')}{H(z')}dz', \end{aligned}$$
(3)

where c, \(\Omega _{b}\), \(H_{0}\), G, \(m_{p}\), \(f_{IGM}(z)\), H(z) are, respectively, the speed of light, the present-day baryon density parameter, the Hubble constant, the gravitational constant, the proton mass, the baryon fraction in the IGM and the Hubble parameter at redshift z. Also, \(\chi (z) = Y_{H}\chi _{e,H}(z) + Y_{He}\chi _{e,He}(z)\) is the free electron number fraction per baryon, in which \(Y_{H} = 3/4\) and \(Y_{He} = 1/4\) are the mass fractions of hydrogen and helium, respectively, while \(\chi _{e,H}(z)\) and \(\chi _{e,He}(z)\) are the ionization fractions of hydrogen and helium, respectively. The hydrogen and helium are fully ionized at \(z < 3\) [27, 39], so that we have \(\chi _{e,H}(z) = \chi _{e,He}(z) = 1\).
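As an illustration, Eq. (3) can be evaluated numerically once a form for H(z) is chosen. The sketch below assumes a spatially flat \(\Lambda\)CDM expansion rate, which is an assumption of this example and not part of the model-independent method, together with the parameter values quoted later in Sect. 4. The function names, the trapezoidal integration and the `chi` argument (set to the value implied by the text, and exposed because conventions for the electron fraction vary) are all illustrative choices.

```python
import math

C = 2.99792458e8                    # speed of light [m/s]
G = 6.67430e-11                     # gravitational constant [m^3 kg^-1 s^-2]
M_P = 1.67262192e-27                # proton mass [kg]
MPC_M = 3.0856775814913673e22       # metres in one megaparsec
PC_CM3_M2 = 3.0856775814913673e22   # 1 pc cm^-3 expressed in m^-2


def hubble(z, h0_si, omega_m):
    """Flat LambdaCDM H(z) in s^-1 (assumed form, for illustration only)."""
    return h0_si * math.sqrt(omega_m * (1.0 + z) ** 3 + 1.0 - omega_m)


def dm_igm(z, f_igm=0.764, h0=74.03, omega_m=0.3153,
           omega_b_h2=0.02235, chi=1.0, n_steps=2000):
    """Eq. (3) for constant f_IGM and chi, returned in pc cm^-3."""
    if z == 0:
        return 0.0
    h0_si = h0 * 1.0e3 / MPC_M                  # km/s/Mpc -> s^-1
    omega_b = omega_b_h2 / (h0 / 100.0) ** 2
    prefactor = 3.0 * C * omega_b * h0_si ** 2 / (8.0 * math.pi * G * M_P)
    dz = z / n_steps
    total = 0.0
    for k in range(n_steps + 1):                # trapezoidal rule on [0, z]
        zk = k * dz
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * (1.0 + zk) * f_igm * chi / hubble(zk, h0_si, omega_m)
    return prefactor * total * dz / PC_CM3_M2


if __name__ == "__main__":
    for z in (0.1, 0.5, 1.0, 1.5):
        print(f"z = {z:.1f}  DM_IGM = {dm_igm(z):.1f} pc/cm^3")
```

The unit conversion uses the fact that 1 pc cm\(^{-3}\) corresponds to about \(3.09 \times 10^{22}\) m\(^{-2}\), so the SI result of the integral is divided by that factor.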

In [1], we presented a cosmological model-independent method, which solves the \(DM_{IGM}\) integral above by parts, identifying one of the terms as the luminosity distance (\(d_{L}\)). We also considered two parameterizations of the baryon fraction in terms of the redshift: a constant case, \(f_{IGM} (z) = f_{IGM,0}\) and a time-dependent case, \(f_{IGM} (z) = f_{IGM,0} + \alpha z/(1+z)\). For simplicity, in the present paper we consider only the constant case, for which Eq. (3) can be written as

$$\begin{aligned} DM_{IGM}(z) = A f_{IGM,0} \left[ \frac{d_{L}(z)}{c} - \int _{0}^{z} \frac{d_{L}(z')}{(1+z')c} dz' \right] , \end{aligned}$$
(4)

where \(A = \frac{3c\Omega _{b}H_{0}^{2}}{8\pi Gm_{p}}\).
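The step from Eq. (3) to Eq. (4) can be sketched as follows. We assume here a spatially flat universe, so that \(d_{L} = (1+z)\,d_{C}\), with \(d_{C}\) the comoving distance (a symbol introduced only for this sketch), together with constant \(f_{IGM}\) and \(\chi = 1\) as stated above. This is our reconstruction of the integration by parts, not a quotation from [1].

$$\begin{aligned} d_{C}(z)&\equiv c\int _{0}^{z}\frac{dz'}{H(z')} = \frac{d_{L}(z)}{1+z} \quad \Rightarrow \quad \frac{dz'}{H(z')} = \frac{1}{c}\,\frac{d\,d_{C}}{dz'}\,dz', \\ \int _{0}^{z}\frac{(1+z')}{H(z')}\,dz'&= \frac{1}{c}\Big [(1+z')\,d_{C}(z')\Big ]_{0}^{z} - \frac{1}{c}\int _{0}^{z} d_{C}(z')\,dz' = \frac{d_{L}(z)}{c} - \int _{0}^{z}\frac{d_{L}(z')}{(1+z')c}\,dz'. \end{aligned}$$

The boundary term at \(z' = 0\) vanishes because \(d_{C}(0) = 0\). Multiplying by \(A f_{IGM,0}\) gives Eq. (4).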

We also define \(DM_{ext}\) as the difference between the observed DM and its Galactic contribution, \(DM_{MW} = DM_{MW,ISM} + DM_{MW,halo}\),

$$\begin{aligned} DM_{ext}(z) \equiv DM_{obs}(z) - DM_{MW}\;, \end{aligned}$$
(5)

whereas the theoretical extragalactic dispersion measure (\(DM_{ext}^{th}\)) can be calculated using Eq. (1)

$$\begin{aligned} DM_{ext}^{th}(z) \equiv DM_{IGM}(z) + DM_{host}(z)\;. \end{aligned}$$
(6)

Thus, by using the above equations, we can compare theory and observations to constrain \(f_{IGM,0}\) and \(DM_{host,0}\). Following [1], the observational data points are obtained by combining the \(DM-z\) relation with \(d_{L}(z)\) estimates from SNe observations.

3 Data and methodology

There are 19 well-localized FRB events (for details of the FRB catalogue, see [40]). In our analysis, we exclude the events FRB 20191228, FRB 20190614D, FRB 20190520B and FRB 20181030A for the following reasons: FRB 20190614D [41] has no spectroscopic redshift measurement and can, in principle, be associated with two host galaxies; FRB 20190520B [42] has a host contribution much larger than the other FRBs; FRB 20191228 [43] has an uncertainty on the observed dispersion measure much larger than the others (\(\sigma _{obs} = 8\) pc/cm\(^{3}\)); and, finally, there are no SNe in the Pantheon catalogue with a redshift close to that of FRB 20181030A [44] (\(z = 0.0039\)).

The remaining sample contains 15 FRBs with well-measured redshift, which constitutes the most up-to-date FRB data set currently available [45,46,47,48,49,50,51,52,53], and is listed in Table 1 with the observed dispersion measure (\(DM_{obs}\)), the Galaxy contribution (\(DM_{MW, ISM}\)) estimated from the NE2001 model [36], and the uncertainty of \(DM_{obs}\) (\(\sigma _{obs}\)).

The observational quantity \(DM_{ext}\) (Eq. 5) can be obtained using data from Table 1 with its uncertainty calculated by the expression

$$\begin{aligned} \sigma _{ext}^{2} = \sigma _{obs}^{2} + \sigma _{MW}^{2}\ +\delta ^{2} \;, \end{aligned}$$
(7)

where the average Galactic uncertainty \(\sigma _{MW}\) is assumed to be 10 pc/cm\(^{3}\) [54] and \(\delta \) stands for the DM fluctuations due to the spatial variation in the cosmic electron density. Such fluctuations can be treated as a probability distribution or as a fixed value in the statistical analyses [14, 22, 55]. In this work, we will consider five different choices for the fluctuations, \(\delta = 0, 100, 200, 400\) and \(230\sqrt{z}\) pc/cm\(^{3}\), in agreement with recent results presented in the literature [1, 11].
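The error budget of Eq. (7) is a simple sum in quadrature, which the short sketch below makes explicit for both a fixed \(\delta \) and the redshift-dependent case \(230\sqrt{z}\). The function name and its signature are illustrative; only the value \(\sigma _{MW} = 10\) pc/cm\(^{3}\) and the choices of \(\delta \) come from the text.

```python
import math


def sigma_ext(sigma_obs, z, delta, sigma_mw=10.0):
    """Eq. (7): total uncertainty on DM_ext in pc cm^-3.

    `delta` is either a number (fixed fluctuation) or a callable of z,
    e.g. lambda z: 230.0 * math.sqrt(z).
    """
    d = delta(z) if callable(delta) else float(delta)
    return math.sqrt(sigma_obs ** 2 + sigma_mw ** 2 + d ** 2)


if __name__ == "__main__":
    choices = {
        "0": 0.0,
        "100": 100.0,
        "200": 200.0,
        "400": 400.0,
        "230 sqrt(z)": lambda z: 230.0 * math.sqrt(z),
    }
    # Hypothetical burst: sigma_obs = 2 pc/cm^3 at z = 0.3
    for label, delta in choices.items():
        print(label, round(sigma_ext(2.0, 0.3, delta), 2))
```

For any non-zero \(\delta \) considered here, the fluctuation term dominates over \(\sigma _{obs}\) and \(\sigma _{MW}\), which is why the choice of \(\delta \) drives the size of the final error bars.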

We obtain the luminosity distance in Eq. (4) from current SNe observations, specifically the Pantheon catalogue [56], which contains 1048 SNe within the redshift range \(0.01< z < 2.3\). The distance modulus (\(\mu (z)\)) is given by

$$\begin{aligned} \mu (z) = m_{B} - M_{B} = 5\log _{10}\left[ \frac{d_{L}(z)}{1\text{ Mpc }}\right] + 25 \;, \end{aligned}$$
(8)

where \(m_{B}\) and \(M_{B}\) are the apparent magnitude of the SNe and the absolute peak magnitude, respectively. In our analysis we fix \(M_{B} = -19.214 \pm 0.037\) mag [57] or, equivalently, \(H_{0} = 74.03 \pm 1.4\) km s\(^{-1}\) Mpc\(^{-1}\). To obtain estimates of \(d_{L}(z)\) at the redshifts of the FRBs, we perform a Gaussian Process (GP) reconstruction of the Pantheon data, using the GaPP Python library (for details of GaPP, see [58]). There are two free parameters (\(f_{IGM,0}\), \(DM_{host,0}\)) in \(DM_{ext}^{th}\) (Eq. 6), which will be constrained through a Markov Chain Monte Carlo (MCMC) analysis using the emcee package [59]. The results of our observational data analysis for \(\delta = 0, 100, 200, 400, 230\sqrt{z}\) pc/cm\(^{3}\) are displayed in Table 2.
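To make the use of the SNe data concrete, the sketch below inverts Eq. (8) to obtain \(d_{L}\) in Mpc and evaluates the bracket of Eq. (4) by trapezoidal integration on a redshift grid. It does not reproduce the GP reconstruction: the grid of \(d_{L}\) values is assumed to be already available (for instance from such a reconstruction), and the function names are illustrative.

```python
C_KM_S = 299792.458  # speed of light [km/s]


def dl_from_mu(mu):
    """Invert Eq. (8): luminosity distance in Mpc from the distance modulus."""
    return 10.0 ** ((mu - 25.0) / 5.0)


def eq4_bracket(z_grid, dl_grid):
    """Bracket of Eq. (4) at the last grid point, in Mpc / (km/s).

    z_grid must start at z = 0 and be increasing; dl_grid holds d_L in Mpc.
    Multiplying the result by H0 [km/s/Mpc] gives a dimensionless number.
    """
    integral = 0.0
    for i in range(1, len(z_grid)):
        f_prev = dl_grid[i - 1] / (1.0 + z_grid[i - 1])
        f_curr = dl_grid[i] / (1.0 + z_grid[i])
        integral += 0.5 * (f_prev + f_curr) * (z_grid[i] - z_grid[i - 1])
    return (dl_grid[-1] - integral) / C_KM_S


if __name__ == "__main__":
    # Toy check with d_L = (c / H0) z (1 + z), for which the bracket
    # times H0 equals z + z^2 / 2 exactly.
    h0 = 70.0
    z_grid = [0.0, 0.5, 1.0]
    dl_grid = [C_KM_S / h0 * z * (1.0 + z) for z in z_grid]
    print(h0 * eq4_bracket(z_grid, dl_grid))
```

\(DM_{IGM}\) then follows by multiplying the bracket by \(A f_{IGM,0}\) in consistent units.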

Table 2 Estimates of the \(f_{IGM}\) and \(DM_{host,0}\) from current observational data
Fig. 1 The normalized redshift distributions for FRBs

4 Simulations

To study the cosmological impact of a larger sample of FRBs than the one currently available, we perform a Monte Carlo (MC) simulation to generate random points of \(DM_{ext}\). The MC method requires a redshift distribution of FRBs to generate the points. This distribution is still uncertain because the progenitors of these events are unknown, and for this reason many models for the distribution of FRBs have been assumed. In reference [33], the authors studied the effects of nine different redshift distributions of FRBs on the constraints on cosmological parameters and found that three of them present strong constraining power. Thus, we will consider these three distributions, namely:

  • Gamma-Ray Bursts: Several studies assume the gamma-ray burst (GRB) distribution for FRBs due to the similarity between these two kinds of events [60]. The density function is written as

    $$\begin{aligned} P_\mathrm{{GRB}}(z) \propto z \exp {(-z)}. \end{aligned}$$
    (9)
  • Star Formation Rate: The star formation rate (SFR) distribution was proposed by [61] (see also reference [62] for the first proposal of a redshift distribution for FRBs). For young stellar FRB progenitors, the spatial distribution of FRBs is expected to closely trace the cosmic SFR. The cosmic SFR function can be written as

    $$\begin{aligned} \psi (z) = 0.015 \frac{(1+z)^{2.7}}{1+[(1+z)/2.9]^{5.6}} . \end{aligned}$$
    (10)
  • Uniform: The uniform distribution assumes that the FRB redshift density is constant between \(z_{min}\) and \(z_{max}\), and its density function is given by

    $$\begin{aligned} P_\mathrm{{Uniform}} = \frac{1}{z_{max} - z_{min}}. \end{aligned}$$
    (11)

For completeness, we also consider an additional distribution, where the FRBs redshifts are picked at equidistant points (ED) between \(z_{min}\) and \(z_{max}\).

In Fig. 1 we present the three redshift distribution models for FRBs. Since for \(z > 1.5\), the GP reconstruction of the Pantheon data overestimates the uncertainty values (given the small number of points in such interval), we will simulate data points in the \(0.022 \le z \le 1.5\) interval.
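A minimal sketch of how redshifts could be drawn from these distributions in the interval \(0.022 \le z \le 1.5\) is given below, using inverse-transform sampling on a tabulated cumulative distribution. Several choices here are assumptions of the sketch rather than statements about our pipeline: the SFR case takes the redshift density to be directly proportional to \(\psi (z)\) of Eq. (10), the equidistant (ED) points include both end points, and all function names are illustrative.

```python
import bisect
import math
import random

Z_MIN, Z_MAX = 0.022, 1.5


def p_grb(z):
    """Eq. (9), unnormalized."""
    return z * math.exp(-z)


def p_sfr(z):
    """Eq. (10), used directly as an unnormalized density (assumption)."""
    return 0.015 * (1.0 + z) ** 2.7 / (1.0 + ((1.0 + z) / 2.9) ** 5.6)


def p_uniform(z):
    """Eq. (11)."""
    return 1.0 / (Z_MAX - Z_MIN)


def sample_redshifts(pdf, n, rng, n_grid=2000):
    """Draw n redshifts from pdf on [Z_MIN, Z_MAX] by inverting a gridded CDF."""
    zs = [Z_MIN + (Z_MAX - Z_MIN) * k / n_grid for k in range(n_grid + 1)]
    cdf = [0.0]
    for k in range(1, n_grid + 1):      # cumulative trapezoidal integral
        cdf.append(cdf[-1] + 0.5 * (pdf(zs[k - 1]) + pdf(zs[k])) * (zs[k] - zs[k - 1]))
    draws = []
    for _ in range(n):
        u = rng.random() * cdf[-1]      # normalization handled here
        k = max(1, min(bisect.bisect_left(cdf, u), n_grid))
        frac = (u - cdf[k - 1]) / (cdf[k] - cdf[k - 1])
        draws.append(zs[k - 1] + frac * (zs[k] - zs[k - 1]))
    return draws


def equidistant_redshifts(n):
    """ED case: n >= 2 equally spaced points including both end points."""
    return [Z_MIN + (Z_MAX - Z_MIN) * k / (n - 1) for k in range(n)]


if __name__ == "__main__":
    rng = random.Random(42)
    for name, pdf in (("GRB", p_grb), ("SFR", p_sfr), ("Uniform", p_uniform)):
        z = sample_redshifts(pdf, 100, rng)
        print(name, round(sum(z) / len(z), 3))
```

Because the sampler normalizes the cumulative distribution internally, the unnormalized forms of Eqs. (9) and (10) can be passed as they are.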

Fig. 2 The results of our simulations for \(f_{IGM,0}\) and \(DM_{host,0}\). The data points represent the average values of these parameters for each distribution model discussed in the text, considering different sample sizes and values of the DM fluctuations

The steps of our simulations are the following:

  1. We generate random points using the redshift distribution models described above in the redshift range [0.022, 1.5]. We consider samples with N = 15, 30, 100 and 500 points.

  2. We calculate the fiducial \(DM_{ext}\) (\(DM_{ext}^{fid}\)) using Eq. (6), where \(DM_{IGM}\) is given by Eq. (3). We adopt the mean values of the baryon fraction and host contribution reported in [1] for the constant case, i.e., \(f_{IGM,0} = 0.764\) and \(DM_{host,0} = 158.1\) pc/cm\(^{3}\). In our simulations, we also adopt the values \(H_{0} = 74.03 \pm 1.4\) km s\(^{-1}\) Mpc\(^{-1}\) [57], \(\Omega _{m} = 0.3153 \) [63] and \(\Omega _{b}h^{2} = 0.02235 \pm 0.00037\) [64].

  3. We calculate the uncertainty of the simulated \(DM_{ext}\) (\(\sigma _{ext}^{sim}\)). The \(DM_{IGM}\) and \(DM_{host,0}\) uncertainties are not well constrained, so we obtain \(\sigma _{ext}^{sim}\) from a regression of the relative error of the observational data. Since the relative error decreases with z and cannot be negative, we describe it by a hyperbolic function, \(\eta = \sigma _{ext}^{obs}/DM_{ext}^{obs} = A/z \), where A is the free parameter of the hyperbolic regression (not to be confused with the constant A of Eq. 4).

  4. Finally, we calculate the simulated \(DM_{ext}\) by assuming a normal distribution, given by \(DM_{ext}^{sim} (z) = \mathcal {N}(DM_{ext}^{fid},sd)\). Here, sd represents the standard deviation of the Gaussian distribution, which is obtained from the average distance between the observed and fiducial points.
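The four steps above can be sketched as a single realization of the simulation. This is a minimal illustration under stated assumptions, not our production code: H(z) is taken as flat \(\Lambda\)CDM with the fiducial parameters of step 2, the redshifts are drawn from the Uniform model only, and the values of the regression parameter (`a_reg`) and of the Gaussian width (`sd`) are hypothetical placeholders, since their fitted values are not quoted in the text.

```python
import math
import random

C = 2.99792458e8                    # [m/s]
G = 6.67430e-11                     # [m^3 kg^-1 s^-2]
M_P = 1.67262192e-27                # [kg]
MPC_M = 3.0856775814913673e22       # metres per Mpc
PC_CM3_M2 = 3.0856775814913673e22   # 1 pc cm^-3 in m^-2


def dm_ext_fid(z, f_igm=0.764, dm_host0=158.1, h0=74.03,
               omega_m=0.3153, omega_b_h2=0.02235, n_steps=500):
    """Step 2: fiducial DM_ext = DM_IGM (Eq. 3) + DM_host (Eq. 2), in pc cm^-3."""
    h0_si = h0 * 1.0e3 / MPC_M
    omega_b = omega_b_h2 / (h0 / 100.0) ** 2
    # A / H0 converted to pc cm^-3; the integral below is then dimensionless
    pref = 3.0 * C * omega_b * h0_si / (8.0 * math.pi * G * M_P) / PC_CM3_M2
    dz = z / n_steps
    integral = 0.0
    for k in range(n_steps + 1):    # trapezoidal rule
        zk = k * dz
        weight = 0.5 if k in (0, n_steps) else 1.0
        integral += weight * (1.0 + zk) / math.sqrt(omega_m * (1.0 + zk) ** 3 + 1.0 - omega_m)
    return pref * f_igm * integral * dz + dm_host0 / (1.0 + z)


def simulate_sample(n, rng, a_reg=0.02, sd=50.0, z_min=0.022, z_max=1.5):
    """One realization: list of (z, DM_ext_sim, sigma_ext_sim) tuples."""
    sample = []
    for _ in range(n):
        z = rng.uniform(z_min, z_max)   # step 1 (Uniform model)
        fid = dm_ext_fid(z)             # step 2
        sigma = (a_reg / z) * fid       # step 3: eta = A / z (a_reg hypothetical)
        dm_sim = rng.gauss(fid, sd)     # step 4 (sd hypothetical)
        sample.append((z, dm_sim, sigma))
    return sample


if __name__ == "__main__":
    rng = random.Random(7)
    for z, dm, sig in simulate_sample(5, rng):
        print(f"z = {z:.3f}  DM_ext = {dm:.1f}  sigma = {sig:.1f}")
```

Swapping `rng.uniform` for a draw from the GRB or SFR densities changes only step 1; the remaining steps are identical for all redshift models.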

We perform the steps above 50 times for each sample size of the distribution models, which is enough to obtain convergence (see Supplementary material Appendix A). In each simulation, we calculate the free parameters while considering different values of DM fluctuations \(\delta = 0, 100, 200, 400, 230\sqrt{z}\) pc/cm\(^{3}\). Regarding the \(DM_{host,0}\), we assume in our MCMC analysis a Gaussian prior for this parameter, with the mean value and standard deviation being the best-fit values shown in Table 2. Subsequently, we calculate the average of each ensemble of 50 simulations.
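For concreteness, the quantity sampled in the MCMC analysis can be sketched as a Gaussian log-posterior with a Gaussian prior on \(DM_{host,0}\). The sketch is illustrative only: the flat prior on \(f_{IGM,0}\) in [0, 1], the positivity cut on \(DM_{host,0}\) and the argument `dm_igm_unit` (a user-supplied function returning \(DM_{IGM}\) for \(f_{IGM,0} = 1\), e.g. from Eq. 4) are assumptions of this example.

```python
import math


def log_posterior(theta, data, dm_igm_unit, host_mean, host_sd):
    """Log-posterior for (f_IGM0, DM_host0).

    data: iterable of (z, DM_ext, sigma_ext) tuples, with sigma_ext from Eq. (7).
    dm_igm_unit: callable z -> DM_IGM(z) evaluated with f_IGM0 = 1.
    """
    f_igm0, dm_host0 = theta
    if not (0.0 <= f_igm0 <= 1.0) or dm_host0 < 0.0:
        return -math.inf                            # outside the assumed priors
    logp = -0.5 * ((dm_host0 - host_mean) / host_sd) ** 2   # Gaussian prior
    for z, dm_ext, sigma in data:
        model = f_igm0 * dm_igm_unit(z) + dm_host0 / (1.0 + z)  # Eq. (6)
        logp += -0.5 * ((dm_ext - model) / sigma) ** 2
    return logp


if __name__ == "__main__":
    toy_igm = lambda z: 1000.0 * z                  # toy stand-in for Eq. (4)
    truth = (0.8, 150.0)
    data = [(z, truth[0] * toy_igm(z) + truth[1] / (1.0 + z), 50.0)
            for z in (0.1, 0.4, 0.9)]
    print(log_posterior(truth, data, toy_igm, 150.0, 50.0))
    print(log_posterior((0.5, 150.0), data, toy_igm, 150.0, 50.0))
```

A function of this form can be passed as the log-probability function to an ensemble sampler such as emcee's `EnsembleSampler`, with the remaining arguments supplied through its `args` option.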

Table 3 The results of our simulations for \(f_{IGM,0}\) and \(DM_{host,0}\) considering the distribution models discussed in the text

5 Results

The results of our simulations are displayed in Fig. 2 and Table 3. In Fig. 2, we present the 1\(\sigma \) error bars for the free parameters \(f_{IGM,0}\) and \(DM_{host,0}\), considering different redshift distributions and values of \(\delta = 0, 100, 200, 400, 230\sqrt{z}\) pc/cm\(^{3}\). Table 3 shows the numerical values obtained separately for all distributions and different numbers of points in each realization (\(N = 15, 30, 100, 500\)).

For all distributions (except for the samples with \(N =15\)), the constraints on \(f_{IGM,0}\) and \(DM_{host,0}\) are compatible within \(2\sigma \). Comparing the results of the simulations for \(N = 15\) with the results for the current observational data (which also comprise 15 points), we find that: (i) for \(\delta = 0\) pc/cm\(^{3}\), none of the distributions agrees with the data for \(f_{IGM,0}\) within \(2\sigma \); (ii) for \(DM_{host,0}\), the GRB, Uniform and ED distributions agree at \(2\sigma \), whereas the SFR distribution does not; (iii) for the other values of the DM fluctuations, the results from all redshift distributions are in agreement within \(1\sigma \) for both parameters \(f_{IGM,0}\) and \(DM_{host,0}\).

Finally, it is worth mentioning that the errors on the \(f_{IGM,0}\) and \(DM_{host,0}\) parameters depend on the number of points and on the DM fluctuations. Our results show that, for a given value of the DM fluctuations, the errors decrease as the number of points increases. Conversely, for a fixed number of points N, the errors increase with higher values of \(\delta \). Therefore, these results show that larger data samples, as expected from the next generation of surveys, play a crucial role in this kind of analysis, along with a better understanding of the DM fluctuations.

6 Conclusions

FRB observations have demonstrated a great potential to constrain cosmological parameters and test aspects of fundamental physics. In this context, although some of their astrophysical characteristics are still under debate, the growing significance of these transient events in cosmology is becoming apparent. Therefore, it is important to investigate the constraining power of upcoming FRB observations on physical and cosmological parameters to better understand their potential and limitations.

In this work, we investigated the impact of the DM fluctuations and of the number of FRB observations on the constraints on the parameters \(f_{IGM,0}\) and \(DM_{host,0}\) from simulated data, considering distinct probability distributions for the sources. Firstly, we performed a statistical analysis with 15 observational data points following the model-independent method presented in [1]. Our sample was defined from an original sample of 19 data points, from which we removed four sources for different reasons, e.g. discrepant values for the uncertainties or redshift incompatibility with the SNe catalogue. Secondly, we generated data sets from Monte Carlo simulations considering four redshift distributions, namely the Gamma-Ray Bursts, Star Formation Rate, Uniform and Equidistant distributions. The number of points in the analyses took the values \( N = 15, 30, 100, 500\), as expected from upcoming projects, whereas the DM fluctuations assumed the values \(\delta = 0, 100, 200, 400, 230\sqrt{z}\) pc/cm\(^{3}\).

The results showed an agreement within 2\(\sigma \) between the GRB, SFR, Uniform and ED distributions, regardless of the value of \(\delta \). In particular, our analysis highlighted the crucial role of the DM fluctuations in the results, which reinforces the need for more investigations into this quantity. As an example, for \(N = 100\), the number of localizations expected per year from ASKAP/CRACO [29], we found that the expected relative error for \(f_{IGM,0}\) varies from \(\sim 0.2\%\) (\(\delta = 0\) pc/cm\(^{3}\)) to \(6\%\) (\(\delta = 400\) pc/cm\(^{3}\)), and that for \(DM_{host,0}\) varies from \(\sim 2\%\) (\(\delta = 0\) pc/cm\(^{3}\)) to \(60\%\) (\(\delta = 400\) pc/cm\(^{3}\)) (see Table 3).

Finally, we would like to emphasize that the method and simulated data generated in our analysis can be used to forecast model-independent constraints on astrophysical and cosmological parameters, as reported in this paper, and investigate expected limits on the physical parameters of fundamental theories. Some applications are in progress and will appear in a future communication.