1 Introduction

Gamma-ray bursts (GRBs) are the most powerful explosions known in the Universe, with an emission peak in the 200–500 keV region, and the total isotropic energy released of the order \(10^{51}\mbox{--}10^{54}~\mbox{ergs}\) (for recent reviews, see Nakar 2007; Zhang 2011; Gehrels and Razzaque 2013; Berger 2014; Mészáros and Rees 2015). They are also one of the most distant astronomical objects discovered, with the highest known redshift of \(z\sim9.4\) measured for GRB090429B (Cucchiara et al. 2011). Mazets et al. (1981) first pointed out hints for a bimodal distribution of \(T_{b}\) (taken to be the time interval within which fall 80–90 % of the measured GRB’s intensity) drawn for 143 events detected in the KONUS experiment. Kouveliotou et al. (1993) also found a bimodal structure in the \(\log T_{90}\) distribution of 222 events from CGRO/BATSE, based on which GRBs are commonly divided into short (\(T_{90}<2~\mbox{s}\)) and long (\(T_{90}>2~\mbox{s}\)) classes, where \(T_{90}\) is the time interval from 5 % to 95 % of the accumulated fluence. While generally short GRBs are of merger origin (Nakar 2007) and long ones come from collapsars (Woosley and Bloom 2006), this classification is imperfect due to a large overlap in duration distributions of the two populations (Lü et al. 2010; Bromberg et al. 2011, 2013; Shahmoradi 2013; Shahmoradi and Nemiroff 2015; Tarnopolski 2015c). Horváth (1998) and Mukherjee et al. (1998) independently discovered a third peak in the duration distribution in the BATSE 3B catalog, located between the short and long groups, and the statistical existence of this intermediate class was claimed to be supported (Horváth 2002) with the use of BATSE 4B data. Interestingly, using clustering techniques, Chattopadhyay et al. (2007) established the optimal number of classes to be three, too. Also in Swift/BAT data evidence for a third component in \(\log T_{90}\) was announced (Horváth et al. 2008; Zhang and Choi 2008; Huja et al. 2009; Horváth et al. 2010; Zitouni et al. 2015). Other datasets, i.e. RHESSI (Řípa et al. 2009) and BeppoSAX (Horváth 2009), are both in agreement with earlier results regarding the bimodal distribution, and the detection of a third component was established on a lower, compared to BATSE and Swift, significance level. Thence, four different satellites provided hints about the existence of a third class of GRBs.

Those conclusions were based on the finding that a mixture of three standard Gaussians (a 3-G) is a better fit than a mixture of two Gaussians (a 2-G). This is not surprising, because adding parameters to a nested model always results in a better fit (in the sense of a lower \(\chi^{2}\) or a higher maximum log-likelihood ℒ) due to more freedom given to the model to follow the data. The important questions are whether this improvement is statistically significant, can the three components be related to physically distinct classes, and whether the model is an appropriate one—is there a model that is a better fit? (See Tarnopolski 2015a, 2015b for a discussion.) However, even quantifying the relative improvement via \(p\)-valuesFootnote 1 is not a definite detection of another physical class of astronomical objects. All of the post-BATSE 3B fits were bimodal, not trimodal, even if comprised of three components. The third peak in the BATSE 3B sample (Horváth 1998) was smeared out with the BATSE 4B catalog when more data was gathered (see Fig. 5 in Zitouni et al. 2015). It was suggested by Zitouni et al. (2015) that the duration distribution corresponding to the collapsar scenario might not necessarily be symmetrical because of a non-symmetrical distribution of envelope masses of the progenitors. Specifically, it was shown by Tarnopolski (2015a) that the \(\log T_{90}\) distribution of GRBs detected by Fermi is also bimodal for several binnings. Moreover, a number of intrinsically skewed distributions were fitted to the data of BATSE, Swift and Fermi (Tarnopolski 2015b), and it was found that mixtures of two skewed components follow the data at least as good (BATSE and Swift), or better (Fermi) than a conventionally used 3-G, and that they are bimodal as well (in the sense of having two local maxima; Schilling et al. 2002). Generally, \(n\)-modality is commonly associated with \(n\) populations underlying a distribution. Hence, the existence of an intermediate GRB class is unlikely.

The analysis of the observed durations was performed by many authors, as reviewed above. However, the intrinsic duration—the one in the rest frame—of a GRB is affected by its cosmological distance, and is shorter than the observed one:

$$ T^{\mathrm{int}}_{90}=\frac{T^{\mathrm{obs}}_{90}}{1+z}. $$
(1)

Considering the median redshift of long GRBs, \(\tilde{z}_{\mathrm{long}}\approx2\), it is evident that GRBs with \(T^{\mathrm{obs}}_{90}\lesssim6~\mbox{s}\) have an intrinsic duration generally smaller than 2 s, which makes them short ones. Note that the classification of short GRBs is the same in both the observer and rest frames. The analysis of the \(T^{\mathrm{int}}_{90}\) distribution was performed rarely due to a small number of GRBs with measured redshift: Zhang and Choi (2008) examined 95, Huja et al. (2009) analyzed 130, and Zitouni et al. (2015) investigated 248 Swift GRBs. While Zhang and Choi (2008) focused on the apparent bimodality, and Huja et al. (2009) did not translate the observed durations to the rest frame, Zitouni et al. (2015) found that a 3-G follows the Swift data better than a 2-G (in observer as well as in the rest frame; see their Figs. 6 and 7). However, in both frames the distributions were bimodal, yet apparently skewed, and hence the existence of an intermediate class is still unlikely. The plausible explanation of this phenomenon is that there are two GRB classes with intrinsically non-symmetrical duration distributions.

The aim of this article is to perform a statistical analysis of the GRBs with measured redshift in order to test against the existence of the intermediate GRB class. Mixtures of various distributions (standard Gaussians, skew-normal, sinh-arcsinh and alpha-skew-normal) are applied to verify whether the statistical significance of a three-Gaussian fit might by challenged by a mixture of skewed distributions with only two components. Both the observed and intrinsic durations are examined.

This article is organized as follows. In Sect. 2 the dataset, fitting methods and the properties of the examined distributions are described as outlined by Tarnopolski (2015b). In Sect. 3 the study of the sample of all GRBs with measured redshift is presented. This is followed by an analysis of Swift GRBs with known redshift in Sect. 4. Section 5 is devoted to discussion, and in Sect. 6 concluding remarks are given.

2 Data and methods

2.1 Dataset

A sample of 408 GRBs with measured both the observed durations \(T^{\mathrm{obs}}_{90}\) and redshifts \(z\) is used.Footnote 2 It contains 334 GRBs detected by Swift, constituting the second sample examined herein. The sample of all GRBs consists of 386 long GRBs and 22 short ones. The latter all come from Swift observations, except one that was detected by HETE (GRB040924). A scatter plot of the data on a redshift–logarithm of duration plane is drawn in Fig. 1. The median redshifts for short and long GRBs are equal to \(\tilde{z}_{\mathrm{short}}=0.72\) and \(\tilde{z}_{\mathrm{long}}=1.76\), respectively. The intrinsic durations are calculated according to Eq. (1). Distributions of the \(\log T_{90}\) for the observed and intrinsic durations are examined hereinafter, and are displayed in Fig. 2 for the sample of all GRBs.

Fig. 1
figure 1

A scatter plot of the redshifts versus the observed durations. Vertical dotted line marks the limitting value of 2 s between short and long GRBs, and the horizontal dashed lines denote the medians of the respective classes, with values written in the plot. All GRBs with known both \(z\) and \(T^{\mathrm{obs}}_{90}\) are shown

Fig. 2
figure 2

Distributions of the observed (dashed red) and intrinsic (dotted blue) durations in the sample of all (408) GRBs

2.2 Fitting method

Two standard fitting techniques are commonly applied: \(\chi^{2}\) fitting and maximum likelihood method (ML). For the first, data needs to be binned, and despite various binning rules are known (e.g. Freedman-Diaconis, Scott, Knuth etc.), they still leave place for ambiguity, as it might happen that the fit may be statistically significant on a given significance level for a number of binnings (Huja and Řípa 2009; Koen and Bere 2012; Tarnopolski 2015a). The ML method is not affected by this issue and is therefore applied herein. However, for display purposes, the binning was chosen based on the Freedman-Diaconis rule.

Having a distribution with a probability density function (PDF) given by \(f=f(x;\theta)\) (possibly a mixture), where \(\theta= \{\theta_{i} \}_{i=1}^{p}\) is a set of \(p\) parameters, the log-likelihood function is defined as

$$ \mathcal{L}_{p}(\theta)=\sum\limits _{i=1}^{N} \ln f(x_{i};\theta), $$
(2)

where \(\{x_{i} \}_{i=1}^{N}\) are the datapoints from the sample to which a distribution is fitted. The fitting is performed by searching a set of parameters \(\hat{\theta}\) for which the log-likelihood is maximized (Kendall and Stuart 1973). When nested models are considered, the maximal value of the log-likelihood function, \(\mathcal{L}_{\max}\equiv\mathcal{L}_{p}(\hat{\theta})\), increases when the number of parameters \(p\) increases.

For nested as well as non-nested models, the Akaike information criterion (\(\mathit{AIC}\)) (Akaike 1974; Burnham and Anderson 2004; Liddle 2007; Tarnopolski 2015b) may be applied. The \(\mathit{AIC}\) is defined as

$$ \mathit{AIC}=2p-2\mathcal{L}_{\max}. $$
(3)

A preferred model is the one that minimizes \(\mathit{AIC}\). The formulation of \(\mathit{AIC}\) penalizes the use of an excessive number of parameters, hence discourages overfitting. It prefers models with fewer parameters, as long as the others do not provide a substantially better fit. The expression for \(\mathit{AIC}\) consists of two competing terms: the first measuring the model complexity (number of free parameters) and the second measuring the goodness of fit (or more precisely, the lack of thereof). Among candidate models with \(\mathit {AIC}_{i}\), let \(\mathit{AIC}_{\min}\) denote the smallest. Then,

$$ \Pr\nolimits_{i}=\exp \biggl(-\frac{\Delta_{i}}{2} \biggr), $$
(4)

where \(\Delta_{i}=\mathit{AIC}_{i}-\mathit{AIC}_{\min}\), can be interpreted as the relative (compared to \(\mathit{AIC}_{\min}\)) probability that the \(i\)-th model minimizes the \(\mathit{AIC}\).Footnote 3

The \(\mathit{AIC}\) is suitable when \(N/p\) is large, i.e. when \(N/p>40\) (Burnham and Anderson 2004, see also references therein). When this condition is not fulfilled, a second order bias correction is introduced, resulting in a small-sample version of the \(\mathit{AIC}\), called \(\mathit{AIC}_{c}\):

$$ \mathit{AIC}_{c}=2p-2\mathcal{L}_{\max}+\frac{2p(p+1)}{N-p-1}. $$
(5)

The relative probability is computed similarly to when \(\mathit{AIC}\) is used, i.e. Eq. (4) is valid when one takes \(\Delta_{i}=\mathit{AIC}_{c,i}-\mathit{AIC}_{c,\min}\). Thence,

$$ \Pr\nolimits_{i}=\exp \biggl(-\frac{\mathit{AIC}_{c,i}-\mathit{AIC}_{c,\min }}{2} \biggr). $$
(6)

It is important to note that this method allows to choose a model that is best among the chosen ones, but does not allow to state that this model is the best among all possible. Hence, the probabilities computed by means of Eq. (6) are the relative, with respect to a model with \(\mathit{AIC}_{c,\min}\), probabilities that the data is better described by a model with \(\mathit{AIC}_{c,i}\). What is essential in assessing the goodness of a fit in the \(\mathit{AIC}\) method is the difference, \(\Delta_{i}=\mathit{AIC}_{c,i}-\mathit{AIC}_{c,\min}\), not the absolute values of the \(\mathit{AIC}_{c,i}\).Footnote 4 If \(\Delta_{i}<2\), then there is substantial support for the \(i\)-th model, and the proposition that it is a more proper description is highly probable. If \(2<\Delta_{i}<4\), then there is strong support for the \(i\)-th model. When \(4<\Delta_{i}<7\), there is considerably less support, and models with \(\Delta_{i}>10\) have essentially no support (Burnham and Anderson 2004; Biesiada 2007).

2.3 Distributions and their properties

In nearly all researches conducted so far on the GRB duration distribution, three components were found to describe the observed distribution statistically better than a mixture of two components. However, in all previous analyses a mixture of standard (non-skewed) Gaussians was fitted. This might possibly lead to erroneous conclusions, as describing a non-symmetrical distribution by a mixture of symmetrical components will eventually lead to overfitting (some of the two-component skewed distributions considered below are characterized by fewer free parameters than a standard three-Gaussian). Moreover, Zitouni et al. (2015) suggested that the duration distribution of long GRBs might not necessarily be symmetrical because of a non-symmetrical distribution of envelope masses of the progenitors. Since McBreen et al. (1994) observed that the distribution of \(\log T_{90}\) may be in form of a mixture of standard Gaussians, many authors followed this approach and also restrained the analysis to non-skewed normal distributions (Koshut et al. 1996; Kouveliotou et al. 1996; Horváth 1998, 2002; Horváth et al. 2008, 2010; Horváth 2009; Zhang and Choi 2008; Huja et al. 2009; Huja and Řípa 2009; Řípa et al. 2009; Koen and Bere 2012; Barnacka and Loeb 2014; Tarnopolski 2015a). Therefore, in light of the suggestion of Zitouni et al. (2015) that the \(T_{90}\) distributions underlying the two well-established GRB classes (Kouveliotou et al. 1993; Woosley and Bloom 2006; Nakar 2007) may not be symmetrical (Tarnopolski 2015a), the following distributions are considered herein.

A mixture of \(k\) standard normal (Gaussian) \(\mathcal{N}(\mu,\sigma^{2})\) distributions:

$$\begin{aligned} f^{(\mathcal{N})}_{k}(x) &= \sum\limits _{i=1}^{k} A_{i} \varphi \biggl(\frac {x-\mu_{i}}{\sigma_{i}} \biggr) \\ &= \sum\limits _{i=1}^{k} \frac{A_{i}}{\sqrt{2\pi}\sigma_{i}}\exp \biggl(- \frac {(x-\mu_{i})^{2}}{2\sigma_{i}^{2}} \biggr), \end{aligned}$$
(7)

being described by \(p=3k-1\) free parameters: \(k\) pairs \((\mu_{i},\sigma_{i})\) and \(k-1\) weights \(A_{i}\), satisfying \(\sum_{i=1}^{k} A_{i}=1\). Skewness of each component is \(\gamma_{1}^{(\mathcal{N})}=0\).

A mixture of \(k\) skew normal (SN) distributions (O’Hagan and Leonard 1976; Azzalini 1985):

$$\begin{aligned} f^{(\mathit{SN})}_{k}(x) &= \sum\limits _{i=1}^{k} 2A_{i}\varphi \biggl(\frac{x-\mu _{i}}{\sigma_{i}} \biggr)\varPhi \biggl( \alpha_{i}\frac{x-\mu_{i}}{\sigma_{i}} \biggr) \\ &= \sum\limits _{i=1}^{k} \frac{2A_{i}}{\sqrt{2\pi}\sigma_{i}}\exp \biggl(- \frac {(x-\mu_{i})^{2}}{2\sigma_{i}^{2}} \biggr) \\ &\quad{}\times\frac{1}{2} \biggl[1+\textrm{erf} \biggl( \alpha_{i}\frac{x-\mu _{i}}{\sqrt{2}\sigma_{i}} \biggr) \biggr], \end{aligned}$$
(8)

described by \(p=4k-1\) parameters. Skewness of an SN distribution is

$$\gamma_{1}^{(\mathit{SN})}=\frac{4-\pi}{2}\frac{ (\zeta\sqrt{2/\pi} )^{3}}{ (1-2\zeta^{2}/\pi )^{3/2}}, $$

where \(\zeta=\frac{\alpha}{\sqrt{1+\alpha^{2}}}\), hence the skewness \(\gamma_{1}^{(\mathit{SN})}\) is solely based on the shape parameter \(\alpha\), and is limited to the interval \((-1,1 )\). The mean is given by \(\mu+\sigma\zeta\sqrt{\frac{2}{\pi}}\). When \(\alpha=0\), the SN distribution is reduced to a standard Gaussian, \(\mathcal{N}(\mu,\sigma^{2})\), due to \(\varPhi(0)=1/2\).

A mixture of \(k\) sinh-arcsinh (SAS) distributions (Jones and Pewsey 2009):

$$\begin{aligned} &f^{(\mathit{SAS})}_{k}(x) \\ &\quad= \sum\limits _{i=1}^{k} \frac{A_{i}}{\sigma_{i}} \biggl[1+ \biggl(\frac {x-\mu_{i}}{\sigma_{i}} \biggr)^{2} \biggr]^{-\frac{1}{2}} \\ &\qquad{} \times\beta_{i} \cosh \biggl[\beta_{i} \sinh^{-1} \biggl(\frac{x-\mu _{i}}{\sigma_{i}} \biggr)-\delta_{i} \biggr] \\ &\qquad{} \times\exp \biggl[-\frac{1}{2}\sinh \biggl[\beta_{i} \sinh^{-1} \biggl(\frac{x-\mu_{i}}{\sigma_{i}} \biggr)-\delta_{i} \biggr]^{2} \biggr], \end{aligned}$$
(9)

being described by \(p=5k-1\) parameters. It turns out that skewness of the SAS distribution increases with increasing \(\delta\), positive skewness corresponding to \(\delta>0\). Tailweight decreases with increasing \(\beta\), \(\beta<1\) yielding heavier tails than the normal distribution, and \(\beta>1\) yielding lighter tails. With \(\delta=0\) and \(\beta=1\), the SAS distribution reduces to a standard Gaussian, \(\mathcal{N}(\mu,\sigma^{2})\). Skewness of a SAS distribution is

$$\gamma_{1}^{(\mathit{SAS})}=\frac{1}{4} \biggl[ \sinh \biggl( \frac{3\delta}{\beta } \biggr)P_{3/\beta} - 3\sinh \biggl(\frac{\delta}{\beta} \biggr)P_{1/\beta } \biggr], $$

where

$$P_{q}=\frac{e^{1/4}}{\sqrt{8\pi}} \bigl[ K_{(q+1)/2}(1/4) + K_{(q-1)/2}(1/4) \bigr]. $$

Here, \(K\) is the modified Bessel function of the second kind. The mean is given by \(\mu+\sigma\sinh(\delta/\beta)P_{1/\beta}\).

A mixture of \(k\) alpha-skew-normal (ASN) distributions (Elal-Olivero 2010):

$$\begin{aligned} f^{(\mathit{ASN})}_{k}(x)&=\sum\limits _{i=1}^{k} A_{i}\frac{ (1-\alpha_{i}\frac{x-\mu _{i}}{\sigma_{i}} )^{2}+1}{2+\alpha_{i}^{2}} \varphi \biggl(\frac{x-\mu _{i}}{\sigma_{i}} \biggr) \\ &= \sum\limits _{i=1}^{k} A_{i}\frac{ (1-\alpha_{i}\frac{x-\mu_{i}}{\sigma_{i}} )^{2}+1}{2+\alpha_{i}^{2}} \\ &\quad {}\times\frac {1}{\sqrt{2\pi}\sigma_{i}}\exp \biggl(-\frac{(x-\mu_{i})^{2}}{2\sigma_{i}^{2}} \biggr), \end{aligned}$$
(10)

described by \(p=4k-1\) parameters. Skewness of an ASN distribution is

$$\gamma_{1}^{(\mathit{ASN})}=\frac{12\alpha^{5}+8\alpha^{3}}{(3\alpha^{4}+4\alpha^{2}+4)^{3/2}}, $$

and is limited to the interval \((-0.811,0.811)\). The mean is given by \(\mu-\frac{2\alpha\sigma}{2+\alpha^{2}}\). For \(\alpha\in(-1.34,1.34)\) the distribution is unimodal, and bimodal otherwise.

3 Study of the complete \(z\) sample

The biggest number of free parameters among the examined PDFs is \(p=14\) in the case of a 3-SAS. Combined with \(N=408\) for all GRBs, or \(N=334\) for the Swift subsample, this implies \(N/p<40\). Therefore, in what follows the \(\mathit{AIC}_{c}\) is used instead of the \(\mathit{AIC}\).

First, the sample of 408 GRBs with measured both redshift and duration is examined. The PDFs, given by Eqs. (7)–(10), with \(k=2\) or 3, are fitted to the \(\log T_{90}\) distributions using the ML method from Sect. 2.2. Next, the \(\mathit{AIC}_{c}\) is calculated according to Eq. (5). The best fit among the examined is the one that yields the smallest \(\mathit{AIC}_{c}\).

The results of the fitting procedure applied to the observed durations are gathered in Table 1, and the fits, in graphical form, are displayed in Fig. 3. Contrary to previous researches, all of the three-component PDFs (3-G, 3-SN, and 3-SAS, where the support for the latter is weak) are trimodal, and the third peak is located in the area of the presumed intermediate GRB class, i.e. within the range 2–10 s. To assess its significance more easily, the \(\mathit{AIC}_{c}\) and relative probabilities are plotted in Fig. 4. The PDF with minimal \(\mathit{AIC}_{c}\) is a conventional 3-G, and the second best fit is a 3-SN, with a relative probability of 90.6 %. A 2-SN, however, has substantial support, too, due to \(\Delta_{\text{2-SN}}=1.393\). The remaining two-component fits (2-G and 2-SAS), as well as a 1-ASN, yield a strong support having \(2<\Delta_{i}<4\), but the evidence is weaker than for the former three models. The remaining, 3-SAS and 2-ASN, have considerably less or no support.

Fig. 3
figure 3

Distributions fitted to \(\log T^{\mathrm{obs}}_{90}\) data of all GRBs. Color dashed curves are the components of the (black solid) mixture distribution. The panels show a mixture of (a) two standard Gaussians (2-G), (b) three standard Gaussians (3-G), (c) two skew-normal (2-SN), (d) three skew-normal (3-SN), (e) two sinh-arcsinh (2-SAS), (f) three sinh-arcsinh (3-SAS), (g) one alpha-skew-normal (1-ASN), and (h) two alpha-skew-normal (2-ASN) distributions

Fig. 4
figure 4

\(\mathit{AIC}_{c}\) and relative probability, Pr, for the models examined and for observed durations

Table 1 Parameters of the fits for the observed durations. Label corresponds to labels from Figs. 3 and 4. The smallest \(\mathit{AIC}_{c}\) is marked in bold, and \(p\) is the number of parameters in a model

The picture revealed by the rest frame duration distribution, \(T^{\mathrm{int}}_{90}\), is different. As displayed in Fig. 5, the 3-SN and 3-SAS are also trimodal, and the 3-G, with the durations being systematically shifted left-wards comparing with the observed durations, lost its third peak, leaving a bimodal distribution with a prominent shoulder in the area of the presumed intermediate GRBs. The parameters of the fits are gathered in Table 2. A remarkably different picture, compared to the result of the \(T^{\mathrm{obs}}_{90}\) analysis, follows from the \(\mathit{AIC}_{c}\) plot in Fig. 6. It turns out that the intrinsic durations are best described by a conventional 2-G; the second best model is a 3-G, having a relative probability of 17.1 %. The other models have considerably less or almost no support. This suggests that the intrinsic \(T_{90}\) distribution may be indeed bimodal.

Fig. 5
figure 5

The same as Fig. 3, but for \(\log T^{\mathrm {int}}_{90}\)

Fig. 6
figure 6

The same as Fig. 4, but for intrinsic durations

Table 2 Parameters of the fits for the intrinsic durations. Label corresponds to labels from Figs. 5 and 6. The smallest \(\mathit{AIC}_{c}\) is marked in bold, and \(p\) is the number of parameters in a model

To conclude this Section, the \(T_{90}^{\mathrm{obs}}\) distribution is possibly trimodal, and in the rest frame, due to the properties of Eq. (2), it turns into a bimodal.

4 Study of the Swift subsample

The reason for examining separately a smaller sample of 334 GRBs detected only by Swift is the fact that the \(T_{90}\) distributions and other features, e.g. sensitivity in different energy bands, are detector dependent (e.g., Horváth 2009; Horváth et al. 2006, 2010; Huja et al. 2009; Řípa et al. 2009; Bromberg et al. 2013; Zitouni et al. 2015; Tarnopolski 2015b, 2015c), and thus the sample examined in the previous Sect. 3 might be biased. The majority of GRBs with known redshift comes from Swift, and hence one might consider the detections made by other satellites a contamination that falsifies the outcome. Therefore, it is desired to investigate a sample in which all observations were made by the same instrument.

The analysis is performed in the same way as it was done in Sect. 3. Again, the observed durations are examined first. A striking difference is that all of the fits are at most bimodal, and unimodal when a 2-G, 1-ASN or 2-ASN is considered. The parameters of the fits are gathered in Table 3, and the fitted curves are displayed in Fig. 7. The uni- or bimodality is consistent with previous analyses performed on more complete samples of Swift GRBs (Horváth et al. 2008; Huja et al. 2009; Zitouni et al. 2015; Tarnopolski 2015b), and the curves for three-component fits show a prominent shoulder on the left-hand side of the peak related to long GRBs.

Fig. 7
figure 7

The same as Fig. 3, but for \(\log T^{\mathrm {obs}}_{90}\) of the Swift subsample

Fig. 8
figure 8

The same as Fig. 4, but for observed durations and Swift GRBs

Table 3 Parameters of the fits for the observed Swift durations. Label corresponds to labels from Figs. 7 and 8. The smallest \(\mathit{AIC}_{c}\) is marked in bold, and \(p\) is the number of parameters in a model

The \(\mathit{AIC}\) indicates that the distribution of \(\log T_{90}^{\mathrm{obs}}\) is best described by a 3-G. The next two best fits, a 1-ASN and a 2-SN, have a \(\Delta_{i}<2\), and hence yield strong support in their favor. Next, a 2-G has a relative probability of 24.3 % of being a more proper model. The remaining models have considerably less support. It follows that in case of the observed durations one cannot discern reliably the best description among a one- or two-component PDFs, what is also consistent with the previous analyses, as the Swift detection rate is heavily biased towards long GRBs (the ratio of short to long GRBs is \({<}1:14\)), hence the sample is strongly dominated by long GRBs. Hence, combined with the relatively low number of redshift-equipped GRBs, it appears that due to this domination is unambiguous, in terms of modality, classification of the \(T_{90}^{\mathrm{int}}\) distribution is uncertain.

When the intrinsic durations are considered, there appear some trimodal fits (see Fig. 9). Surprisingly, the model with the lowest \(\mathit{AIC}\) is a bimodal 2-ASN, while the second best fit is achieved by a (also bimodal) 3-G, having a relative probability of 13.9 %. The remaining models have significantly less support (compare with Table 4 and Fig. 10). The 2-ASN consists of a bimodal and a unimodal component. The former consists of two peaks with comparable height, and is visually very symmetric. The latter is skewed, with its mode placed near the peak of the bimodal component that corresponds to long GRBs. Hence, the overall role of the unimodal component is to rescale the bimodal one in a nonlinear way in order to follow the data. The structure of this fit is unusual and unexpected, as in the previous samples the 2-ASN model did not perform very well, being one of the worst fits.

Fig. 9
figure 9

The same as Fig. 3, but for \(\log T^{\mathrm {int}}_{90}\) and for Swift GRBs

Fig. 10
figure 10

The same as Fig. 4, but for intrinsic durations and the Swift GRBs

Table 4 Parameters of the fits for the intrinsic Swift durations. Label corresponds to labels from Figs. 9 and 10. The smallest \(\mathit{AIC}_{c}\) is marked in bold, and \(p\) is the number of parameters in a model

5 Discussion

Generally, inferring an existence or lack of thereof, based on statistical evidence, must be done with care. Having samples with limited size adds difficulty to such an assessment, as in small samples there is more room for statistical fluctuations that might obscure the global picture. Previous researches, cited in Sect. 1, mostly imply that a 3-G fit is a better descriptive model than a 2-G. Nevertheless, the fits achieved were bimodal, indicating the presence of only two GRB classes (Tarnopolski 2015a). A remarkable exception was the BATSE 3B dataset (Horváth 1998), where the third peak had a negligible probability of \(10^{-4}\) to be a chance occurrence. It turned out, however, that a bigger dataset obtained by the same instrument did not reveal its presence anymore (Horváth 2002). Examining the observed, instead of intrinsic, durations might also cast doubts on the reality of the observed phenomenon. Having that in mind, it is tempting to state that the intermediate GRB class is unlikely to be a real class based on the analysis of 408 GRBs with known both \(T_{90}\) and redshift. This statement could be justified with the results presented in Sect. 3, where the two best models to describe the \(\log T^{\mathrm{obs}}_{90}\) distribution were trimodal, but after moving to the rest frame, the most plausible description was provided with a conventional 2-G. It may appear that the intrinsic durations should trace the physical context of the GRBs more appropriately.

On the other hand, the GRB characteristics are not only sample-dependent, as showed above, but also detector-dependent (e.g., Horváth 2009; Horváth et al. 2006, 2010; Huja et al. 2009; Řípa et al. 2009; Bromberg et al. 2013; Zitouni et al. 2015; Tarnopolski 2015b, 2015c). Therefore, lacking a dataset numerous enough for the statistics to provide a convincing proof, one may only claim evidence, or its lack, in a specific sample under consideration (see also Tarnopolski 2015a, 2015b). To get rid of the detector-dependency, only 334 GRBs as detected by Swift were examined. The outcome of this analysis, shown in Sect. 4, is surprisingly inconsistent with the one from a bigger sample in Sect. 3, being at the same time consistent with previous analyses performed on a bigger sample of Swift GRBs—the obtained fits are all uni- or bimodal, and the one with the lowest \(\mathit{AIC}_{c}\) is a bimodal 3-G; the next best fits were a unimodal 1-ASN and a bimodal 2-SN. Both yield strong evidence in their favor, so it is not possible to unambiguously infer the number of components, or even the modality of the Swift sample.Footnote 5

After moving to the rest frame, the problems are not solved, especially that the best fit now is a 2-ASN. The problem with this distribution is that it consists of a bimodal component (see Fig. 9(h)), with locations of its peaks in agreement with the groups of short and long GRBs. It seems like the role of the second component here is to merely adjust the height of the fit. The second best fit, a bimodal 3-G, has a relative probability of 13.9 %. While this is a statistically valid result, meaning that among the examined distributions the 2-ASN is best balanced between the goodness of fit and the number of parameters, from the physical point of view, regarding the knowledge about GRBs, this result is an unrealistic one, as the short and long GRBs are known to stem from different progenitors, mergers and collapsars, respectively. Even after dismissing the 2-ASN, differentiating between a 3-G and 1-ASN is not possible in the framework of the \(\mathit{AIC}_{c}\), as these two models yield \(\Delta_{i}=0.535\). Hence, the currently available redshift distribution unfortunately does not allow to infer the existence of the intermediate GRBs class reliably, likely due to the smallness of the sample.

6 Conclusions

The research conducted so far on different samples of GRB duration distributions indicate that a 3-G follows the data better than a 2-G. However, even with three components, the fitted distribution is usually bimodal, implying two physical classes. Because a two-component mixture of skewed distributions was shown to be a statistically better fit, in case of Fermi/GBM observations, than the commonly applied 3-G (Tarnopolski 2015b), in this paper the same approach was undertaken to investigate the modality and goodness of fit in case of GRBs with measured both redshift and duration. The reason for this is that in the rest frame the effects of cosmological factors are mostly eliminated, hence it is expected that it will provide an insight into the properties of GRBs.

It was found that in a sample of 408 GRBs with known redshift, the best fits—3-G and 3-SN—are trimodal (in the sense of having three local maxima), but after moving to the rest frame, a (unimodal) 2-G yielded considerably stronger support than any other examined distribution. However, this sample is dominated by detections made by Swift/BAT (334 events, ≈82 % of the total number of GRBs with measured \(z\)), and hence this finding might be affected by the fact that GRB properties are detector-dependent. Therefore, the Swift/BAT subsample was also examined, and it turned out that it is not possible to reliably infer the best fit within the chosen information-theoretic framework (\(\mathit{AIC}_{c}\) in this work). This may be caused by the smallness of the sample, and so the solution is to, hopefully, repeat the analysis in the future on a wider GRB sample (see also Zhang and Xie 2007). Because the mathematical model of the observed as well as intrinsic durations is still lacking, the physical interpretation of the results obtained herein is limited. The distribution of intrinsic durations, being systematically shifted towards shorter values, while may be believed to trace the properties of GRB population more accurately, is also affected by statistical fluctuations. Considering the Swift subsample, the distributions are strongly dominated by long GRBs, what might cause introduction of biases in the analysis undertaken.