1 Introduction

Statistical distributions are commonly applied to describe real-world phenomena. Because of this usefulness, the theory of statistical distributions is widely studied and new distributions continue to be developed. The interest in developing more flexible statistical distributions remains strong in the statistics profession. Many generalized classes of distributions have been developed and applied to describe various phenomena. A common feature of these generalized distributions is that they have additional parameters. Johnson et al. [19] stated that the use of four-parameter distributions should be sufficient for most practical purposes. According to these authors, at least three parameters are needed, but they doubted any noticeable improvement would arise from including a fifth or sixth parameter.

The Pearson system of continuous distributions, as developed by Pearson [31], is a system for which every probability density function (p.d.f.) \(f(x)\) satisfies a differential equation of the form

$$\begin{aligned} \frac{1}{f(x)}\frac{df(x)}{dx}=\frac{a+x}{{{b}_{0}}+{{b}_{1}}x+{{b}_{2}}{{x}^{2}}}, \end{aligned}$$
(1.1)

where \(a, b_0, b_1\), and \({{b}_{2}}\) are the parameters (see Johnson et al. [19, Chapter 12]). The shape of the function \(f(x)\) depends on these parameters. Pearson classified the different shapes of the distribution into a number of types, which correspond to the different forms of solution of (1.1). The form of solution of (1.1) depends on the roots of the equation \({{b}_{0}}+{{b}_{1}}x+{{b}_{2}}{{x}^{2}}=0\). For example, \({{b}_{1}} = {{b}_{2}} = 0\) leads to the normal distribution, which is not assigned to a particular type. For a detailed discussion of the various types, see Chapter 12 of Johnson et al. [19].

Burr [5] presented a system of continuous distributions that can take on a wide variety of shapes. The distributions in this system satisfy the differential equation

$$\begin{aligned} dF=F(1-F)g(x)dx, \end{aligned}$$
(1.2)

where \(0\le F\le 1\) and \(g(x)\) is a non-negative function over \(x\). Burr [5] gave 12 solutions to the equation in (1.2), each corresponding to a different choice of \(g(x)\). See Fry [15] and Johnson et al. [19] for a list of the Burr Types I–XII distributions.

Johnson [18] proposed a system for generating distributions using a normalizing transformation of the general form

$$\begin{aligned} Z=\gamma +\delta f\left( \frac{x-\xi }{\lambda } \right) , \end{aligned}$$
(1.3)

where \(f(.)\) is the transformation function, \(Z\) is a standard normal random variable, \(\gamma \) and \(\delta \) are shape parameters, \(\lambda \) is a scale parameter, and \(\xi \) is a location parameter. Without loss of generality, Johnson assumed that \(\delta \) and \(\lambda \) are positive. He proposed three transformation functions, defining the lognormal family and the bounded and unbounded systems of distributions. These families of distributions cover many commonly used distributions, such as the normal, log-normal, gamma, beta, and exponential distributions. For further discussion, see Johnson et al. [19, p. 33].

Tukey [37] proposed the lambda distribution, which was generalized by Ramberg and Schmeiser [32, 33] and Ramberg et al. [34] into the so-called generalized lambda distribution (GLD). This family of distributions is defined in terms of the percentile function

$$\begin{aligned} Q(y)=Q(y;\,{{\lambda }_{1}},\,{{\lambda }_{2}},\,{{\lambda }_{3}},\,{{\lambda }_{4}})={{\lambda }_{1}}+\frac{{{y}^{{{\lambda }_{3}}}}-{{(1-y)}^{{{\lambda }_{4}}}}}{{{\lambda }_{2}}},\quad \text{ where } \text{0 }\le y\le \text{1 }. \end{aligned}$$
(1.4)

The parameters \({{\lambda }_{1}}\) and \({{\lambda }_{2}}\) are, respectively, location and scale parameters, while \({{\lambda }_{3}}\) and \({{\lambda }_{4}}\) determine the skewness and kurtosis. The corresponding p.d.f. is given by

$$\begin{aligned} f(x)=\frac{{{\lambda }_{2}}}{{{\lambda }_{3}}{{y}^{{{\lambda }_{3}}-1}}+{{\lambda }_{4}}{{(1-y)}^{{{\lambda }_{4}}-1}}},\quad \mathrm{with}\,x=Q(y). \end{aligned}$$
(1.5)

The existence of a valid p.d.f. requires that \({{\lambda }_{3}}{{y}^{{{\lambda }_{3}}-1}}+{{\lambda }_{4}}{{(1-y)}^{{{\lambda }_{4}}-1}}\) keep the same sign for all \(y\) in \([0,\;1]\) and that \({{\lambda }_{2}}\) take that same sign. Freimer et al. [14] discussed the similarities and differences between the Pearson system and the GLD. They pointed out that the Pearson family does not include the logistic distribution, while the GLD does not cover all skewness and kurtosis values. An extended GLD was proposed by Karian and Dudewicz [23]; it consists of both the GLD and the generalized beta distribution, defined as

$$\begin{aligned} f(x)=\left\{ \begin{array}{ll} \frac{{{(x-{{\beta }_{1}})}^{{{\beta }_{3}}}}{{({{\beta }_{1}}+{{\beta }_{2}}-x)}^{{{\beta }_{4}}}}}{B({{\beta }_{3}}+1,\,{{\beta }_{4}}+1)\beta _{2}^{({{\beta }_{3}}+{{\beta }_{4}}+1)}}, &{}\quad \text{ for } {{\beta }_{1}}\le x\le {{\beta }_{1}}+{{\beta }_{2}} \\ 0, &{}\quad \text{ otherwise, } \end{array} \right. \end{aligned}$$
(1.6)

where \(B(.,\, .)\) is the complete beta function. For detailed discussion about the GLD and extended GLD, see Karian and Dudewicz [23].
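As a small numerical illustration (ours, not part of the GLD references), the density (1.5) is most naturally evaluated through the percentile function: choose \(y\in (0,1)\), compute \(x=Q(y)\) from (1.4), and evaluate the density at that point. The parameter values below are arbitrary and chosen so that the GLD reduces to a known special case.

```python
# Illustrative sketch: evaluating the GLD density (1.5) through its
# percentile function (1.4).  Pick y in (0, 1); then x = Q(y) and
# f(x) = lam2 / (lam3*y**(lam3-1) + lam4*(1-y)**(lam4-1)).

def gld_quantile(y, lam1, lam2, lam3, lam4):
    """Percentile function Q(y) of the GLD, Eq. (1.4)."""
    return lam1 + (y**lam3 - (1 - y)**lam4) / lam2

def gld_pdf_at(y, lam1, lam2, lam3, lam4):
    """Return (x, f(x)) with x = Q(y), using Eq. (1.5)."""
    x = gld_quantile(y, lam1, lam2, lam3, lam4)
    f = lam2 / (lam3 * y**(lam3 - 1) + lam4 * (1 - y)**(lam4 - 1))
    return x, f

# Check: lam1 = 0, lam2 = 1, lam3 = lam4 = 1 gives Q(y) = 2y - 1, the
# uniform distribution on (-1, 1), whose density is 1/2 on its support.
x, f = gld_pdf_at(0.3, 0.0, 1.0, 1.0, 1.0)
```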

Azzalini [4] introduced the skew normal family of distributions. Suppose \(X\) and \(Y\) are independent random variables, each with a p.d.f. that is symmetric about zero. For any \(\lambda \),

$$\begin{aligned} 0.5=P(X-\lambda Y<0)=\int \limits _{ -\infty }^{ \infty }{{{f}_{Y}}(y){{F}_{X}}(\lambda y)dy}. \end{aligned}$$
(1.7)

Thus, \(2{{f}_{Y}}(y){{F}_{X}}(\lambda y)\) is a probability density function. If \(X\) and \(Y\) are each standard normal, \(N(0,\,1)\), then the skew-normal family of distributions has the p.d.f.

$$\begin{aligned} 2\varphi (x)\Phi (\lambda x), \end{aligned}$$
(1.8)

where \(\varphi (x)\) and \(\Phi (x)\) are the \(N(0,\,1)\) p.d.f. and cumulative distribution function, respectively. The distribution in (1.8) is characterized by a single parameter \(\lambda \). Location and scale parameters can be added to the distribution in (1.8) by using the translation \(Y=\mu +\sigma X\). For the skew-normal distribution and other systems of continuous distributions, see Johnson et al. [19, Chapter 12].
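The argument leading to (1.8) can be checked numerically; the following sketch (ours, with an arbitrary \(\lambda \)) verifies by quadrature that \(2\varphi (x)\Phi (\lambda x)\) integrates to 1.

```python
import math

# Numerical check that (1.8) is a valid density: 2*phi(x)*Phi(lam*x)
# should integrate to 1 for any lam (here lam = 3, chosen arbitrarily).

def phi(x):
    """N(0,1) p.d.f."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def Phi(x):
    """N(0,1) c.d.f., via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def skew_normal_pdf(x, lam):
    return 2.0 * phi(x) * Phi(lam * x)

def trapezoid(func, a, b, n=20000):
    """Composite trapezoidal rule."""
    h = (b - a) / n
    return h * (0.5 * (func(a) + func(b)) + sum(func(a + i * h) for i in range(1, n)))

# The integrand is negligible outside [-12, 12].
total = trapezoid(lambda x: skew_normal_pdf(x, 3.0), -12.0, 12.0)
```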

Eugene et al. [10] used the beta distribution as a generator to develop the so-called family of beta-generated distributions. The cumulative distribution function (c.d.f.) of a beta-generated random variable \(X\) is defined as

$$\begin{aligned} G(x)=\int \limits _{ 0}^{ F(x)}{b(t)dt}, \end{aligned}$$
(1.9)

where \(b(t)\) is the p.d.f. of the beta random variable and \(F(x)\) is the c.d.f. of any random variable. The p.d.f. corresponding to the beta-generated distribution in (1.9) is given by

$$\begin{aligned} g(x)=\frac{1}{B(\alpha ,\beta )}f(x){{F}^{\alpha -1}}(x){{(1-F(x))}^{\beta -1}}. \end{aligned}$$
(1.10)

This family of distributions is a generalization of the distributions of order statistics for the random variable \(X\) with c.d.f. \(F(x)\), as pointed out by Eugene et al. [10] and Jones [21]. Since the paper by Eugene et al. [10], many beta-generated distributions have been studied in the literature, including the beta-Gumbel distribution by Nadarajah and Kotz [29], the beta-exponential distribution by Nadarajah and Kotz [30], the beta-Weibull distribution by Famoye et al. [12], the beta-gamma distribution by Kong et al. [24], the beta-Pareto distribution by Akinsete et al. [1], and others.
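To illustrate (1.10) concretely (our sketch, with an arbitrary choice of baseline), taking \(F\) to be the standard exponential c.d.f. with \(\alpha =2\) and \(\beta =3\) yields the beta-exponential density \(g(x)=12{{e}^{-3x}}(1-{{e}^{-x}})\); the code below checks numerically that it integrates to 1.

```python
import math

# Beta-generated density (1.10) with a generic baseline (f, F).
# Example baseline: standard exponential, alpha = 2, beta = 3.

def beta_fn(a, b):
    """Complete beta function B(a, b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def beta_generated_pdf(x, alpha, beta, f, F):
    """Eq. (1.10)."""
    return f(x) * F(x)**(alpha - 1) * (1.0 - F(x))**(beta - 1) / beta_fn(alpha, beta)

f = lambda x: math.exp(-x)        # exponential p.d.f.
F = lambda x: 1.0 - math.exp(-x)  # exponential c.d.f.
g = lambda x: beta_generated_pdf(x, 2.0, 3.0, f, F)

# Trapezoidal rule on [0, 40]; the tail beyond 40 is negligible.
n, b = 20000, 40.0
h = b / n
total = h * (0.5 * (g(0.0) + g(b)) + sum(g(i * h) for i in range(1, n)))
```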

Recently, Jones [22] and Cordeiro and de Castro [6] extended the beta-generated family of distributions by replacing the beta distribution in (1.9) with the Kumaraswamy distribution, \(b(t)=\alpha \beta {{t}^{\alpha -1}}{{(1-{{t}^{\alpha }})}^{\beta -1}},\,\,t\in (0, 1)\) (Kumaraswamy [25]). The p.d.f. of the Kumaraswamy generalized distributions (\(KW\)-\(G\)) is given by

$$\begin{aligned} g(x)=\alpha \beta f(x){{F}^{\alpha -1}}(x){{(1-{{F}^{\alpha }}(x))}^{\beta -1}}. \end{aligned}$$
(1.11)

Several generalized distributions from (1.11) have been studied in the literature including the Kumaraswamy Weibull distribution by Cordeiro et al. [7], the Kumaraswamy generalized gamma distribution by de Castro et al. [9], and the Kumaraswamy generalized half-normal distribution by Cordeiro et al. [8].

Ferreira and Steel [13] introduced a method to generate skewed distributions through inverse probability integral transformations. According to Ferreira and Steel [13], a distribution \(G\) is said to be a skewed version of the symmetric distribution \(F\), generated by the skewing mechanism \(P\), if its p.d.f. is of the form

$$\begin{aligned} g(y|F,P)=f(y)p(F(y)). \end{aligned}$$
(1.12)

Note that the p.d.f. in (1.12) is a weighted version of \(f(.)\) with weight \(p(F(.))\). The skew-normal family in (1.8) is a special case of this family. By relaxing the assumption that \(F(.)\) is symmetric, the beta-generated family (1.10) also becomes a special case of (1.12).

This article presents yet another technique to generate families of continuous probability distributions. The article is organized as follows: Sect. 2 presents a new technique for generating families of continuous distributions. Section 3 gives examples of classes of generalized families developed using the technique in Sect. 2. The paper ends with a summary and conclusion in Sect. 4.

2 Method for generating families of continuous probability distributions

The beta-generated family of distributions in (1.10) and the \(KW\)-\(G\) family of distributions in (1.11) are generated by using distributions with support between \(0\) and \(1\) as the generator. The beta random variable and the \(KW\) random variable lie between \(0\) and \(1\), as does the c.d.f. \(F(x)\) of any other random variable. The limitation of using a generator with support lying between \(0\) and \(1\) raises an interesting question: ‘Can we use other distributions with different support as the generator to derive different classes of distributions?’ This section addresses this question and introduces a new technique to derive families of distributions by using any p.d.f. as a generator.

Let \(r(t)\) be the p.d.f. of a random variable \(T\in [a,b]\), for \(-\infty \le a<b\le \infty \). Let \(W( F(x))\) be a function of the c.d.f. \(F(x)\) of any random variable \(X\) so that \(W( F(x))\) satisfies the following conditions:

$$\begin{aligned} \left. \begin{array}{l} W(F(x))\in [a,b] \\ W(F(x)) \text{ is } \text{ differentiable } \text{ and } \text{ monotonically } \text{ non-decreasing } \\ W(F(x))\rightarrow a \text{ as } x\rightarrow -\infty \text{ and } W(F(x))\rightarrow b \text{ as } x\rightarrow \infty . \end{array} \right\} \end{aligned}$$
(2.1)

A method for generating new families of distribution is presented in the following definition.

Definition:

Let \(X\) be a random variable with p.d.f. \(f(x)\) and c.d.f. \(F(x)\). Let \(T\) be a continuous random variable with p.d.f. \(r(t)\) defined on \([a,\, b]\). The c.d.f. of a new family of distributions is defined as

$$\begin{aligned} G(x)=\int \limits _{ a}^{ W(F(x))}{r(t)dt}, \end{aligned}$$
(2.2)

where \(W(F(x))\) satisfies the conditions in (2.1). The c.d.f. \(G(x)\) in (2.2) can be written as \(G(x)=R\{ W(F(x))\}\), where \(R(t)\) is the c.d.f. of the random variable \(T\). The corresponding p.d.f. associated with (2.2) is

$$\begin{aligned} g(x)=\left\{ \frac{d}{dx}W(F(x)) \right\} r\{ W(F(x))\}. \end{aligned}$$
(2.3)

Note that:

  • The c.d.f. in (2.2) is the composite function \((R\circ W\circ F)(x)\).

  • The p.d.f. \(r(t)\) in (2.2) is “transformed” into a new c.d.f. \(G(x)\) through the function, \(W(F(x))\), which acts as a “transformer”. Hence, we shall refer to the distribution \(g(x)\) in (2.3) as transformed from random variable \(T\) through the transformer random variable \(X\) and call it “Transformed-Transformer” or “\(T\)-\(X\)” distribution.

  • The random variable \(X\) may be discrete and in such a case, \(G(x)\) is the c.d.f. of a family of discrete distributions.

  • The distribution (1.12) introduced by Ferreira and Steel [13] is a special case of (2.3), obtained by defining \(W(F(x))=F(x)\); the p.d.f. \(r(.)\) then plays the same role as the weight function \(p(.)\).

Each choice of \(W(F(x))\) gives a new family of distributions. The appropriate definition of \(W(F(x))\) depends on the support of the random variable \(T\). The following are some examples of \(W(.)\).

  1. When the support of \(T\) is bounded: Without loss of generality, we assume the support of \(T\) is \([0,\, 1]\). Distributions for such \(T\) include the uniform \((0,\, 1)\), beta, Kumaraswamy and other types of generalized beta distributions. \(W( F(x))\) can be defined as \(F(x)\) or \({{F}^{\alpha }}(x)\). This yields the beta-generated family of distributions, which has been well studied during the recent decade.

  2. When the support of \(T\) is \([a,\, \infty ),\ a \ge 0\): Without loss of generality, we assume \(a = 0\). \(W(F(x))\) can be defined as \(-\log (1-F(x))\), \(F(x)/(1-F(x))\), \(-\log (1-{{F}^{\alpha }}(x))\), and \({{F}^{\alpha }}(x)/(1-{{F}^{\alpha }}(x))\), where \(\alpha >0\).

  3. When the support of \(T\) is \((-\infty ,\, \infty )\): \(W( F(x))\) can be defined as \(\log [-\log (1-F(x))]\), \(\log [F(x)/(1-F(x))]\), \(\log [-\log (1-{{F}^{\alpha }}(x))]\), and \(\log [{{F}^{\alpha }}(x)/(1-{{F}^{\alpha }}(x))]\).
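The conditions in (2.1) can be verified numerically for any candidate \(W\). The following sketch (ours) does so for \(W(F(x))=-\log (1-F(x))\) with \(F\) the standard logistic c.d.f., the case of a generator \(T\) supported on \([0,\infty )\).

```python
import math

# Checking the conditions (2.1) for W(F(x)) = -log(1 - F(x)) with F the
# standard logistic c.d.f.  For T supported on [0, inf), W must be
# nonnegative, nondecreasing, tend to 0 as x -> -inf, and grow without
# bound as x -> inf.  (Here W(x) equals log(1 + e^x).)

F = lambda x: 1.0 / (1.0 + math.exp(-x))    # standard logistic c.d.f.
W = lambda x: -math.log(1.0 - F(x))

xs = [i / 10.0 for i in range(-100, 101)]
values = [W(x) for x in xs]

nonnegative = all(v >= 0.0 for v in values)
nondecreasing = all(u <= v for u, v in zip(values, values[1:]))
left_limit = W(-30.0)     # ~ e^{-30}, essentially 0
right_limit = W(30.0)     # ~ 30, unbounded as x grows
```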

By using \(W(F(x))=-\log (1-F(x))\) from the second example, the \(G(x)\) in (2.2) becomes the c.d.f. of a new family of distributions, given by

$$\begin{aligned} G(x)=\int \limits _{ 0}^{ -\log (1-F(x))}{r(t)dt}=R\{ -\log (1-F(x)) \}, \end{aligned}$$
(2.4)

where \(R(t)\) is the c.d.f. of the random variable \(T\). The corresponding p.d.f. associated with (2.4) is

$$\begin{aligned} g(x)=\frac{f(x)}{1-F(x)}r(-\log (1-F(x)))\,=h(x)\,r(-\log (1-F(x))), \end{aligned}$$
(2.5)

where \(h(x)\) is the hazard function for the random variable \(X\) with the c.d.f. \(F(x)\).
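To make (2.5) concrete, here is a small sketch (ours): with \(T\) standard exponential, \(r(t)={{e}^{-t}}\), the density \(g(x)=h(x)\,r(H(x))\) collapses to \(f(x)\) itself, since \(G(x)=1-{{e}^{-H(x)}}=F(x)\); the exponential generator reproduces the \(X\) distribution. The Weibull baseline below is an arbitrary choice for illustration.

```python
import math

# T-X density (2.5): g(x) = h(x) * r(H(x)), H(x) = -log(1 - F(x)).
# With T ~ exponential(1), r(t) = e^{-t}, the result equals f(x).

def tx_pdf(x, f, F, r):
    """T-X density (2.5) for a generic generator p.d.f. r."""
    H = -math.log(1.0 - F(x))     # cumulative hazard of X
    h = f(x) / (1.0 - F(x))       # hazard of X
    return h * r(H)

# X ~ Weibull with shape 2, scale 1: f(x) = 2x e^{-x^2}, F(x) = 1 - e^{-x^2}.
f = lambda x: 2.0 * x * math.exp(-x * x)
F = lambda x: 1.0 - math.exp(-x * x)
r = lambda t: math.exp(-t)        # exponential(1) generator

vals = [(x, tx_pdf(x, f, F, r)) for x in (0.5, 1.0, 2.0)]
```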

The corresponding families of distributions generated from the other \(W(.)\) functions mentioned in examples 2 and 3 are given in Table 1.

Table 1 Probability density functions of some \(T\)-\(X\) families based on different \(W(.)\) functions

In the remainder of this article, we will focus on the case when \(T\) has the support \([0,\, \infty )\) and \(W(F(x))=-\log (1-F(x))\). For simplicity, we will use the name \(T\)-\(X\) family of distributions for the new family of distributions in (2.5).

Some remarks on the family of distributions defined in (2.5):

  (a) The p.d.f. in (2.5) can be written as \(g(x)=h(x)r(H(x))\) and the corresponding c.d.f. is \(G(x)=R( -\log (1-F(x)))=R(H(x))\), where \(h(x)\) and \(H(x)\) are the hazard and cumulative hazard functions of the random variable \(X\) with c.d.f. \(F(x)\). Hence, this family of distributions can be considered as a family of distributions arising from a weighted hazard function.

  (b) The relation \(G(x)=R(-\log (1-F(x)))\) gives the relationship between the random variables \(X\) and \(T\): \(X={{F}^{-1}}(1-{{e}^{-T}})\). This provides an easy way to simulate the random variable \(X\): first simulate the random variable \(T\) from the p.d.f. \(r(t)\) and then compute \(X={{F}^{-1}}( 1-{{e}^{-T}})\), which has the c.d.f. \(G(x)\). Thus, \(E(X)\) can be obtained as \(E(X)=E\{ {{F}^{-1}}( 1-{{e}^{-T}})\}\).
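The simulation recipe in remark (b) can be sketched as follows (ours, with arbitrary parameter values): take \(T\sim \) gamma\((\alpha ,\beta )\) and \(X\sim \) Pareto\((k,\theta )\), so that \({{F}^{-1}}(u)=\theta {{(1-u)}^{-1/k}}\) and \(X=\theta {{e}^{T/k}}\) is a gamma-Pareto variate. The Monte Carlo mean is compared with \(E(X)=\theta E({{e}^{T/k}})=\theta {{(1-\beta /k)}^{-\alpha }}\), obtained from the gamma moment generating function (valid since \(\beta /k<1\) here).

```python
import math
import random

# Simulating a T-X variate: draw T from r(t), return X = F^{-1}(1 - e^{-T}).
# T ~ gamma(alpha, beta) via the stdlib's random.gammavariate (mean alpha*beta);
# X ~ Pareto(k, theta), so X = theta * exp(T / k).  Parameter values are
# arbitrary illustrations.

random.seed(12345)

alpha, beta = 2.0, 0.5    # gamma generator parameters
k, theta = 2.0, 1.0       # Pareto parameters

def gamma_pareto_variate():
    t = random.gammavariate(alpha, beta)
    return theta * math.exp(t / k)

n = 200_000
sample_mean = sum(gamma_pareto_variate() for _ in range(n)) / n

# E(X) = theta * (1 - beta/k)^{-alpha} by the gamma m.g.f.; here 16/9.
exact_mean = theta * (1.0 - beta / k) ** (-alpha)
```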

The quantile function, \(Q(\lambda ),\,\,0<\lambda <1\), for the \(T\)-\(X\) family of distributions can be computed by using the formula

$$\begin{aligned} Q(\lambda )={{F}^{-1}}\{ 1-{{e}^{-{{R}^{-1}}(\lambda )}}\}. \end{aligned}$$
(2.6)

The Shannon [35] entropy of a random variable \(X\) is a measure of the variation of its uncertainty and is defined as \(E\{ -\log (g(X))\}\). Theorem 1 shows the connection between the Shannon entropy of the new family of distributions, \(g(x)\), and the Shannon entropy of the generator, \(r(t)\).

Theorem 1

If a random variable \(X\) follows the family of distributions \(g(x)=\frac{f(x)}{1-F(x)}r(-\log \) \((1-F(x)))\) in (2.5), then the Shannon entropy of \(X\), \({{\eta }_{X}}\), is given by

$$\begin{aligned} {{\eta }_{X}}=-E\{\log f({{F}^{-1}}(1-{{e}^{-T}}))\}-{{\mu }_{T}}+{{\eta }_{T}}, \end{aligned}$$
(2.7)

where \({{\mu }_{T}}\) and \({{\eta }_{T}}\) are the mean and the Shannon entropy for the random variable \(T\) with p.d.f. \(r(t)\).

Proof

By definition,

$$\begin{aligned} {{\eta }_{X}}&= E(-\log [g(X)]) \\&= -E(\log f(X))+E(\log (1-F(X)))+E(-\log r\{ -\log (1-F(X))\}). \end{aligned}$$

From (2.4), the random variable \(T=-\log ( 1-F(X))\) has the p.d.f. \(r(t)\), which implies the following: \(E(\log f(X))=E\{ \log f( {{F}^{-1}}(1-{{e}^{-T}}))\}, E( \log (1-F(X)) )=-E(T)=-{{\mu }_{T}}\), and \(E(-\log r\{ -\log (1-F(X))\})=E(-\log r(T))={{\eta }_{T}}\).

Hence, \({{\eta }_{X}}=-E\{ \log f( {{F}^{-1}}( 1-{{e}^{-T}}))\}-{{\mu }_{T}}+{{\eta }_{T}}\), which is the result in (2.7). \(\square \)
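Theorem 1 lends itself to a numerical sanity check (our sketch, with one arbitrary pair of distributions): take \(T\sim \) gamma\((2,1)\), \(r(t)=t{{e}^{-t}}\), and \(X\) Weibull with shape 2 and scale 1, so \({{F}^{-1}}(1-{{e}^{-t}})=\sqrt{t}\) and the resulting \(T\)-\(X\) density is \(g(x)=2{{x}^{3}}{{e}^{-{{x}^{2}}}}\). Both sides of (2.7) are computed by quadrature.

```python
import math

# Numerical check of Theorem 1: compare eta_X = -E[log g(X)] with
# -E[log f(F^{-1}(1 - e^{-T}))] - mu_T + eta_T, all by quadrature.

def trapezoid(func, a, b, n=40000):
    h = (b - a) / n
    return h * (0.5 * (func(a) + func(b)) + sum(func(a + i * h) for i in range(1, n)))

r = lambda t: t * math.exp(-t)               # gamma(2, 1) p.d.f.
f = lambda x: 2.0 * x * math.exp(-x * x)     # Weibull(2, 1) p.d.f.
g = lambda x: 2.0 * x**3 * math.exp(-x * x)  # resulting T-X density

def nlogn(p):
    """-p log p, with the convention 0 log 0 = 0."""
    return -p * math.log(p) if p > 0.0 else 0.0

eta_X = trapezoid(lambda x: nlogn(g(x)), 0.0, 12.0)   # Shannon entropy of X

mu_T = trapezoid(lambda t: t * r(t), 0.0, 60.0)       # mean of T (= 2)
eta_T = trapezoid(lambda t: nlogn(r(t)), 0.0, 60.0)   # Shannon entropy of T
mean_logf = trapezoid(
    lambda t: math.log(f(math.sqrt(t))) * r(t) if t > 0.0 else 0.0,
    0.0, 60.0)                                        # E[log f(F^{-1}(1 - e^{-T}))]
rhs = -mean_logf - mu_T + eta_T                       # right-hand side of (2.7)
```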

Skewness and kurtosis of a parametric distribution are often measured by \({{\alpha }_{3}}={{\mu }_{3}}/{{\sigma }^{3}}\) and \({{\alpha }_{4}}={{\mu }_{4}}/{{\sigma }^{4}}\), respectively. When the third or fourth moment does not exist, as for the Cauchy, Lévy and some Pareto distributions, \({{\alpha }_{3}}\) and \({{\alpha }_{4}}\) cannot be computed. For the \(T\)-\(X\) family, one may encounter some difficulty in computing the third and fourth moments. Alternative measures of skewness and kurtosis based on quantile functions are then more appropriate. The measure of skewness \(S\) defined by Galton [16] and the measure of kurtosis \(K\) defined by Moors [27] are based on quantile functions and are defined as

$$\begin{aligned} S&= \frac{Q(6/8)-2Q(4/8)+Q(2/8)}{Q(6/8)-Q(2/8)}, \end{aligned}$$
(2.8)
$$\begin{aligned} K&= \frac{Q(7/8)-Q(5/8)+Q(3/8)-Q(1/8)}{Q(6/8)-Q(2/8)}. \end{aligned}$$
(2.9)

Skewness measures the asymmetry of a distribution, that is, the extent of its long tail to the left or right. Kurtosis is a measure of the degree of tail heaviness. When the distribution is symmetric, \(S = 0\), and when the distribution is right (or left) skewed, \(S > 0\) (or \(S < 0\)). As \(K\) increases, the tail of the distribution becomes heavier. For the \(T\)-\(X\) family, Galton's skewness and Moors' kurtosis can be computed by using the quantile function in (2.6) with the appropriate \(T\) and \(X\) distributions.
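The quantile-based measures (2.8) and (2.9) are straightforward to compute once a quantile function is available; the sketch below (ours) uses the standard logistic distribution, \(Q(p)=\log (p/(1-p))\), as a symmetric test case for which \(S=0\).

```python
import math

# Galton's skewness (2.8) and Moors' kurtosis (2.9) directly from a
# quantile function Q.

def galton_skewness(Q):
    return (Q(6/8) - 2.0*Q(4/8) + Q(2/8)) / (Q(6/8) - Q(2/8))

def moors_kurtosis(Q):
    return (Q(7/8) - Q(5/8) + Q(3/8) - Q(1/8)) / (Q(6/8) - Q(2/8))

# Standard logistic quantile function (a symmetric check case).
Q_logistic = lambda p: math.log(p / (1.0 - p))

S = galton_skewness(Q_logistic)   # 0 for any symmetric distribution
K = moors_kurtosis(Q_logistic)    # equals log(21/5)/log(3) here
```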

3 Some families of \(T\)-\(X\) distributions with different \(T\) distributions

The \(T\)-\(X\) family of distributions can be further classified into two sub-families: One sub-family has the same \(X\) distribution but different \(T\) distributions and the other sub-family has the same \(T\) distribution but different \(X\) distributions. For example, by letting \(T\) be a Weibull random variable, we generate a sub-family of Weibull-\(X\) distributions. By letting \(X\) be a Weibull random variable, we generate a sub-family of \(T\)-Weibull distributions. In this section, we consider the sub-family with different \(T\) distributions. Table 2 gives several such sub-families with the same \(X\) and different \(T\) random variables.

Table 2 Families of generalized distributions derived from different \(T\) distributions

The following sub-sections discuss, in turn, the properties of the gamma-\(X\) family, the beta-exponential-\(X\) family, and the Weibull-\(X\) family.

3.1 Gamma-\(X\) family

If a random variable \(T\) follows the gamma distribution with parameters \(\alpha \) and \(\beta \), then \(r(t)={{(\Gamma (\alpha ){{\beta }^{\alpha }})}^{-1}}{{t}^{\alpha -1}}{{e}^{-t/\beta }},\,\,\,\,t>0\). From (2.5), the p.d.f. of gamma-\(X\) family is defined as

$$\begin{aligned} g(x)=\frac{1}{\Gamma (\alpha ){{\beta }^{\alpha }}}f(x){{( -\log (1-F(x)))}^{\alpha -1}}{{( 1-F(x))}^{\frac{1}{\beta }-1}}. \end{aligned}$$
(3.1)

By using (2.4) and expressing the c.d.f. of the gamma distribution in terms of the incomplete gamma function, \(R(t)=( 1/\Gamma (\alpha ))\gamma (\alpha ,t/\beta )\), where \(\gamma (\alpha ,t)=\int _{0}^{t}{{{u}^{\alpha -1}}{{e}^{-u}}du}\), the c.d.f. of the gamma-\(X\) family in (3.1) is \(G(x)=\gamma \{ \alpha ,-\log (1-F(x))\}/\Gamma (\alpha )\). We shall refer to distributions of the forms \({{(F(x))}^{c}}\) and \({{(1-F(x))}^{c}}\) as the \(\text{ Exp }(F)\) and \(\text{ Exp }(1-F)\) families of distributions, respectively.

Lemma 1

The Shannon entropy of the gamma-\(X\) family of distributions is given by \({{\eta }_{X}}=-E\{ \log f({{F}^{-1}}( 1-{{e}^{-T}}))\}+\alpha (1-\beta )+\log \beta +\log \Gamma (\alpha )+(1-\alpha )\psi (\alpha )\), where \(\psi \) is the digamma function.

Proof

It follows from Theorem 1 by using \({{\mu }_{T}}=\alpha \beta \) and the Shannon entropy for the gamma distribution, which is given by Song [36] as \({{\eta }_{T}}\!=\!\alpha \!+\!\log \beta \!+\!\log \Gamma (\alpha )\!+\!(1-\alpha )\psi (\alpha )\). \(\square \)

When \(\alpha =1\), the gamma-\(X\) family in (3.1) reduces to the \(\text{ Exp }(1-F)\) distributions. When \(\alpha = n\) and \(\beta = 1\), the gamma-\(X\) density is the density function of the \(n\)th upper record value arising from a sequence \(\{ {{X}_{i}}\}\) of independent and identically distributed random variables with p.d.f. \(f(x)\) and c.d.f. \(F(x)\) (see Johnson et al. [19, p. 99]). The generalized gamma distribution defined by Amoroso [3] is a member of the gamma-\(X\) family where \(X\) is a Weibull random variable. If \(f(x)\) is the p.d.f. of the Weibull distribution, then (3.1) becomes

$$\begin{aligned} g(x)=\frac{c}{{{\gamma }^{\alpha c}}\Gamma (\alpha ){{\beta }^{\alpha }}}{{x}^{\alpha c-1}}{{e}^{-\frac{1}{\beta }{{(x/\gamma )}^{c}}}},\quad x>0; \; \alpha ,\; \gamma ,\; \beta , \; c>0. \end{aligned}$$
(3.2)

Setting \(\delta =\beta {{\gamma }^{c}}\) in Eq. (3.2), the distribution reduces to the generalized gamma distribution in Amoroso [3]. When \(c=\gamma =1\), (3.2) reduces to the gamma distribution.

If \(f(x)\) is the p.d.f. of the Pareto distribution, then from (3.1) we get

$$\begin{aligned} g(x)=\frac{{{k}^{\alpha }}}{x\Gamma (\alpha ){{\beta }^{\alpha }}}{{\left( \frac{\theta }{x} \right) }^{k/\beta }}{{\left( \log \left( \frac{x}{\theta } \right) \right) }^{\alpha -1}},\quad x>\theta . \end{aligned}$$

On setting \(\beta /k=c\), we get

$$\begin{aligned} g(x)=\frac{1}{x\Gamma (\alpha ){{c}^{\alpha }}}{{\left( \frac{\theta }{x} \right) }^{1/c}}{{\left( \log \left( \frac{x}{\theta } \right) \right) }^{\alpha -1}}, \quad x>\theta . \end{aligned}$$
(3.3)

Based on our naming convention, the distribution in (3.3) will be called the gamma-Pareto distribution. When \(\alpha = 1\), (3.3) reduces to the Pareto distribution, and hence the gamma-Pareto distribution can be considered a generalization of the Pareto distribution. Figure 1 shows graphs of the gamma-Pareto density for different parameter values, including the special cases. The figure shows that the shape parameter \(\alpha \) adds extra flexibility to the distribution by changing the shape of the density function from a reversed-J shape to a concave-down shape for certain parameter values.

Fig. 1
figure 1

Graphs of the gamma-Pareto distribution for various parameter values

The c.d.f. of the gamma-Pareto distribution in (3.3) is \(G(x)=\gamma \{ \alpha ,{{c}^{-1}}\log ( x/\theta ) \}/\Gamma (\alpha )\), and hence the quantile function of the gamma-Pareto distribution is the solution of equation \(G(x)=p,\,\,0\le p\le 1\). To investigate the effect of the two shape parameters \(\alpha \) and \(c\) on the gamma-Pareto density function, Eqs. (2.8) and (2.9) are used to obtain Galton’s skewness and Moors’ kurtosis. Figure 2 displays the Galton’s skewness and Moors’ kurtosis for the gamma-Pareto distribution in terms of the parameters \(\alpha \) and \(c\) when \(\theta =1.\)
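The quantile computation just described can be sketched numerically (ours): evaluate \(G(x)=\gamma \{ \alpha ,{{c}^{-1}}\log ( x/\theta ) \}/\Gamma (\alpha )\) through the power series for the regularized lower incomplete gamma function, then solve \(G(x)=p\) by bisection. The check below uses the special case \(\alpha =1\), where the model is Pareto.

```python
import math

# Gamma-Pareto c.d.f. and quantile.  The regularized lower incomplete
# gamma P(a, z) is computed from its power series
# gamma(a, z) = z^a e^{-z} * sum_n z^n / (a (a+1) ... (a+n)).

def reg_lower_gamma(a, z, terms=200):
    """Regularized lower incomplete gamma P(a, z) = gamma(a, z)/Gamma(a)."""
    if z <= 0.0:
        return 0.0
    total, term = 0.0, 1.0 / a
    for n in range(terms):
        total += term
        term *= z / (a + n + 1)
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))

def gamma_pareto_cdf(x, alpha, c, theta):
    return reg_lower_gamma(alpha, math.log(x / theta) / c)

def gamma_pareto_quantile(p, alpha, c, theta, tol=1e-10):
    """Solve G(x) = p by bisection on (theta, inf)."""
    lo, hi = theta, 2.0 * theta
    while gamma_pareto_cdf(hi, alpha, c, theta) < p:   # bracket the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gamma_pareto_cdf(mid, alpha, c, theta) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Special case alpha = 1: G(x) = 1 - (theta/x)^{1/c}, so the median for
# c = 0.5, theta = 1 is sqrt(2).
x_med = gamma_pareto_quantile(0.5, 1.0, 0.5, 1.0)
```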

Fig. 2
figure 2

Galton’s skewness(\(S\)) and Moors’ kurtosis(\(K\)) for the gamma-Pareto distribution

From Fig. 2 and the corresponding data values (not included to save space), Galton's skewness is always positive, which indicates that the gamma-Pareto distribution is right skewed. For fixed \(c\ge 1\), Galton's skewness is an increasing function of \(\alpha \). For fixed \(c < 1\), Galton's skewness is a decreasing function of \(\alpha \), and for fixed \(\alpha \), Galton's skewness is an increasing function of \(c\). Moors' kurtosis is an increasing function of \(\alpha \) and \(c\).

3.2 Beta-exponential-\(X\) family

If a random variable \(T\) follows the beta-exponential distribution of Nadarajah and Kotz [30], then \(r(t)=\lambda {{(B(\alpha ,\beta ))}^{-1}}{{e}^{-\lambda \beta t}}{{(1-{{e}^{-\lambda t}})}^{\alpha -1}},\;t>0\). From (2.5), the p.d.f. of the beta-exponential-\(X\) family is defined as

$$\begin{aligned} g(x)=\frac{\lambda }{B(\alpha ,\beta )}f(x){{( 1-F(x) )}^{\lambda \beta -1}}{{\{ 1-{{( 1-F(x) )}^{\lambda }}\}}^{\alpha -1}}. \end{aligned}$$
(3.4)

The c.d.f. of (3.4) can be expressed in terms of the regularized incomplete beta function \({{I}_{x}}(a,b)\). The c.d.f. of the beta-exponential-\(X\) family is \(G(x)={{I}_{1-{{(1-F(x))}^{\lambda }}}}(\alpha ,\beta )=1-{{I}_{{{(1-F(x))}^{\lambda }}}}(\beta ,\alpha )\).
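The identity \(G(x)={{I}_{1-{{(1-F(x))}^{\lambda }}}}(\alpha ,\beta )\) follows from the substitution \(u=1-{{(1-F(x))}^{\lambda }}\) in the integral of (3.4), and can be verified numerically (our sketch, with an arbitrary exponential baseline and a non-integer \(\beta \)).

```python
import math

# Check that integrating the density (3.4) matches the incomplete-beta
# form of its c.d.f.  Baseline: F standard exponential; parameters are
# arbitrary, with non-integer beta.

def trapezoid(func, a, b, n=20000):
    h = (b - a) / n
    return h * (0.5 * (func(a) + func(b)) + sum(func(a + i * h) for i in range(1, n)))

def beta_fn(a, b):
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def reg_inc_beta(x, a, b):
    """Regularized incomplete beta I_x(a, b) by direct quadrature."""
    return trapezoid(lambda u: u**(a - 1.0) * (1.0 - u)**(b - 1.0), 0.0, x) / beta_fn(a, b)

alpha, beta, lam = 2.0, 1.5, 2.0
F = lambda x: 1.0 - math.exp(-x)

def g(x):
    """Beta-exponential-X density (3.4) with exponential F, f(x) = e^{-x}."""
    Fx = F(x)
    return (lam / beta_fn(alpha, beta) * math.exp(-x)
            * (1.0 - Fx)**(lam * beta - 1.0) * (1.0 - (1.0 - Fx)**lam)**(alpha - 1.0))

x0 = 1.2
via_pdf = trapezoid(g, 0.0, x0)
via_inc_beta = reg_inc_beta(1.0 - (1.0 - F(x0))**lam, alpha, beta)
```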

Lemma 2

The Shannon entropy of the beta-exponential-\(X\) family of distributions is given by

$$\begin{aligned} {{\eta }_{X}}&= -E\{ \log f( {{F}^{-1}}( 1-{{e}^{-T}}))\}+\log ({{\lambda }^{-1}}B(\alpha ,\beta ))+(\alpha +\beta -1)\psi (\alpha +\beta ) \\&\quad -(\alpha -1)\psi (\alpha )-\beta \psi (\beta ) -[\psi (\alpha +\beta )-\psi (\beta )]/\lambda . \end{aligned}$$

Proof

It follows from Theorem 1 by using the mean \({{\mu }_{T}}=[\psi (\alpha +\beta )-\psi (\beta )]/\lambda \) and the Shannon entropy \({{\eta }_{T}}=\log ({{\lambda }^{-1}}B(\alpha ,\beta ) )+(\alpha +\beta -1)\psi (\alpha +\beta )-(\alpha -1)\psi (\alpha )-\beta \psi (\beta )\) for the beta-exponential distribution, which are given by Nadarajah and Kotz [30]. \(\square \)

Special cases of beta-exponential-\(X\) family:

  (1) The beta-generated family in (1.10) is a special case of (3.4) when \(\lambda = 1\). Hence, the family of distributions in (3.4) can be used to generate all the distributions belonging to the beta-generated family.

  (2) When \(\alpha =1\), the beta-exponential-\(X\) family reduces to the \(\text{ Exp }(1-F)\) distributions. When \(\beta =1\) and \(\lambda =1\), the beta-exponential-\(X\) family reduces to the \(\text{ Exp }(F)\) distributions.

  (3) When \(\beta = 1\), (3.4) reduces to the exponentiated-exponential-\(X\) family with p.d.f.

$$\begin{aligned} g(x)=\alpha \lambda f(x){{\{ 1-{{( 1-F(x) )}^{\lambda }} \}}^{\alpha -1}}{{( 1-F(x))}^{\lambda -1}}. \end{aligned}$$
(3.5)

The c.d.f of (3.5) can be written as \(G(x)={{\{ 1-{{( 1-F(x))}^{\lambda }}\}}^{\alpha }}\).

By using \(D(x)=1-F(x)\) in (3.5), the exponentiated-exponential-\(X\) family reduces to the \(KW\)-\(G\) family.

If \(X\) is the uniform random variable on \((a,\, b)\), then from (3.4) the beta-exponential-uniform distribution is defined as

$$\begin{aligned} g(x)=\frac{\lambda }{B(\alpha ,\beta )}\frac{1}{b-a}{{\left( \frac{b-x}{b-a} \right) }^{\lambda \beta -1}}{{\left\{ 1-{{\left( \frac{b-x}{b-a} \right) }^{\lambda }} \right\} }^{\alpha -1}}, \quad a<x<b. \end{aligned}$$
(3.6)

If we use the transformation \(y=1-x\) in (3.6) then the distribution reduces to the (i) generalized beta distribution of the first kind (McDonald [26]), when \(b = 1\), (ii) beta distribution when \(a = 0\) and \(b=\lambda =1\), and (iii) Kumaraswamy’s [25] double bounded distribution when \(a = 0\) and \(b=\beta =1\).

The exponentiated-Weibull distribution defined by Mudholkar et al. [28] is a member of exponentiated-exponential-\(X\) family in (3.5) when \(X\) is the Weibull random variable. If \(f(x)\) is the p.d.f. of the Weibull distribution, then (3.5) reduces to

$$\begin{aligned} g(x)=\frac{c\lambda \alpha }{\gamma }{{\left( \frac{x}{\gamma } \right) }^{c-1}}{{( 1-{{e}^{-\lambda {{(x/\gamma )}^{c}}}} )}^{\alpha -1}}{{e}^{-\lambda {{(x/\gamma )}^{c}}}},\quad x>0; \; c,\gamma ,\alpha ,\lambda >0. \end{aligned}$$
(3.7)

Writing \(\delta =\lambda {{\gamma }^{c}}\), (3.7) reduces to the exponentiated-Weibull distribution given by Mudholkar et al. [28]. When \(\gamma =c= 1\), (3.7) reduces to the exponentiated-exponential distribution defined by Gupta and Kundu [17]. When \(\alpha =1\), (3.7) reduces to the Weibull distribution, and when \(\alpha =c=1\), it reduces to the exponential distribution.

The type I generalized logistic distribution given by Johnson et al. [20, p. 140] is a special case of the exponentiated-exponential-logistic distribution. If \(f(x)\) is the p.d.f. of the standard logistic distribution, then (3.5) reduces to

$$\begin{aligned} g(x)=\frac{\alpha \lambda {{e}^{-\lambda x}}}{{{( 1+{{e}^{-x}} )}^{\lambda +1}}}{{\left( 1-\frac{{{e}^{-\lambda x}}}{{{( 1+{{e}^{-x}})}^{\lambda }}} \right) }^{\alpha -1}},\quad -\infty <x<\infty ,\alpha ,\lambda >0. \end{aligned}$$
(3.8)

When \(\lambda =1\), the exponentiated-exponential-logistic distribution in (3.8) reduces to the type I generalized logistic distribution. When \(\alpha = \lambda = 1\), (3.8) reduces to the standard logistic distribution.

Figure 3 shows graphs of the exponentiated-exponential-logistic density functions for different parameter values including the special cases.

Fig. 3
figure 3

Graphs of the exponentiated-exponential-logistic distribution for various parameter values

The c.d.f. of the exponentiated-exponential-logistic distribution in equation (3.8) is \(G(x)={{( 1-{{(1+{{e}^{x}})}^{-\lambda }})}^{\alpha }},\) and hence the quantile function of the exponentiated-exponential-logistic distribution can be written as

$$\begin{aligned} Q(p)=\log ( {{( 1-{{p}^{1/\alpha }})}^{-1/\lambda }}-1),\quad 0\le p\le 1. \end{aligned}$$
(3.9)
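The quantile function (3.9) can be checked against the c.d.f. \(G(x)={{( 1-{{(1+{{e}^{x}})}^{-\lambda }})}^{\alpha }}\) by a round-trip computation (our sketch; the parameter values are arbitrary).

```python
import math

# Round-trip check that the quantile function (3.9) inverts the c.d.f. of
# the exponentiated-exponential-logistic distribution.

alpha, lam = 2.5, 0.7   # arbitrary parameter values

def G(x):
    return (1.0 - (1.0 + math.exp(x))**(-lam))**alpha

def Q(p):
    return math.log((1.0 - p**(1.0 / alpha))**(-1.0 / lam) - 1.0)

round_trip_errors = [abs(G(Q(p)) - p) for p in (0.1, 0.25, 0.5, 0.75, 0.9)]
```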

By using (3.9), (2.8) and (2.9), one can obtain the Galton’s skewness and the Moors’ kurtosis for the exponentiated-exponential-logistic distribution. Figure 4 displays the Galton’s skewness and Moors’ kurtosis for the exponentiated-exponential-logistic distribution in terms of the parameters \(\alpha \) and \(\lambda \).

Fig. 4
figure 4

Galton’s skewness(\(S\)) and Moors’ kurtosis(\(K\)) for the exponentiated-exponential-logistic distribution

From Fig. 4 and the corresponding data values (not included in order to save space), the exponentiated-exponential-logistic distribution can be left skewed, right skewed, or symmetric. For fixed \(\lambda >1\), Galton's skewness is an increasing function of \(\alpha \), and for fixed \(\alpha \), it is a decreasing function of \(\lambda \). For fixed \(\alpha \), Moors' kurtosis is a decreasing function of \(\lambda \) when \(\lambda >1\), and for fixed \(\lambda \), it is a decreasing function of \(\alpha \) when \(\alpha >1\).

3.3 Weibull-\(X\) family

If a random variable \(T\) follows the Weibull distribution with parameters \(c\) and \(\beta \), then \(r(t)=( c/\beta ){{( t/\beta )}^{c-1}}{{e}^{-{{( t/\beta )}^{c}}}},\,\,\,t\ge 0\). From (2.5), the Weibull-\(X\) family is given by

$$\begin{aligned} g(x)=\frac{c}{\beta }\frac{f(x)}{1-F(x)}{{\left\{ \frac{-\log ( 1-F(x) )}{\beta } \right\} }^{c-1}}\exp \left\{ -{{\left( \frac{-\log ( 1-F(x))}{\beta } \right) }^{c}} \right\} . \end{aligned}$$
(3.10)

The c.d.f. of the Weibull distribution is \(R(t)=1-{{e}^{-{{( t/\beta )}^{c}}}}\) and hence from (2.4) the c.d.f. of the Weibull-\(X\) family is

$$\begin{aligned} G(x)=1-\exp \{ -{{[ -\log (1-F(x))/\beta ]}^{c}}\}. \end{aligned}$$
(3.11)

Lemma 3

The Shannon entropy of the Weibull-\(X\) family of distributions is given by

$$\begin{aligned} {{\eta }_{X}}=-E\{ \log f( {{F}^{-1}}( 1-{{e}^{-T}} ))\}-\beta \,\Gamma (1+1/c)+\gamma (1-1/c)-\log (c/\beta )+1, \end{aligned}$$

where \(\gamma \) is Euler's constant.

Proof

It follows from Theorem 1 by using the mean \({{\mu }_{T}}=\beta \,\Gamma (1+1/c)\) and the Shannon entropy \({{\eta }_{T}}=\gamma (1-1/c)-\log (c/\beta )+1\) for the Weibull distribution, which is given by Song [36]. \(\square \)

When \(c=1\), the Weibull-\(X\) family reduces to the \(\text{ Exp }(1-F(x))\) family of distributions. The type II generalized logistic distribution is a special case of the Weibull-logistic distribution. If \(F(x)\) is the c.d.f. of the standard logistic distribution, then (3.11) reduces to

$$\begin{aligned} G(x)=1-\exp \{ -{{[ \log ( 1+{{e}^{x}})/\beta ]}^{c}} \},\quad -\infty <x<\infty . \end{aligned}$$
(3.12)

When \(c =1\), the distribution in (3.12) reduces to the type II generalized logistic distribution.

Figure 5 shows graphs of the Weibull-logistic density functions for different parameter values including the special case.

Fig. 5
figure 5

Graphs of the Weibull-logistic distribution for various parameter values

From (3.12), the quantile function of the Weibull-logistic distribution can be written as

$$\begin{aligned} Q(p)=\log \{ \exp [\beta {{(-\log (1-p))}^{1/c}}]-1 \},\quad 0<p<1. \end{aligned}$$
(3.13)
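The c.d.f. (3.12) and quantile function (3.13) can be verified against each other numerically; one should recover \(G(Q(p))=p\). A minimal sketch, with illustrative function names and parameter values:

```python
import math

def wl_cdf(x, c, beta):
    """Weibull-logistic c.d.f. (3.12)."""
    return 1.0 - math.exp(-((math.log(1.0 + math.exp(x)) / beta) ** c))

def wl_quantile(p, c, beta):
    """Weibull-logistic quantile function (3.13), valid for 0 < p < 1."""
    return math.log(math.exp(beta * (-math.log(1.0 - p)) ** (1.0 / c)) - 1.0)

c, beta = 1.5, 2.0                                   # illustrative parameter values
# G(Q(p)) should return p for each probability level
checks = [abs(wl_cdf(wl_quantile(p, c, beta), c, beta) - p)
          for p in (0.1, 0.25, 0.5, 0.75, 0.9)]
```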

Equations (3.13), (2.8) and (2.9) can be used to obtain Galton's skewness and Moors' kurtosis. Figure 6 displays Galton's skewness and Moors' kurtosis for the Weibull-logistic distribution in terms of the parameters \(\beta \) and \(c\).

Fig. 6
figure 6

Galton's skewness (\(S\)) and Moors' kurtosis (\(K\)) for the Weibull-logistic distribution

Figure 6 and the corresponding data values (not included to save space) indicate that the Weibull-logistic distribution can be left skewed, right skewed, or symmetric. For fixed \(\beta \), Galton's skewness is a decreasing function of \(c\), and for fixed \(c\), it is an increasing function of \(\beta \). For fixed \(c\), Moors' kurtosis is an increasing function of \(\beta \) when \(c\le 1\) and a decreasing function of \(\beta \) when \(c>1\).
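The quantile-based measures used throughout this section can be sketched directly from their standard definitions: Galton's skewness uses the quartiles, \(S=[Q(3/4)+Q(1/4)-2Q(1/2)]/[Q(3/4)-Q(1/4)]\), and Moors' kurtosis uses the octiles, \(K=[Q(7/8)-Q(5/8)+Q(3/8)-Q(1/8)]/[Q(6/8)-Q(2/8)]\). The code below (illustrative names and parameter values) applies them to the Weibull-logistic quantile function (3.13):

```python
import math

def wl_quantile(p, c, beta):
    """Weibull-logistic quantile function (3.13), valid for 0 < p < 1."""
    return math.log(math.exp(beta * (-math.log(1.0 - p)) ** (1.0 / c)) - 1.0)

def galton_skewness(Q):
    """Galton's skewness: quartile-based, always bounded in (-1, 1)."""
    q1, q2, q3 = Q(0.25), Q(0.5), Q(0.75)
    return (q3 + q1 - 2.0 * q2) / (q3 - q1)

def moors_kurtosis(Q):
    """Moors' kurtosis: octile-based."""
    E = [Q(i / 8.0) for i in range(1, 8)]   # E[0] = Q(1/8), ..., E[6] = Q(7/8)
    return (E[6] - E[4] + E[2] - E[0]) / (E[5] - E[1])

# Illustrative evaluation at one (c, beta) pair
S = galton_skewness(lambda p: wl_quantile(p, 1.5, 2.0))
K = moors_kurtosis(lambda p: wl_quantile(p, 1.5, 2.0))
```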

4 Summary and conclusion

A method to generate new families of distributions is introduced. This technique defines a new family of distributions using the composite function \((R\circ W\circ F)(x)\), with \(R\) and \(F\) being the c.d.f.s of the random variables \(T\) and \(X\), respectively. The \(W(.)\) function is defined to link the support of \(T\) to the range of \(X\). This technique generates a large number of new distributions, as well as existing distributions as special cases. Table 1 contains several different variants of \(T\)-\(X\) families using different \(W(.)\) functions.
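The construction lends itself to a one-line sketch as a higher-order function: given \(R\), \(W\) and \(F\), the new c.d.f. is \(G(x)=R(W(F(x)))\). As a sanity check, taking \(T\) standard exponential together with \(W(F)=-\log (1-F)\) returns the base distribution itself, since \(1-\exp \{\log (1-F(x))\}=F(x)\). Function names below are illustrative, not from the paper:

```python
import math

def t_x_cdf(R, W, F):
    """Return the c.d.f. of a T-X family member: G(x) = (R o W o F)(x)."""
    return lambda x: R(W(F(x)))

# Sanity check: T standard exponential with W(F) = -log(1 - F)
# recovers the base distribution, since 1 - exp(log(1 - F(x))) = F(x).
R_exp = lambda t: 1.0 - math.exp(-t)
W_log = lambda u: -math.log(1.0 - u)
F_logistic = lambda x: 1.0 / (1.0 + math.exp(-x))

G = t_x_cdf(R_exp, W_log, F_logistic)
# G(x) equals F_logistic(x) up to floating-point rounding
```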

This article focuses on \(W(F(x))=-\log ( 1-F(x) )\), where the support of \(T\) is \([0,\,\infty )\). Some properties of this \(T\)-\(X\) family are studied. Besides using functions of moments for measuring skewness and kurtosis, we suggest Galton's measure of skewness and Moors' measure of kurtosis. Three sub-families of the \(T\)-\(X\) family, namely the gamma-\(X\), beta-exponential-\(X\) and Weibull-\(X\) families, are discussed. These sub-families demonstrate that the \(T\)-\(X\) family consists of many sub-families of distributions. Within each sub-family, one can define many new distributions as well as relate its members to many existing distributions.

Table 2 summarizes various sub-families based on different \(T\) distributions with the same \(X\) distribution. New distributions discussed include the gamma-Pareto, exponentiated-exponential-logistic and Weibull-logistic distributions. In general, it is difficult to see how the shapes of the \(T\) and \(X\) distributions will affect the \(T\)-\(X\) distribution. We believe that a relationship may exist for some specific \(T\) and \(X\) distributions. For the gamma distribution, \(\alpha \) is a shape parameter while \(\beta \) is a scale parameter. For the Pareto distribution, \(\theta \) is a scale parameter and \(k\) is a shape parameter. After forming the gamma-Pareto distribution, \(\theta \) remains a scale parameter and \(\beta /k = c\) becomes a shape parameter. The study of the properties, parameter estimation and applications of these new distributions is currently under investigation. For example, Alzaatreh et al. [2] defined and studied the gamma-Pareto distribution, a member of the gamma-\(X\) family. Three real data sets were used to illustrate the applications of the gamma-Pareto distribution. The illustration showed that the gamma-Pareto distribution is a good model for fitting data sets with various kinds of shapes.

Figure 7 provides a tree-relationship of the \(T\)-\(X\) family addressed in this article. As Fig. 7 shows, the \(T\)-\(X\) family consists of many sub-families, in which new distributions can be defined and various existing distributions are special cases.

Fig. 7
figure 7

Sub-families of \(T\)-\(X\) family of distributions

The variants of the \(T\)-\(X\) family in Table 1 define many potential new distributions that deserve further study. Some of these variants are currently under investigation. Future research on the \(T\)-\(X\) family may include (i) the investigation of general properties of distributions generated using different \(W(.)\) functions, (ii) defining and investigating the properties of specific new distributions, (iii) studying new methods for estimating the parameters in addition to the well-known moments and maximum likelihood (ML) methods, and (iv) applying these new distributions to fit different types of data sets. Based on our experience, the ML method may be challenging for more than three parameters. A better estimation method will be needed for distributions with four or more parameters.

During the past decade, many of the new distributions developed in the literature have focused on greater generality and flexibility. Using the technique that generates the \(T\)-\(X\) family, one can develop new distributions that are very general and flexible, or that fit specific types of data, such as highly left-tailed (right-tailed, thin-tailed, or heavy-tailed) distributions as well as bimodal distributions. Only a few existing distributions are known to be capable of fitting bimodal shapes. One such distribution that has been successfully applied to fit real-world data sets is the beta-normal distribution of Eugene et al. [10] and Famoye et al. [11]. Our limited investigation of the \(T\)-\(X\) family suggests that there are new distributions that can fit not only unimodal and bimodal, but also multimodal distributions.

This article focuses on the case where both \(T\) and \(X\) are continuous random variables. The technique can be extended to develop a discrete \(T\)-\(X\) family of distributions where \(T\) is continuous and \(X\) is discrete. Different considerations for the \(W(.)\) functions will be needed.