Abstract
In this paper, a new method is proposed for generating families of continuous distributions. A random variable \(X\), “the transformer”, is used to transform another random variable \(T\), “the transformed”. The resulting family, the \(T\)-\(X\) family of distributions, has a connection with the hazard functions and each generated distribution is considered as a weighted hazard function of the random variable \(X\). Many new distributions, which are members of the family, are presented. Several known continuous distributions are found to be special cases of the new distributions.
1 Introduction
Statistical distributions are commonly applied to describe real world phenomena. Due to the usefulness of statistical distributions, their theory is widely studied and new distributions are developed. The interest in developing more flexible statistical distributions remains strong in the statistics profession. Many generalized classes of distributions have been developed and applied to describe various phenomena. A common feature of these generalized distributions is that they have more parameters. Johnson et al. [19] stated that the use of four-parameter distributions should be sufficient for most practical purposes. According to these authors, at least three parameters are needed, but they doubted any noticeable improvement arising from including a fifth or sixth parameter.
The Pearson system of continuous distributions, as developed by Pearson [31], is a system for which every probability density function (p.d.f.) \(f(x)\) satisfies a differential equation of the form
\[ \frac{1}{f}\frac{df}{dx}=-\frac{a+x}{b_0+b_1x+b_2x^2}, \tag{1.1} \]
where \(a, b_0, b_1\), and \({{b}_{2}}\) are the parameters (see Johnson et al. [19, Chapter 12]). The shape of the function \(f(x)\) depends on the parameters. The different shapes of the distribution were classified by Pearson into a number of types. The different types correspond to the different forms of solution to (1.1). The form of solution of (1.1) depends on the roots of the equation \({{b}_{0}}+{{b}_{1}}x+{{b}_{2}}{{x}^{2}}=0\). An example is the case \({{b}_{1}} = {{b}_{2}} = 0\), which leads to the normal distribution; this case is not assigned to a particular type. For a detailed discussion of the various types, see Chapter 12 of Johnson et al. [19].
Burr [5] presented a system of continuous distributions which can take on a wide variety of shapes. The system of distributions satisfies the differential equation
\[ \frac{dF}{dx}=F(1-F)\,g(x), \tag{1.2} \]
where \(0\le F\le 1\) and \(g(x)\) is a non-negative function over \(x\). Burr [5] gave 12 solutions to the equation in (1.2) and these correspond to the choices of \(g(x)\). See Fry [15] and Johnson et al. [19] for a list of Burr Types I–XII distributions.
Johnson [18] proposed a system for generating distributions using a normalization transformation with the general form
\[ Z=\gamma+\delta f\!\left(\frac{X-\xi}{\lambda}\right), \tag{1.3} \]
where \(f(.)\) is the transformation function, \(Z\) is a standardized normal random variable, \(\gamma \) and \(\delta \) are shape parameters, \(\lambda \) is a scale parameter and \(\xi \) is a location parameter. Without loss of generality, Johnson assumed that \(\delta \) and \(\lambda \) are positive. He proposed three transformation functions and defined the lognormal family, the bounded system of distributions and unbounded system of distributions. These families of distributions cover many commonly used distributions such as normal, log-normal, gamma, beta, exponential distributions, and others. For more discussions, see Johnson et al. [19, p.33].
Tukey [37] proposed the lambda distribution, which was generalized by Ramberg and Schmeiser [32, 33] and Ramberg et al. [34] as the so-called generalized lambda distributions (GLD). This family of distributions is defined in terms of the percentile function
\[ Q(y)=\lambda_1+\frac{y^{\lambda_3}-(1-y)^{\lambda_4}}{\lambda_2},\qquad 0\le y\le 1. \tag{1.4} \]
The parameters \({{\lambda }_{1}}\) and \({{\lambda }_{2}}\) are, respectively, location and scale parameters, while \({{\lambda }_{3}}\) and \({{\lambda }_{4}}\) determine the skewness and kurtosis. The corresponding p.d.f. is given by
\[ f(x)=\frac{\lambda_2}{\lambda_3 y^{\lambda_3-1}+\lambda_4(1-y)^{\lambda_4-1}}\quad\text{at } x=Q(y). \tag{1.5} \]
The existence of a valid p.d.f. requires the condition that \({{\lambda }_{3}}{{y}^{{{\lambda }_{3}}-1}}+{{\lambda }_{4}}{{(1-y)}^{{{\lambda }_{4}}-1}}\) has the same sign for all \(y\) in \([0,\;1]\) and that \({{\lambda }_{2}}\) takes the same sign. Freimer et al. [14] discussed the similarity and differences between the Pearson’s system and the GLD. They pointed out that Pearson’s family does not include logistic distribution, while GLD does not cover all skewness and kurtosis values. An extended GLD was proposed by Karian and Dudewicz [23] that consists of both GLD and generalized beta distribution defined as
where \(B(.,\, .)\) is the complete beta function. For detailed discussion about the GLD and extended GLD, see Karian and Dudewicz [23].
Azzalini [4] introduced the skew normal family of distributions. Suppose \(X\) and \(Y\) are independent random variables, each with a p.d.f. that is symmetric about zero. For any \(\lambda \),
\[ \frac{1}{2}=P(X-\lambda Y<0)=\int_{-\infty}^{\infty} f_Y(y)\,F_X(\lambda y)\,dy. \tag{1.7} \]
Thus, \(2{{f}_{Y}}(y){{F}_{X}}(\lambda y)\) is a probability density function. If \(X\) and \(Y\) are each standard normal, \(N(0,\,1)\), then the skew-normal family of distributions has the p.d.f.
\[ f(x\mid\lambda)=2\varphi(x)\Phi(\lambda x),\qquad -\infty<x<\infty, \tag{1.8} \]
where \(\varphi (x)\) and \(\Phi (x)\) are \(N(0,\,1)\) p.d.f. and cumulative distribution function respectively. The distribution in (1.8) is characterized by a single parameter \(\lambda \). Location and scale parameters can be added to the distribution in (1.8) by using the translation \(Y=\mu +\sigma X\). For skew normal distribution and other systems of continuous distributions, see Johnson et al. [19, Chapter 12].
Eugene et al. [10] used the beta distribution as a generator to develop the so-called family of beta-generated distributions. The cumulative distribution function (c.d.f.) of a beta-generated random variable \(X\) is defined as
\[ G(x)=\int_{0}^{F(x)} b(t)\,dt, \tag{1.9} \]
where \(b(t)\) is the p.d.f. of the beta random variable and \(F(x)\) is the c.d.f. of any random variable. The p.d.f. corresponding to the beta-generated distribution in (1.9) is given by
\[ g(x)=\frac{f(x)}{B(\alpha,\beta)}F^{\alpha-1}(x)\,(1-F(x))^{\beta-1}. \tag{1.10} \]
This family of distributions is a generalization of the distributions of order statistics for the random variable \(X\) with c.d.f. \(F(x)\), as pointed out by Eugene et al. [10] and Jones [21]. Since the paper by Eugene et al. [10], many beta-generated distributions have been studied in the literature, including the beta-Gumbel distribution by Nadarajah and Kotz [29], the beta-exponential distribution by Nadarajah and Kotz [30], the beta-Weibull distribution by Famoye et al. [12], the beta-gamma distribution by Kong et al. [24], the beta-Pareto distribution by Akinsete et al. [1], and others.
Recently, Jones [22] and Cordeiro and de Castro [6] extended the beta-generated family of distributions by replacing the beta distribution in (1.9) with the Kumaraswamy distribution [25], \(b(t)=\alpha \beta {{t}^{\alpha -1}}{{(1-{{t}^{\alpha }})}^{\beta -1}},\,\,t\in (0, 1)\). The p.d.f. of the Kumaraswamy generalized distributions (\(KW\)-\(G\)) is given by
\[ g(x)=\alpha\beta f(x)\,F^{\alpha-1}(x)\,(1-F^{\alpha}(x))^{\beta-1}. \tag{1.11} \]
Several generalized distributions from (1.11) have been studied in the literature including the Kumaraswamy Weibull distribution by Cordeiro et al. [7], the Kumaraswamy generalized gamma distribution by de Castro et al. [9], and the Kumaraswamy generalized half-normal distribution by Cordeiro et al. [8].
Ferreira and Steel [13] introduced a method to generate skewed distributions through inverse probability integral transformations. According to Ferreira and Steel [13], a distribution \(G\) is said to be a skewed version of the symmetric distribution \(F\), generated by the skewing mechanism \(P\), if its p.d.f. is of the form
\[ g(x)=f(x)\,p(F(x)). \tag{1.12} \]
Note that the p.d.f. (1.12) is a weighted function of \(f(.)\) with the weight \(p(F(.))\). The skewed normal family in (1.8) is a special case of this family. By relaxing the assumption \(F(.)\) being symmetric, the beta-generated family (1.10) is a special case of (1.12).
This article presents yet another technique to generate families of continuous probability distributions. The article is organized as follows: Sect. 2 presents a new technique for generating families of continuous distributions. Section 3 gives examples of classes of generalized families developed using the technique in Sect. 2. The paper ends with a summary and conclusion in Sect. 4.
2 Method for generating families of continuous probability distributions
The beta-generated family of distributions in (1.10) and the \(KW\)-\(G\) family of distributions in (1.11) are generated by using distributions with support between \(0\) and \(1\) as the generator. The beta random variable and the \(KW\) random variable lie between \(0\) and \(1\), and so does the c.d.f. \(F(x)\) of any other random variable. The limitation of using a generator with support lying between \(0\) and \(1\) raises an interesting question: ‘Can we use other distributions with different support as the generator to derive different classes of distributions?’ This section addresses this question and introduces a new technique to derive families of distributions by using any p.d.f. as a generator.
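Let \(r(t)\) be the p.d.f. of a random variable \(T\in [a,b]\), for \(-\infty \le a<b\le \infty \). Let \(W( F(x))\) be a function of the c.d.f. \(F(x)\) of any random variable \(X\) so that \(W( F(x))\) satisfies the following conditions:
\[ \begin{cases} W(F(x))\in [a,b], \\ W(F(x)) \text{ is differentiable and monotonically non-decreasing}, \\ W(F(x))\to a \text{ as } x\to -\infty \text{ and } W(F(x))\to b \text{ as } x\to \infty . \end{cases} \tag{2.1} \]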
A method for generating new families of distribution is presented in the following definition.
Definition:
Let \(X\) be a random variable with p.d.f. \(f(x)\) and c.d.f. \(F(x)\). Let \(T\) be a continuous random variable with p.d.f. \(r(t)\) defined on \([a,\, b]\). The c.d.f. of a new family of distributions is defined as
\[ G(x)=\int_{a}^{W(F(x))} r(t)\,dt, \tag{2.2} \]
where \(W(F(x))\) satisfies the conditions in (2.1). The c.d.f. \(G(x)\) in (2.2) can be written as \(G(x)=R\{ W(F(x))\}\), where \(R(t)\) is the c.d.f. of the random variable \(T\). The corresponding p.d.f. associated with (2.2) is
\[ g(x)=\left\{ \frac{d}{dx}W(F(x))\right\} r\{ W(F(x))\} . \tag{2.3} \]
Note that:
- The c.d.f. in (2.2) is the composite function \((R\circ W\circ F)(x)\).
- The p.d.f. \(r(t)\) in (2.2) is “transformed” into a new c.d.f. \(G(x)\) through the function \(W(F(x))\), which acts as a “transformer”. Hence, we shall refer to the distribution \(g(x)\) in (2.3) as transformed from random variable \(T\) through the transformer random variable \(X\) and call it the “Transformed-Transformer” or “\(T\)-\(X\)” distribution.
- The random variable \(X\) may be discrete and in such a case, \(G(x)\) is the c.d.f. of a family of discrete distributions.
- The distribution (1.12) introduced by Ferreira and Steel [13] is a special case of (2.3), obtained by defining \(W(F(x))=F(x)\), with \(r(.)\) playing the same role as the weight function.
Different \(W(F(x))\) will give a new family of distributions. The definition of \(W(F(x))\) depends on the support of the random variable \(T\). The following are some examples of \(W(.)\).
1. When the support of \(T\) is bounded: Without loss of generality, we assume the support of \(T\) is \([0,\, 1]\). Distributions for such \(T\) include uniform \((0,\, 1)\), beta, Kumaraswamy and other types of generalized beta distributions. \(W( F(x))\) can be defined as \(F(x)\) or \({{F}^{\alpha }}(x)\). This is the beta-generated family of distributions, which has been well studied during the recent decade.
2. When the support of \(T\) is \([a,\, \infty )\), \(a \ge 0\): Without loss of generality, we assume \(a = 0\). \(W(F(x))\) can be defined as \(-\log (1-F(x))\), \(F(x)/(1-F(x))\), \(-\log (1-{{F}^{\alpha }}(x))\), or \({{F}^{\alpha }}(x)/(1-{{F}^{\alpha }}(x))\), where \(\alpha >0\).
3. When the support of \(T\) is \((-\infty ,\, \infty )\): \(W( F(x))\) can be defined as \(\log [-\log (1-F(x))]\), \(\log [F(x)/(1-F(x))]\), \(\log [-\log (1-{{F}^{\alpha }}(x))]\), or \(\log [{{F}^{\alpha }}(x)/(1-{{F}^{\alpha }}(x))]\).
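The role of \(W(.)\) as a link between \((0,1)\) and the support of \(T\) can be illustrated with a small sketch. The dictionary names and the grid-based monotonicity check below are illustrative assumptions, not part of the original method; only the \(\alpha = 1\) versions of the functions listed above are included.

```python
import math

# Candidate link functions W(u), with u = F(x) in (0, 1),
# grouped by the support of T (alpha = 1 versions only).
W_BOUNDED = {                 # support [0, 1]
    "F": lambda u: u,
}
W_HALF_LINE = {               # support [0, inf)
    "-log(1-F)": lambda u: -math.log(1.0 - u),
    "F/(1-F)": lambda u: u / (1.0 - u),
}
W_REAL_LINE = {               # support (-inf, inf)
    "log(-log(1-F))": lambda u: math.log(-math.log(1.0 - u)),
    "log(F/(1-F))": lambda u: math.log(u / (1.0 - u)),
}


def is_nondecreasing(w, n=999):
    """Check that w is non-decreasing on an evenly spaced grid inside (0, 1)."""
    values = [w(i / (n + 1)) for i in range(1, n + 1)]
    return all(a <= b for a, b in zip(values, values[1:]))
```

Evaluating each function near \(u=0\) and \(u=1\) shows it approaching the lower and upper end of the intended support, which is what the conditions on \(W(F(x))\) require.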
By using \(W(F(x))=-\log (1-F(x))\) from the second example, the \(G(x)\) in (2.2) is the c.d.f. of the new family of distributions, which is given by
\[ G(x)=\int_{0}^{-\log (1-F(x))} r(t)\,dt=R\{ -\log (1-F(x))\} , \tag{2.4} \]
where \(R(t)\) is the c.d.f. of the random variable \(T\). The corresponding p.d.f. associated with (2.4) is
\[ g(x)=\frac{f(x)}{1-F(x)}\,r(-\log (1-F(x)))=h(x)\,r(-\log (1-F(x))), \tag{2.5} \]
where \(h(x)\) is the hazard function for the random variable \(X\) with the c.d.f. \(F(x)\).
The corresponding families of distributions generated from the other \(W(.)\) functions mentioned in examples 2 and 3 are given in Table 1.
In the remainder of this article, we will focus on the case when \(T\) has the support \([0,\, \infty )\) and \(W(F(x))=-\log (1-F(x))\). For simplicity, we will use the name \(T\)-\(X\) family of distributions for the new family of distributions in (2.5).
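As a minimal numerical sketch of the density \(g(x)=\frac{f(x)}{1-F(x)}r(-\log (1-F(x)))\), the code below builds \(g\) from user-supplied \(f\), \(F\) and \(r\) and checks that it integrates to approximately one. The choice of a gamma \(T\) with an exponential \(X\), the helper names, and the trapezoidal integrator are illustrative assumptions.

```python
import math


def tx_pdf(x, f, F, r):
    """T-X density g(x) = f(x) / (1 - F(x)) * r(-log(1 - F(x)))."""
    survival = 1.0 - F(x)
    return f(x) / survival * r(-math.log(survival))


# Illustrative choice: T ~ gamma(shape 2, scale 1), X ~ exponential(rate 1).
def gamma_pdf(t, alpha=2.0, beta=1.0):
    return t ** (alpha - 1) * math.exp(-t / beta) / (math.gamma(alpha) * beta ** alpha)


def exp_pdf(x):
    return math.exp(-x)


def exp_cdf(x):
    return 1.0 - math.exp(-x)


def trapezoid(func, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi]."""
    step = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * step)
    return total * step


# The upper limit is kept at 30 so that 1 - F(x) stays positive in floating point.
area = trapezoid(lambda x: tx_pdf(x, exp_pdf, exp_cdf, gamma_pdf), 1e-9, 30.0)
```

With an exponential \(X\) of rate one, \(-\log (1-F(x))=x\) and the hazard is constant, so \(g\) coincides with the gamma density itself; this makes the example easy to verify by hand.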
Some remarks on the family of distributions defined in (2.5):
(a) The p.d.f. in (2.5) can be written as \(g(x)=h(x)r(H(x))\) and the corresponding c.d.f. is \(G(x)=R( -\log (1-F(x)))=R(H(x))\), where \(h(x)\) and \(H(x)\) are the hazard and cumulative hazard functions of the random variable \(X\) with c.d.f. \(F(x)\). Hence, this family of distributions can be considered as a family of distributions arising from a weighted hazard function.
(b) The fact that \(G(x)=R(-\log (1-F(x)))\) gives the relationship between the random variables \(X\) and \(T\): \(X={{F}^{-1}}(1-{{e}^{-T}})\). This provides an easy way of simulating the random variable \(X\): first simulate \(T\) from the p.d.f. \(r(t)\), then compute \(X={{F}^{-1}}( 1-{{e}^{-T}})\), which has the c.d.f. \(G(x)\). Thus, \(E(X)\) can be obtained using \(E(X)=E\{ {{F}^{-1}}( 1-{{e}^{-T}})\}\).
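The simulation recipe in remark (b) can be sketched as follows. The function names, the seed, and the choice of a Weibull \(T\) with a standard logistic \(X\) are assumptions made for illustration; the empirical check simply compares the share of draws below one point with \(G(x)=R(-\log (1-F(x)))\).

```python
import math
import random


def sample_tx(draw_t, F_inv, n, seed=0):
    """Draw n values of X = F^{-1}(1 - exp(-T)), with T produced by draw_t(rng)."""
    rng = random.Random(seed)
    return [F_inv(1.0 - math.exp(-draw_t(rng))) for _ in range(n)]


# Illustrative choice: T ~ Weibull(shape c, scale beta), X ~ standard logistic.
c, beta = 2.0, 1.5


def draw_weibull(rng):
    # random.weibullvariate takes (scale, shape), in that order.
    return rng.weibullvariate(beta, c)


def logistic_inv(u):
    return math.log(u / (1.0 - u))


def weibull_logistic_cdf(x):
    """G(x) = R(-log(1 - F(x))) with R the Weibull c.d.f. and F the logistic c.d.f."""
    t = math.log1p(math.exp(x))          # -log(1 - F(x)) for the standard logistic
    return 1.0 - math.exp(-((t / beta) ** c))


xs = sample_tx(draw_weibull, logistic_inv, 20000)

# Empirical share of draws at or below x0, to compare with G(x0).
x0 = 1.0
empirical = sum(x <= x0 for x in xs) / len(xs)
```

The sample mean of `xs` then gives a Monte Carlo estimate of \(E(X)=E\{ {{F}^{-1}}( 1-{{e}^{-T}})\}\).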
The quantile function, \(Q(\lambda ),\,\,0<\lambda <1\), for the \(T\)-\(X\) family of distributions can be computed by using the formula
\[ Q(\lambda )=F^{-1}\{ 1-e^{-R^{-1}(\lambda )}\} . \tag{2.6} \]
The Shannon [35] entropy of a random variable \(X\) is a measure of variation of uncertainty. Shannon entropy is defined as \(E\{ -\log (g(X))\}\). Theorem 1 shows the connection between the Shannon entropy of the new family of distributions, \(g(x)\), and the Shannon entropy of the generator, \(r(t)\).
Theorem 1
If a random variable \(X\) follows the family of distributions \(g(x)=\frac{f(x)}{1-F(x)}r(-\log (1-F(x)))\) in (2.5), then the Shannon entropy of \(X\), \({{\eta }_{X}}\), is given by
\[ \eta_X=-E\{ \log f( F^{-1}( 1-e^{-T}))\} -\mu_T+\eta_T, \tag{2.7} \]
where \({{\mu }_{T}}\) and \({{\eta }_{T}}\) are the mean and the Shannon entropy for the random variable \(T\) with p.d.f. \(r(t)\).
Proof
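By definition,
\[ \eta_X=E\{ -\log g(X)\} =-E\{ \log f(X)\} +E\{ \log (1-F(X))\} +E\{ -\log r(-\log (1-F(X)))\} . \]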
From (2.4), the random variable \(T=-\log ( 1-F(X))\) has the p.d.f. \(r(t)\), which implies the following: \(E(\log f(X))=E\{ \log f( {{F}^{-1}}(1-{{e}^{-T}}))\}\), \(E( \log (1-F(X)) )=-E(T)=-{{\mu }_{T}}\), and \(E(-\log r\{ -\log (1-F(X))\})=E(-\log r(T))={{\eta }_{T}}\).
Hence, \({{\eta }_{X}}=-E\{ \log f( {{F}^{-1}}( 1-{{e}^{-T}}))\}-{{\mu }_{T}}+{{\eta }_{T}}\), which is the result in (2.7). \(\square \)
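As a quick sanity check of Theorem 1, consider an illustrative case that is not taken from the paper: \(X\) uniform on \((0,1)\) and \(T\) exponential with mean \(\beta\). Both choices are assumptions made only to obtain a closed form on each side.

```latex
% Illustrative check of Theorem 1 (the choices of X and T are assumptions).
% X ~ uniform(0,1): f(x) = 1 and F^{-1}(u) = u, so E\{\log f(F^{-1}(1-e^{-T}))\} = 0.
% T ~ exponential with mean \beta: \mu_T = \beta and \eta_T = 1 + \log\beta.
\begin{align*}
\text{Theorem 1:}\quad \eta_X &= -0-\mu_T+\eta_T = 1-\beta+\log\beta .\\
\text{Direct:}\quad g(x) &= \tfrac{1}{\beta}(1-x)^{1/\beta-1},\qquad 0<x<1,\\
\eta_X &= -E\{\log g(X)\}
        = \log\beta-\left(\tfrac{1}{\beta}-1\right)E\{\log(1-X)\}\\
       &= \log\beta+\left(\tfrac{1}{\beta}-1\right)\beta
        = 1-\beta+\log\beta ,
\end{align*}
% using E\{\log(1-X)\} = -E(T) = -\beta.
```

The two routes agree, as the theorem predicts.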
Skewness and kurtosis of a parametric distribution are often measured by \({{\alpha }_{3}}={{\mu }_{3}}/{{\sigma }^{3}}\) and \({{\alpha }_{4}}={{\mu }_{4}}/{{\sigma }^{4}}\), respectively. When the third or fourth moment does not exist, as for the Cauchy, Lévy and Pareto distributions, \({{\alpha }_{3}}\) and \({{\alpha }_{4}}\) cannot be computed. For the \(T\)-\(X\) family, one may encounter some difficulty in computing the third and fourth moments. Alternative measures for skewness and kurtosis, based on quantile functions, are sometimes more appropriate for such distributions. The measure of skewness \(S\) defined by Galton [16] and the measure of kurtosis \(K\) defined by Moors [27] are based on quantile functions and they are defined as
\[ S=\frac{Q(6/8)-2Q(4/8)+Q(2/8)}{Q(6/8)-Q(2/8)}, \tag{2.8} \]
\[ K=\frac{Q(7/8)-Q(5/8)+Q(3/8)-Q(1/8)}{Q(6/8)-Q(2/8)}. \tag{2.9} \]
Skewness measures the degree of the long tail (towards left or right side). Kurtosis is a measure of the degree of tail heaviness. When the distribution is symmetric, \(S = 0\) and when the distribution is right (or left) skewed, \(S > 0\) (or \(< 0\)). As \(K\) increases, the tail of the distribution becomes heavier. For the \(T\)-\(X\) family, Galton’s skewness and Moors’ kurtosis can be computed by using the quantile function in (2.6) and the appropriate \(T\) and \(X\) distributions.
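The quantile-based measures can be computed directly once the quantile function of the \(T\)-\(X\) family is available as \(Q(p)={{F}^{-1}}(1-{{e}^{-{{R}^{-1}}(p)}})\). The sketch below is illustrative only: the function names and the parameter values \(c=0.5\), \(\beta =1\) for the Weibull \(T\) are assumptions.

```python
import math


def tx_quantile(p, F_inv, R_inv):
    """Quantile of the T-X family: Q(p) = F^{-1}(1 - exp(-R^{-1}(p)))."""
    return F_inv(1.0 - math.exp(-R_inv(p)))


def galton_skewness(Q):
    """Galton's skewness, built from the quartiles of Q."""
    return (Q(6 / 8) - 2 * Q(4 / 8) + Q(2 / 8)) / (Q(6 / 8) - Q(2 / 8))


def moors_kurtosis(Q):
    """Moors' kurtosis, built from the octiles of Q."""
    return (Q(7 / 8) - Q(5 / 8) + Q(3 / 8) - Q(1 / 8)) / (Q(6 / 8) - Q(2 / 8))


def logistic_inv(u):
    return math.log(u / (1.0 - u))


def exp_inv(p):
    # Inverse c.d.f. of the standard exponential distribution.
    return -math.log(1.0 - p)


def weibull_inv(p, c=0.5, beta=1.0):
    # Inverse c.d.f. of the Weibull distribution with shape c and scale beta.
    return beta * (-math.log(1.0 - p)) ** (1.0 / c)


# With a standard exponential T, the T-X distribution is X itself (here: logistic).
S_sym = galton_skewness(lambda p: tx_quantile(p, logistic_inv, exp_inv))

# Weibull-logistic example with the assumed parameter values.
Q_wl = lambda p: tx_quantile(p, logistic_inv, weibull_inv)
S_wl = galton_skewness(Q_wl)
K_wl = moors_kurtosis(Q_wl)
```

Because the logistic distribution is symmetric, `S_sym` is zero up to rounding error, which gives a simple check on the implementation.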
3 Some families of \(T\)-\(X\) distributions with different \(T\) distributions
The \(T\)-\(X\) family of distributions can be further classified into two sub-families: One sub-family has the same \(X\) distribution but different \(T\) distributions and the other sub-family has the same \(T\) distribution but different \(X\) distributions. For example, by letting \(T\) be a Weibull random variable, we generate a sub-family of Weibull-\(X\) distributions. By letting \(X\) be a Weibull random variable, we generate a sub-family of \(T\)-Weibull distributions. In this section, we consider the sub-family with different \(T\) distributions. Table 2 gives several such sub-families with the same \(X\) and different \(T\) random variables.
In each of the following sub-sections, we discuss the properties of the gamma-\(X\) family, beta-exponential-\(X\) family, and Weibull-\(X\) family.
3.1 Gamma-\(X\) family
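If a random variable \(T\) follows the gamma distribution with parameters \(\alpha \) and \(\beta \), then \(r(t)={{(\Gamma (\alpha ){{\beta }^{\alpha }})}^{-1}}{{t}^{\alpha -1}}{{e}^{-t/\beta }},\,\,\,\,t>0\). From (2.5), the p.d.f. of the gamma-\(X\) family is defined as
\[ g(x)=\frac{1}{\Gamma (\alpha )\beta ^{\alpha }}\,f(x)\,\{ -\log (1-F(x))\} ^{\alpha -1}(1-F(x))^{\frac{1}{\beta }-1}. \tag{3.1} \]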
By using (2.4) and expressing the c.d.f. of the gamma distribution in terms of the incomplete gamma function, \(R(t)=( 1/\Gamma (\alpha ))\gamma (\alpha ,t/\beta )\), where \(\gamma (\alpha ,t)=\int _{0}^{t}{{{u}^{\alpha -1}}{{e}^{-u}}du}\), the c.d.f. of the gamma-\(X\) family in (3.1) is \(G(x)=\gamma \{ \alpha ,-\log (1-F(x))/\beta \}/\Gamma (\alpha )\). We will refer to distributions of the form \({{(F(x))}^{c}}\) and \({{(1-F(x))}^{c}}\) as, respectively, the \(\text{ Exp }(F)\) and \(\text{ Exp }(1-F)\) families of distributions.
Lemma 1
The Shannon entropy of the gamma-\(X\) family of distributions is given by \({{\eta }_{X}}=-E\{ \log f({{F}^{-1}}( 1-{{e}^{-T}}))\}+\alpha (1-\beta )+\log \beta +\log \Gamma (\alpha )+(1-\alpha )\psi (\alpha )\), where \(\psi \) is the digamma function.
Proof
It follows from Theorem 1 by using \({{\mu }_{T}}=\alpha \beta \) and the Shannon entropy for the gamma distribution, which is given by Song [36] as \({{\eta }_{T}}\!=\!\alpha \!+\!\log \beta \!+\!\log \Gamma (\alpha )\!+\!(1-\alpha )\psi (\alpha )\). \(\square \)
When \(\alpha =1\), the gamma-\(X\) family in (3.1) reduces to the \(\text{ Exp }(1-F)\) distributions. When \(\alpha = n\) and \(\beta = 1\), the gamma-\(X\) family is the density function of the \(n\)th upper record value arising from a sequence \(\{ {{X}_{i}}\}\) of independent and identically distributed random variables with the p.d.f. \(f(x)\) and c.d.f. \(F(x)\) (see Johnson et al. [19, p. 99]). The generalized gamma distribution defined by Amoroso [3] is a member of the gamma-\(X\) family where \(X\) is the Weibull random variable. If \(f(x)=(c/\gamma ){{(x/\gamma )}^{c-1}}{{e}^{-{{(x/\gamma )}^{c}}}}\) is the p.d.f. of the Weibull distribution, then (3.1) becomes
\[ g(x)=\frac{c}{\gamma \,\Gamma (\alpha )\beta ^{\alpha }}\left( \frac{x}{\gamma }\right) ^{c\alpha -1}\exp \left\{ -\frac{1}{\beta }\left( \frac{x}{\gamma }\right) ^{c}\right\} ,\qquad x>0. \tag{3.2} \]
Setting \(\delta =\beta {{\gamma }^{c}}\) in Eq. (3.2), the distribution reduces to the generalized gamma distribution in Amoroso [3]. When \(c=\gamma =1\), (3.2) reduces to the gamma distribution.
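If \(f(x)=k{{\theta }^{k}}/{{x}^{k+1}},\,x>\theta \), is the p.d.f. of the Pareto distribution, then from (3.1) we get
\[ g(x)=\frac{k^{\alpha }}{x\,\Gamma (\alpha )\beta ^{\alpha }}\left( \frac{\theta }{x}\right) ^{k/\beta }\left( \log \frac{x}{\theta }\right) ^{\alpha -1},\qquad x>\theta . \]
On setting \(\beta /k=c\), we get
\[ g(x)=\frac{1}{x\,\Gamma (\alpha )c^{\alpha }}\left( \frac{\theta }{x}\right) ^{1/c}\left( \log \frac{x}{\theta }\right) ^{\alpha -1},\qquad x>\theta . \tag{3.3} \]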
Based on our naming convention, the distribution in (3.3) will be called gamma-Pareto distribution. When \(\alpha = 1\), (3.3) reduces to the Pareto distribution and hence the gamma-Pareto distribution can be considered as a generalization of the Pareto distribution. Figure 1 shows graphs of the gamma-Pareto density for different parameter values including the special cases. The figure shows that the shape parameter \(\alpha \) adds extra flexibility to the distribution by changing the shape of the density function from reversed J-shape to concave down shape for certain parameter values.
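The reduction to the Pareto distribution at \(\alpha =1\) can be checked numerically. The sketch below builds the gamma-Pareto density directly from the general form \(g(x)=\frac{f(x)}{1-F(x)}r(-\log (1-F(x)))\); the Pareto form \(F(x)=1-{{(\theta /x)}^{k}}\), \(x>\theta \), and the function names are assumptions made for illustration.

```python
import math


def gamma_pareto_pdf(x, alpha, beta, k, theta):
    """g(x) = f(x)/(1-F(x)) * r(-log(1-F(x))) with T ~ gamma and X ~ Pareto.

    Assumed Pareto form: F(x) = 1 - (theta / x)**k for x > theta.
    """
    if x <= theta:
        return 0.0
    hazard = k / x                       # f(x) / (1 - F(x)) for this Pareto form
    t = k * math.log(x / theta)          # -log(1 - F(x))
    r = t ** (alpha - 1) * math.exp(-t / beta) / (math.gamma(alpha) * beta ** alpha)
    return hazard * r


def pareto_pdf(x, shape, theta):
    """Pareto density with the given shape and scale theta."""
    return shape * theta ** shape / x ** (shape + 1) if x > theta else 0.0
```

For example, `gamma_pareto_pdf(2.5, 1.0, 2.0, 3.0, 1.0)` agrees with `pareto_pdf(2.5, 1.5, 1.0)`, a Pareto density with shape \(k/\beta =1/c\). The density also depends on \(\beta \) and \(k\) only through the ratio \(c=\beta /k\).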
The c.d.f. of the gamma-Pareto distribution in (3.3) is \(G(x)=\gamma \{ \alpha ,{{c}^{-1}}\log ( x/\theta ) \}/\Gamma (\alpha )\), and hence the quantile function of the gamma-Pareto distribution is the solution of equation \(G(x)=p,\,\,0\le p\le 1\). To investigate the effect of the two shape parameters \(\alpha \) and \(c\) on the gamma-Pareto density function, Eqs. (2.8) and (2.9) are used to obtain Galton’s skewness and Moors’ kurtosis. Figure 2 displays the Galton’s skewness and Moors’ kurtosis for the gamma-Pareto distribution in terms of the parameters \(\alpha \) and \(c\) when \(\theta =1.\)
From Fig. 2 and the corresponding data values (not included to save space), the Galton’s skewness is always positive which indicates that the gamma-Pareto distribution is right skewed. For fixed \(c\ge 1\), the Galton’s skewness is an increasing function of \(\alpha \). For fixed \(c < 1\), the Galton’s skewness is a decreasing function of \(\alpha \) and for fixed \(\alpha \), the Galton’s skewness is an increasing function of \(c\). The Moors’ kurtosis is an increasing function of \(\alpha \) and \(c\).
3.2 Beta-exponential-\(X\) family
If a random variable \(T\) follows the beta-exponential distribution in Nadarajah and Kotz [30], then \(r(t)=\lambda {{(B(\alpha ,\beta ))}^{-1}}{{e}^{-\lambda \beta t}}{{(1-{{e}^{-\lambda t}})}^{\alpha -1}},\,\,t>0\). From (2.5), the p.d.f. of the beta-exponential-\(X\) family is defined as
\[ g(x)=\frac{\lambda }{B(\alpha ,\beta )}\,f(x)\,(1-F(x))^{\lambda \beta -1}\{ 1-(1-F(x))^{\lambda }\} ^{\alpha -1}. \tag{3.4} \]
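The c.d.f. of (3.4) can be expressed in terms of the incomplete beta function \({{I}_{x}}(a,b)\). Since the c.d.f. of \(T\) is \(R(t)={{I}_{1-{{e}^{-\lambda t}}}}(\alpha ,\beta )\), the c.d.f. of the beta-exponential-\(X\) family is \(G(x)={{I}_{1-{{(1-F(x))}^{\lambda }}}}(\alpha ,\beta )=1-{{I}_{{{(1-F(x))}^{\lambda }}}}(\beta ,\alpha )\).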
Lemma 2
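The Shannon entropy of the beta-exponential-\(X\) family of distributions is given by
\[ \eta_X=-E\{ \log f( F^{-1}( 1-e^{-T}))\} -\frac{\psi (\alpha +\beta )-\psi (\beta )}{\lambda }+\log ( \lambda ^{-1}B(\alpha ,\beta ))+(\alpha +\beta -1)\psi (\alpha +\beta )-(\alpha -1)\psi (\alpha )-\beta \psi (\beta ). \]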
Proof
It follows from Theorem 1 by using the mean \({{\mu }_{T}}=[\psi (\alpha +\beta )-\psi (\beta )]/\lambda \) and the Shannon entropy \({{\eta }_{T}}=\log ({{\lambda }^{-1}}B(\alpha ,\beta ) )+(\alpha +\beta -1)\psi (\alpha +\beta )-(\alpha -1)\psi (\alpha )-\beta \psi (\beta )\) for the beta-exponential distribution, which are given by Nadarajah and Kotz [30]. \(\square \)
Special cases of beta-exponential-\(X\) family:
(1) The beta-generated family in (1.10) is a special case of (3.4) when \(\lambda = 1\). Hence, the family of distributions in (3.4) can be used to generate all the distributions belonging to the beta-generated family.
(2) When \(\alpha =1\), the beta-exponential-\(X\) family reduces to the \(\text{ Exp }(1-F)\) distributions. When \(\beta =1\) and \(\lambda =1\), the beta-exponential-\(X\) family reduces to the \(\text{ Exp }(F)\) distributions.
(3) When \(\beta = 1\), (3.4) reduces to the exponentiated-exponential-\(X\) family with p.d.f.
\[ g(x)=\alpha \lambda \,f(x)\,(1-F(x))^{\lambda -1}\{ 1-(1-F(x))^{\lambda }\} ^{\alpha -1}. \tag{3.5} \]
The c.d.f. of (3.5) can be written as \(G(x)={{\{ 1-{{( 1-F(x))}^{\lambda }}\}}^{\alpha }}\).
By using \(D(x)=1-F(x)\) in (3.5), the exponentiated-exponential-\(X\) family reduces to the \(KW\)-\(G\) family.
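If \(X\) is the uniform random variable on \((a,\,b)\), then from (3.4) the beta-exponential-uniform distribution is defined as
\[ g(x)=\frac{\lambda }{(b-a)B(\alpha ,\beta )}\left( \frac{b-x}{b-a}\right) ^{\lambda \beta -1}\left\{ 1-\left( \frac{b-x}{b-a}\right) ^{\lambda }\right\} ^{\alpha -1},\qquad a<x<b. \tag{3.6} \]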
If we use the transformation \(y=1-x\) in (3.6) then the distribution reduces to the (i) generalized beta distribution of the first kind (McDonald [26]), when \(b = 1\), (ii) beta distribution when \(a = 0\) and \(b=\lambda =1\), and (iii) Kumaraswamy’s [25] double bounded distribution when \(a = 0\) and \(b=\beta =1\).
The exponentiated-Weibull distribution defined by Mudholkar et al. [28] is a member of exponentiated-exponential-\(X\) family in (3.5) when \(X\) is the Weibull random variable. If \(f(x)\) is the p.d.f. of the Weibull distribution, then (3.5) reduces to
Writing \(\delta =\lambda {{\gamma }^{c}}\), (3.7) reduces to the exponentiated-Weibull distribution given by Mudholkar et al. [28]. When \(\gamma =c= 1\), (3.7) reduces to the exponentiated-exponential distribution defined by Gupta and Kundu [17]. When \(\lambda =c=1\), (3.7) reduces to the Weibull distribution. When \(\lambda =\gamma =c=1\), (3.7) reduces to the exponential distribution.
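The type I generalized logistic distribution given by Johnson et al. [20, p. 140] is a special case of the exponentiated-exponential-logistic distribution. If \(f(x)\) is the p.d.f. of the standard logistic distribution, then (3.5) reduces to
\[ g(x)=\alpha \lambda \,e^{x}(1+e^{x})^{-\lambda -1}\{ 1-(1+e^{x})^{-\lambda }\} ^{\alpha -1},\qquad -\infty <x<\infty . \tag{3.8} \]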
When \(\lambda =1\), the exponentiated-exponential-logistic distribution in (3.8) reduces to type I generalized logistic distribution. When \(\alpha = \lambda = 1\), (3.8) reduces to standard logistic distribution.
Figure 3 shows graphs of the exponentiated-exponential-logistic density functions for different parameter values including the special cases.
The c.d.f. of the exponentiated-exponential-logistic distribution in equation (3.8) is \(G(x)={{( 1-{{(1+{{e}^{x}})}^{-\lambda }})}^{\alpha }},\) and hence the quantile function of the exponentiated-exponential-logistic distribution can be written as
\[ Q(p)=\log \{ (1-p^{1/\alpha })^{-1/\lambda }-1\} ,\qquad 0<p<1. \tag{3.9} \]
By using (3.9), (2.8) and (2.9), one can obtain the Galton’s skewness and the Moors’ kurtosis for the exponentiated-exponential-logistic distribution. Figure 4 displays the Galton’s skewness and Moors’ kurtosis for the exponentiated-exponential-logistic distribution in terms of the parameters \(\alpha \) and \(\lambda \).
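The quantile function follows from inverting the c.d.f. \(G(x)={{( 1-{{(1+{{e}^{x}})}^{-\lambda }})}^{\alpha }}\). A minimal sketch is given below; the function names are illustrative, and the round trip \(G(Q(p))=p\) serves as a consistency check.

```python
import math


def eel_cdf(x, alpha, lam):
    """Exponentiated-exponential-logistic c.d.f.: (1 - (1 + e^x)^(-lam))^alpha."""
    return (1.0 - (1.0 + math.exp(x)) ** (-lam)) ** alpha


def eel_quantile(p, alpha, lam):
    """Solve G(x) = p for x, with 0 < p < 1."""
    return math.log((1.0 - p ** (1.0 / alpha)) ** (-1.0 / lam) - 1.0)
```

When \(\alpha =\lambda =1\) the quantile reduces to \(\log \{ p/(1-p)\}\), the quantile function of the standard logistic distribution. The function `eel_quantile` can be passed to any routine that evaluates Galton's skewness and Moors' kurtosis from quantiles.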
From Fig. 4 and the corresponding data values (not included in order to save space), the exponentiated-exponential-logistic distribution can be left skewed, right skewed, and symmetric. For fixed \(\lambda >1\), the Galton’s skewness is an increasing function of \(\alpha \), and for fixed \(\alpha \), the Galton’s skewness is a decreasing function of \(\lambda \). For fixed \(\alpha \), the Moors’ kurtosis is a decreasing function of \(\lambda \) when \(\lambda >1\), and for fixed \(\lambda \), the Moors’ kurtosis is a decreasing function of \(\alpha \) when \(\alpha >1\).
3.3 Weibull-\(X\) family
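If a random variable \(T\) follows the Weibull distribution with parameters \(c\) and \(\beta \), then \(r(t)=( c/\beta ){{( t/\beta )}^{c-1}}{{e}^{-{{( t/\beta )}^{c}}}},\,\,\,t\ge 0\). From (2.5) the Weibull-\(X\) family is given by
\[ g(x)=\frac{c}{\beta }\,\frac{f(x)}{1-F(x)}\left\{ \frac{-\log (1-F(x))}{\beta }\right\} ^{c-1}\exp \left\{ -\left( \frac{-\log (1-F(x))}{\beta }\right) ^{c}\right\} . \tag{3.10} \]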
The c.d.f. of the Weibull distribution is \(R(t)=1-{{e}^{-{{( t/\beta )}^{c}}}}\) and hence from (2.4) the c.d.f. of the Weibull-\(X\) family is
\[ G(x)=1-\exp \left\{ -\left( \frac{-\log (1-F(x))}{\beta }\right) ^{c}\right\} . \tag{3.11} \]
Lemma 3
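The Shannon entropy of the Weibull-\(X\) family of distributions is given by
\[ \eta_X=-E\{ \log f( F^{-1}( 1-e^{-T}))\} -\beta \,\Gamma (1+1/c)+\gamma (1-1/c)-\log (c/\beta )+1, \]
where \(\gamma \) is Euler’s constant.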
Proof
It follows from Theorem 1 by using the mean \({{\mu }_{T}}=\beta \,\Gamma (1+1/c)\) and the Shannon entropy \({{\eta }_{T}}=\gamma (1-1/c)-\log (c/\beta )+1\) for the Weibull distribution, which is given by Song [36]. \(\square \)
When \(c=1\), the Weibull-\(X\) family reduces to the \(\text{ Exp }(1-F(x))\) distributions. The type II generalized logistic distribution is a special case of the Weibull-logistic distribution. If \(F(x)\) is the c.d.f. of the standard logistic distribution, then (3.11) reduces to
\[ G(x)=1-\exp \left\{ -\left( \frac{\log (1+e^{x})}{\beta }\right) ^{c}\right\} ,\qquad -\infty <x<\infty . \tag{3.12} \]
When \(c =1\), the distribution in (3.12) reduces to type II generalized logistic distribution.
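The \(c=1\) reduction can be verified numerically from the Weibull-\(X\) c.d.f. \(G(x)=1-\exp \{ -{{(-\log (1-F(x))/\beta )}^{c}}\}\). The sketch below is illustrative; the function names are assumptions, and the comparison uses \(1-{{(1+{{e}^{x}})}^{-1/\beta }}\), the form the c.d.f. takes for a standard logistic \(X\) when \(c=1\).

```python
import math


def weibull_x_cdf(x, F, c, beta):
    """Weibull-X c.d.f.: G(x) = 1 - exp(-(-log(1 - F(x)) / beta)^c)."""
    t = -math.log(1.0 - F(x))
    return 1.0 - math.exp(-((t / beta) ** c))


def logistic_cdf(x):
    """Standard logistic c.d.f."""
    return 1.0 / (1.0 + math.exp(-x))


def exp_one_minus_f_logistic_cdf(x, beta):
    """c = 1 case for a logistic X: 1 - (1 - F(x))^(1/beta) = 1 - (1 + e^x)^(-1/beta)."""
    return 1.0 - (1.0 + math.exp(x)) ** (-1.0 / beta)
```

For instance, `weibull_x_cdf(0.4, logistic_cdf, 1.0, 2.0)` and `exp_one_minus_f_logistic_cdf(0.4, 2.0)` agree to rounding error.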
Figure 5 shows graphs of the Weibull-logistic density functions for different parameter values including the special case.
From (3.12), the quantile function of the Weibull-logistic distribution can be written as
\[ Q(p)=\log \left\{ \exp \left( \beta (-\log (1-p))^{1/c}\right) -1\right\} ,\qquad 0<p<1. \tag{3.13} \]
Equations (3.13), (2.8) and (2.9) can be used to obtain Galton’s skewness and Moors’ kurtosis. Figure 6 displays the Galton’s skewness and Moors’ kurtosis for the Weibull-logistic distribution in terms of parameters \(\beta \) and \(c\).
Figure 6 and the corresponding data values (not included to save space) indicate that the Weibull-logistic distribution can be left skewed, right skewed, and symmetric. For fixed \(\beta \), the Galton’s skewness is a decreasing function of \(c\), and for fixed \(c\), the Galton’s skewness is an increasing function of \(\beta \). For fixed \(c\), the Moors’ kurtosis is an increasing function of \(\beta \) when \(c\le 1\) and a decreasing function of \(\beta \) when \(c>1\).
4 Summary and conclusion
A method to generate new families of distributions is introduced. This technique defines a new family of distributions using the composite function \((R\circ W\circ F)(x)\) with \(R\) and \(F\) being the c.d.f.s of the random variables \(T\) and \(X\), respectively. The \(W(.)\) function is defined to link the support of \(T\) to the range of \(X\). This technique generates a large number of new distributions as well as existing distributions as special cases. Table 1 contains several different variants of \(T\)-\(X\) families using different \(W(.)\) functions.
This article focuses on \(W(F(x))=-\log ( 1-F(x) )\), where the support of \(T\) is \([0,\,\infty )\). Some properties of this \(T\)-\(X\) family are studied. Besides using functions of moments for measuring skewness and kurtosis, we suggest Galton’s measure of skewness and Moors’ measure of kurtosis. Three sub-families of \(T\)-\(X\) family, namely gamma-\(X\) family, beta-exponential-\(X\) family and Weibull-\(X\) family are discussed. These sub-families demonstrate that the \(T\)-\(X\) family consists of many sub-families of distributions. Within each sub-family, one can define many new distributions as well as relate its members to many existing distributions.
Table 2 summarizes various sub-families based on different \(T\) distributions with the same \(X\) distribution. New distributions discussed include gamma-Pareto, exponentiated-exponential-logistic and Weibull-logistic distributions. In general, it is difficult to see how the shapes of the \(T\) and \(X\) distributions will affect the \(T\)-\(X\) distribution. We believe that a relationship may exist for some specific \(T\) and \(X\) distributions. For the gamma distribution, \(\alpha \) is a shape parameter while \(\beta \) is a scale parameter. For the Pareto distribution, \(\theta \) is a scale parameter and \(k\) is a shape parameter. After forming the gamma-Pareto distribution, \(\theta \) remains a scale parameter, \(\beta /k = c\) becomes a shape parameter. The study of the properties, parameter estimation and applications of these new distributions are currently under investigation. For example, Alzaatreh et al. [2] defined and studied the gamma-Pareto distribution, a member of the gamma-\(X\) family. Three real data sets were used to illustrate the applications of the gamma-Pareto distribution. The illustration showed that the gamma-Pareto distribution is a good model to fit data sets with various kinds of shapes.
Figure 7 provides a tree-relationship of the \(T\)-\(X\) family addressed in this article. As Fig. 7 shows, the \(T\)-\(X\) family consists of many sub-families, in which new distributions can be defined and various existing distributions are special cases.
The variants of \(T\)-\(X\) families in Table 1 will define many potential new distributions that deserve further study. Some of these variants are currently under investigation. Future research for the \(T\)-\(X\) family may include (i) the investigation of general properties of distributions generated using different \(W(.)\) functions, (ii) defining and investigating the properties of specific new distributions, (iii) studying new methods for estimating the parameters in addition to the well-known moments and maximum likelihood (ML) methods, and (iv) applying these new distributions to fit different types of data sets. Based on our experience, the ML method may be challenging for more than three parameters. A better estimation method will be needed for distributions with four or more parameters.
During the recent decade, many new distributions developed in the literature seem to focus on more general and flexible distributions. Using the technique that generates the \(T\)-\(X\) family, one can develop new distributions that may be very general and flexible, or that are suited to fitting specific types of data distributions, such as highly left-tailed (right-tailed, thin-tailed, or heavy-tailed) distributions as well as bimodal distributions. There are only a few existing distributions that are known to be capable of fitting bimodal shapes. One such distribution that has been successfully applied to fit real world data sets is the beta-normal distribution by Eugene et al. [10] and Famoye et al. [11]. Our limited investigation in the \(T\)-\(X\) family suggests that there are new distributions that can fit not only unimodal and bimodal, but also multimodal distributions.
This article focuses on the case when both \(T\) and \(X\) are continuous random variables. This technique can be extended to develop a discrete \(T\)-\(X\) family of distributions where \(T\) is continuous and \(X\) is discrete. Different considerations for the \(W(.)\) functions will be needed.
References
Akinsete, A., Famoye, F., Lee, C.: The beta-Pareto distribution. Statistics 42, 547–563 (2008)
Alzaatreh, A., Famoye, F., Lee, C.: Gamma-Pareto distribution and its applications. J. Mod. Appl. Stat. Methods 11(1), 78–94 (2012)
Amoroso, L.: Ricerche intorno alla curva dei redditi. Annali di Matematica 2, 123–159 (1925)
Azzalini, A.: A class of distributions which includes the normal ones. Scand. J. Stat. 12, 171–178 (1985)
Burr, I.W.: Cumulative frequency functions. Ann. Math. Stat. 13, 215–232 (1942)
Cordeiro, G.M., de Castro, M.: A new family of generalized distributions. J. Stat. Comput. Simul. 81(7), 883–898 (2011)
Cordeiro, G.M., Ortega, E.M.M., Nadarajah, S.: The Kumaraswamy Weibull distribution with application to failure data. J. Frankl. Inst. 347, 1399–1429 (2010)
Cordeiro, G.M., Pescim, R.R., Ortega, E.M.M.: The Kumaraswamy generalized half-normal distribution for skewed positive data. J. Data Sci. 10, 195–224 (2012)
de Castro, M.A.R., Ortega, E.M.M., Cordeiro, G.M.: The Kumaraswamy generalized gamma distribution with application in survival analysis. Stat. Methodol. 8(5), 411–433 (2011)
Eugene, N., Lee, C., Famoye, F.: The beta-normal distribution and its applications. Commun. Stat. Theory Methods 31(4), 497–512 (2002)
Famoye, F., Lee, C., Eugene, N.: Beta-normal distribution: bimodality properties and applications. J. Mod. Appl. Stat. Methods 3(1), 85–103 (2004)
Famoye, F., Lee, C., Olumolade, O.: The beta-Weibull distribution. J. Stat. Theory Appl. 4(2), 121–136 (2005)
Ferreira, J.T.A.S., Steel, M.F.J.: A constructive representation of univariate skewed distributions. J. Am. Stat. Assoc. 101(474), 823–829 (2006)
Freimer, M., Kollia, G., Mudholkar, G.S., Lin, C.T.: A study of the generalized Tukey lambda family. Commun. Stat. Theory Methods 17, 3547–3567 (1988)
Fry, T.R.L.: Univariate and multivariate Burr distributions: a survey. Pak. J. Stat. Ser. A 9, 1–24 (1993)
Galton, F.: Enquiries into Human Faculty and its Development. Macmillan & Company, London (1883)
Gupta, R.D., Kundu, D.: Exponentiated-exponential family: an alternative to gamma and Weibull distributions. Biom. J. 43, 117–130 (2001)
Johnson, N.L.: Systems of frequency curves generated by methods of translation. Biometrika 36, 149–176 (1949)
Johnson, N.L., Kotz, S., Balakrishnan, N.: Continuous Univariate Distributions, vol. 1, 2nd edn. Wiley, New York (1994)
Johnson, N.L., Kotz, S., Balakrishnan, N.: Continuous Univariate Distributions, vol. 2, 2nd edn. Wiley, New York (1995)
Jones, M.C.: Families of distributions arising from distributions of order statistics. Test 13(1), 1–43 (2004)
Jones, M.C.: Kumaraswamy's distribution: a beta-type distribution with tractability advantages. Stat. Methodol. 6, 70–81 (2009)
Karian, Z.A., Dudewicz, E.: Fitting Statistical Distributions—The Generalized Lambda Distribution and Generalized Bootstrap Methods. Chapman & Hall/CRC Press, Boca Raton (2000)
Kong, L., Lee, C., Sepanski, J.H.: On the properties of beta-gamma distribution. J. Mod. Appl. Stat. Methods 6(1), 187–211 (2007)
Kumaraswamy, P.: A generalized probability density function for double-bounded random processes. J. Hydrol. 46, 79–88 (1980)
McDonald, J.B.: Some generalized functions for the size distribution of income. Econometrica 52, 647–663 (1984)
Moors, J.J.: A quantile alternative for kurtosis. Statistician 37, 25–32 (1988)
Mudholkar, G.S., Srivastava, D.K., Freimer, M.: The exponentiated Weibull family: a reanalysis of the bus-motor-failure data. Technometrics 37(4), 436–445 (1995)
Nadarajah, S., Kotz, S.: The beta Gumbel distribution. Math. Probl. Eng. 4, 323–332 (2004)
Nadarajah, S., Kotz, S.: The beta exponential distribution. Reliab. Eng. Syst. Saf. 91(6), 689–697 (2005)
Pearson, K.: Contributions to the mathematical theory of evolution. II. Skew variation in homogeneous material. Philos. Trans. Royal Soc. Lond. A 186, 343–414 (1895)
Ramberg, J.S., Schmeiser, B.W.: An approximate method for generating symmetric random variables. Commun. Assoc. Comput. Mach. 15, 987–990 (1972)
Ramberg, J.S., Schmeiser, B.W.: An approximate method for generating asymmetric random variables. Commun. Assoc. Comput. Mach. 17, 78–82 (1974)
Ramberg, J.S., Tadikamalla, P.R., Dudewicz, E.J., Mykytka, E.F.: A probability distribution and its uses in fitting data. Technometrics 21, 201–214 (1979)
Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423 (1948)
Song, K.-S.: Rényi information, log likelihood and an intrinsic distribution measure. J. Stat. Plan. Inference 93, 51–69 (2001)
Tukey, J.W.: The practical relationship between the common transformations of percentages of counts and amounts. Technical Report 36, Statistical Techniques Research Group, Princeton University, Princeton, NJ (1960)
Acknowledgments
The authors are grateful for the comments and suggestions by the referees and the Editor-in-Chief. Their comments and suggestions have greatly improved the paper.
Cite this article
Alzaatreh, A., Lee, C. & Famoye, F. A new method for generating families of continuous distributions. METRON 71, 63–79 (2013). https://doi.org/10.1007/s40300-013-0007-y