1 Introduction

Stochastic frontier analysis is an econometric methodology used for measuring production efficiency or cost efficiency of firms. Stochastic frontier analysis was introduced in Aigner et al. (1977) and Meeusen and Broeck (1977). A firm’s failure to produce the maximum possible output or failure to reach the minimum possible cost for a given level of output is termed as inefficiency score. Aigner et al. (1977) and Meeusen and Broeck (1977) used exponential distribution and half-normal distribution for the inefficiency score. Assuming exponential distributions or half-normal distributions for the inefficiency score have an advantage of getting a closed-form likelihood function. However, there are shortcomings with a half-normal probability distribution and an exponential distribution to be assumed for the inefficiency score. One limitation is that both probability distributions have a mode value of zero, which means the model assumes that firms are fully efficient. Broeck et al. (1994) noted that for some authors the zero mode of an exponential distribution is too strong of an assumption. Another shortcoming is that both a half-normal probability distribution and an exponential distribution are governed by a single parameter and therefore they are less flexible (Greene 1990). Furthermore, both the half-normal distribution and the exponential distribution have limitations because of their shape. Due to these limitations many other probability distributions have been assumed for the inefficiency part of a stochastic frontier model (SFM).

As alternative to exponential distribution and half-normal distribution for the inefficiency part of SFMs, different one-sided distributions have been proposed. For example, a truncated normal distribution in Stevenson (1980), Rayleigh distribution in Hajargasht (2015), and more recently generalized exponential distribution Papadopoulos (2021). These distributions have addressed some of the limitations of using an exponential distribution and a half-normal distribution. However, these new probability density functions are still less flexible.

In order make the inefficiency score more flexible various two parameter distributions are introduced. The gamma distribution is proposed in Stevenson (1980), Beckers and Hammond (1987) and Greene (2003). Griffin and Steel (2008) have used a flexible probability distribution of a generalized gamma distribution and a mixture of generalized gamma distribution. The shortcoming of using a generalized gamma distribution or a mixture of generalized gamma distributions for the inefficiency score is that they are not suitable for maximum likelihood estimation of a SFM and the estimation method is carried out using Bayesian methods (Griffin and Steel 2008). Meeusen and Broeck (1977) recommend the beta distribution for the efficiency level or beta standard exponential distribution for the inefficiency score. And they pointed out that assuming the efficiency level to follow a beta distribution led to insurmountable difficulty of finding a closed form likelihood function (Meeusen and Broeck 1977).

A numerical integration technique can circumvent the problem of getting a closed form likelihood function for flexible SFMs. Tsionas (2012) applied a numerical integration technique called a Fourier transform method and estimated a normal-beta standard exponential stochastic frontier method. Another numerical integration technique called a Monte Carlo (MC) method is used in Greene (2003) to derive a simulated likelihood function of a normal-gamma SFM. This article uses a Monte Carlo (MC) simulation technique to get a simulated likelihood function of a normal-beta exponential SFM.

The flexibility of a beta exponential distribution is one of its attractive features in using it as the probability density function of an inefficiency score. So far, I have not seen a SFM with the inefficiency score following a beta-exponential distribution. A beta exponential distribution has three parameters, and it contains another known probability distribution function as its special case. It generalizes an exponential distribution of Meeusen and Broeck (1977), one-parameter generalized exponential distribution of Papadopoulos (2021), beta standard exponential distribution of Tsionas (2012), exponentiated exponential distribution, and a weighted exponential distribution.

The following section presents the rationale for introducing the beta exponential distribution into stochastic frontier modeling. Section 3 discusses the mathematical properties of a beta exponential distribution. Section 4 is about the general framework of a SFM. Section 5 introduces the normal-beta exponential SFM. The simulated log-likelihood function of the normal-beta exponential SFM is in Section 6. And the JLMS inefficiency estimator is presented in Part 7. Data application of the normal-beta exponential SFM is in Section 8. Finally, Section 9 concludes.

2 Why beta exponential distribution?

This section presents some of the arguments for introducing a beta-exponential distribution into the mainstream of stochastic frontier modeling. The first argument for introducing a new probability distribution is based on the aphorism in statistics, also in econometrics, that all models are wrong, but some are useful. In short, the estimated efficiency level of a firm is dependent on the choice of probability distribution for the inefficiency score of a SFM. Thus, selecting appropriate probability distributions for the inefficiency score is crucial for accurate estimation of the SFM (Cheng et al. 2023; Lai and Huang 2010). Empirical studies have shown that different distributional assumptions on the inefficiency score gives different results (Cheng et al. 2023). Therefore, applied researchers on efficiency analysis are recommended to have a beta exponential distribution in their tool kit and verify its usefulness based on data.

Another compelling argument in support of the beta-exponential distribution in a stochastic frontier analysis is its flexibility. The beta-exponential distribution generalizes three probability distributions which are known in the literatures on stochastic frontier analysis and another two probability distributions which are not popular in the stochastic frontier literature. Of the previously considered probability distributions for the inefficiency score, the beta exponential distribution generalizes the exponential distribution used in Meeusen and Broeck, (1977), the one-parameter generalized exponential distribution in Papadopoulos (2021), and the beta standard exponential distribution in Tsionas (2012). Papadopoulos (2021) has noted that a generalized exponential distribution has the shortcoming of not containing the zero-mode distribution as its special case distributions. However, a beta exponential distribution addresses this problem by nesting the zero-mode case and other flexible probability distributions. The other two special case distributions are the exponentiated exponential distribution and the weighted exponential distribution.

Final important argument for introducing a beta exponential distribution into stochastic frontier analysis is that one of its special case distributions, a weighted exponential distribution, is both a flexible distribution and provides a closed-form likelihood function and a closed form inefficiency estimator. A weighted exponential distribution contains the one-parameter generalized exponential distribution, used in Papadopoulos (2021), as a special.

3 Beta exponential distribution

In stochastic frontier modeling, commonly the inefficiency score is assumed to follow a particular probability density function defined on the interval \((0,\infty )\). It is also possible to assume that the efficiency level follows a probability distribution defined on the interval \((\mathrm{0,1})\). In our case, we can use the beta exponential distribution of Naradajah and Kotz (2006) directly as the distribution of the inefficiency score. Alternatively, we can also assume the beta distribution for the efficiency level and derive the distribution of the inefficiency score using transformation techniques. Either way we ended up using the beta exponential distribution as the distribution of inefficiency score. In this section we assume the efficiency level to follow a beta distribution and use the transformation technique to get the density function of the inefficiency score, which is a beta exponential density function.

We can specify the relationship between the efficiency level and inefficiency score as

$$U={e}^{-u},$$

where \(U\in (\mathrm{0,1})\) is the efficiency level and \(u\ge 0\) is the inefficiency score. Under the SFM's cost function, the coefficient of inefficiency score \(u\) is positive. The efficiency level is defined as the ratio of actual production to the frontier (i.e., stochastic frontier) and the inefficiency score describes the level of output by which a firm is far from the frontier. If the inefficiency score is zero, \(u=0\), then the efficiency level is one, \(U=1\). An efficiency level of one indicates that the firm is fully efficient, or the firm is producing on the frontier. In general, the relationship between the efficiency level and the inefficiency score can be stated as:

$$\underset{u\to \infty }{\mathrm{lim}}{e}^{-u}=0\;\;\; \text {or}\;\;\; \underset{u\to 0}{\mathrm{lim}}{e}^{-u}=1$$

Tsionas (2012) assumed a beta distribution for the efficiency level, \(U={e}^{-u}\), where the distribution of \(u\) will be derived using the transformation technique of a random variable. Similarly, we can specify the efficiency level with additional parameter \(U={e}^{-\lambda u}\) and assume \(1-U\) to follow the beta distribution. Note that the function \(1-{e}^{-\lambda u}\) is the cumulative distribution function of the exponential distribution and assuming it to follow a beta distribution gives the beta exponential distribution.

The beta probability density function of a random variable \(U\) is given by

$$Beta(U;\; \alpha ,\beta )=\frac{1}{B\left(\alpha ,\beta \right)}{U}^{\alpha -1}(1-{U)}^{\beta -1}$$

where \(\alpha\) and \(\beta\) are shape parameters. \(B\left(\alpha ,\beta \right)\) is the beta function, \(B\left(\alpha ,\beta \right)=\frac{\Gamma (\alpha )\Gamma (\beta )}{\Gamma (\alpha +\beta )},\) and \(\Gamma (*)\) is the gamma function.

To derive the distribution of the inefficiency score \(u\), we take natural logarithms of efficiency level \(U\). That is

$$u=\frac{-\mathrm{ln}(1-U)}{\lambda }.$$

The inverse of the above natural logarithm function exists, and it is a one-to-one function. Therefore, we can use the transformation technique and derive a density function of inefficiency score \(u\).

The general formula for the transformation technique is

$$f\left(u\right)=|\frac{\partial {f}^{-1}(u)}{\partial u}|f(U({f}^{-1}(u))).$$

In our case

$$\frac{\partial \,{f}^{-1}\left(u\right)}{\partial u}=-\lambda {e}^{-\lambda u},$$

and

$$f(U({f}^{-1}(u)))=\frac{1}{B(\alpha ,\beta )}{e}^{-\lambda u(\alpha -1)}(1-{{e}^{-\lambda u})}^{\beta -1}.$$

Combining the above two function gives us the probability density function of the inefficiency score, \(u\).

$$f(u)=\frac{\lambda }{B\left(\alpha ,\beta \right)}{e}^{-\alpha \lambda u}(1-{{e}^{-\uplambda u})}^{\beta -1},$$
(3.1)

where \(\alpha >0\), \(\beta >0\), and \(\lambda >0\). \(B\left(\alpha ,\beta \right)\) is the beta function, \(B\left(\alpha ,\beta \right)=\frac{\Gamma (\alpha )\Gamma (\beta )}{\Gamma (\alpha +\beta )},\) and \(\Gamma (*)\) is the gamma function.

Following the approach proposed by Jones (2004) and Eugene et al. (2002), Naradajah and Kotz (2006) have also derived the beta exponential distribution. The mean and variance are provided in Naradajah and Kotz (2006), which are

$$E\left(u\right)=\frac{\psi \left(\alpha \right)-\psi \left(\alpha +\beta \right)}{\lambda },$$

and

$$Var\left(u\right)=\frac{{\psi }^{\left(1\right)}\left(\alpha \right)-{\psi }^{\left(1\right)}\left(\alpha +\beta \right)}{{\lambda }^{2}},$$

where \({\psi }^{1}\left(*\right)\) is the first-order derivative of the digamma function (Fig. 1).

Fig. 1
figure 1

Beta Exponential Distribution

The beta exponential distribution contains another known probability density function as its special case. If \(\alpha =1\), the above density degenerates to the exponentiated exponential distribution of Gupta and Kundu (2001). If \(\beta =2\) and reparametrizing \(\alpha =\frac{1}{a}\) and \(\lambda =ac\), we get a weighted exponential distribution with shape parameter \(a\) and scale parameter \(c\) (Gupta and Kundu 2009). If \(\alpha =1\) and \(\beta =2\), then we get the one parameter generalize exponential distribution used in Papadopoulos (2021). If \(\lambda =1\), then it becomes the beta standard exponential distribution of Tsionas (2012), and for \(\beta =1\) the distribution becomes a exponential distribution with rate parameter \(\alpha \lambda .\)

4 Stochastic frontier models

A production function SFM with inefficiency score \(u\) and statistical error \(v\) is specified as

$$y=f\left(x;b\right){e}^{v}{e}^{-u},$$

where \(v\in \left(-\infty ,\infty \right)\) and \(u\ge 0\)\(f\left(x;b\right)\) is a production function, \(y\) is output, \(x\) is vector of variable inputs, and \(b\) is an unknown vector of parameters. The production function \(f\left(x;b\right)\) can be either a Cobb-Douglas production function or any other production function that is linear after logarithmic transformation. Once the production function is specified, the next step introduces a probability distribution for the inefficiency score \(u\) and the statistical error term \(v\).

4.1 The likelihood function

The primary task in stochastic frontier modeling is to derive of the probability density function for the composite error term \(\varepsilon =v-u\). Assuming independence between \(v\) and \(u\), their joint distribution is given by

$$f\left(u,v\right)= f\left(u\right)f\left(v\right),$$

where \(f\left(u\right)\) and \(f\left(v\right)\) are the probability density functions of \(u\) and \(v,\) respectively. Next, we substitute one of the random variables from the composite error \(\varepsilon =v-u\), and integrate the remaining variable. Hence,

$$f(\varepsilon )=\int_{0}^{\infty }f\left(u\right)f\left(\varepsilon +u\right)du.$$

For some choices of the probability density functions of \(u\) and \(v\), the above integral can be computed, and we have a closed-form distribution function of \(\varepsilon\).

Once the probability density function of \(\varepsilon\) is derived, then the likelihood function of the SFM for \(N\) observation is,

$$L\left(\theta \mid \varepsilon \right)=\prod\limits_{i=1}^{N}f(\varepsilon \mid \theta ),$$

where \(\varepsilon =v-u=\mathrm{log}\left(y\right)-\mathrm{log}[f\left(x;b\right)].\)

5 Normal-beta exponential SFM

In this paper we will assume that \(v\) follows a normal distribution and \(u\) follows a beta-exponential distribution, therefore, we will call this a “normal-beta exponential SFM”. The probability density function (pdf) of inefficiency score, \(u,\) is given in Eq. (3.1). And the statistical error, \(v,\) is assumed to follow a normal distribution with mean \(0\) and constant variance \({\sigma }^{2}\),

$$f\left(v\right)=\frac{1}{\sqrt{2\pi {\sigma }^{2}}}{e}^{-\frac{{v}^{2}}{{2\sigma }^{2}}},$$
(5.1)

where \({\sigma }^{2}>0\).

Assuming independence between the inefficiency score, \(u,\) and statistical error, \(v\), the joint density function is given by the product of Eqs. (3.1) and (5.1). By substituting \(v=\varepsilon +u\), and integrating the joint density function gives us the marginal density function of the composite error \(\varepsilon\). That is,

$$f\left(\varepsilon \right)=\underset{0}{\overset{\infty }\int}\frac{{e}^{-\alpha \lambda u}(1-{{e}^{-\uplambda u})}^{\beta -1}}{B\left(\alpha ,\beta \right)}\frac{\lambda }{\sqrt{2\pi {\sigma }^{2}}}{e}^{-\frac{{(\varepsilon +u)}^{2}}{{2\sigma }^{2}}}du.$$

Let us relabel \(u\) to \(q\), then the above integral can be simplified as,

$$\begin{aligned}f\left(\varepsilon \right)=&\;\Phi \left(-\frac{(\varepsilon +{\sigma }^{2}\alpha \lambda )}{\sigma }\right)\frac{\lambda {e}^{\varepsilon \alpha \lambda +.5{\sigma }^{2}{\alpha }^{2}{\lambda }^{2}}}{B\left(\alpha ,\beta \right)}\\&\underset{0}{\overset{\infty }{\int }}\frac{(1-{{e}^{-\lambda q})}^{\beta -1}}{\Phi \left(-\frac{(\varepsilon +{\sigma }^{2}\alpha \lambda )}{\sigma }\right)\sqrt{2\pi {\sigma }^{2}}}{e}^{-\frac{{\left(q+\left(\varepsilon +{\sigma }^{2}\alpha \lambda \right)\right)}^{2}}{{2\sigma }^{2}}}dq.\end{aligned}$$

The integral part in the above distribution function can be viewed as the expectation over a right truncated normal random function. Which is,

$${\rm E}\left[{\left(1-{e}^{-\lambda q}\right)}^{\beta -1}\right]=\underset{0}{\overset{\infty }{\int }}(1-{{e}^{-\lambda q})}^{\beta -1}f(q)dq,$$

where \(q\) is a truncated normal random variable and \(f(q)\) is a right truncated normal distribution.

Then the density function of the composite error term \(\varepsilon\) becomes

$$f\left(\varepsilon \right)=\frac{\lambda {e}^{\varepsilon \alpha \lambda +.5{\sigma }^{2}{\alpha }^{2}{\lambda }^{2}}}{B\left(\alpha ,\beta \right)} \Phi \left(\frac{-(\varepsilon +{\sigma }^{2}\alpha \lambda )}{\sigma }\right){\rm E}\left[{\left(1-{e}^{-\lambda q}\right)}^{\beta -1}\right],$$
(5.2)

where \(\alpha >0,\beta >0,\) \(\sigma >0\), \(\lambda >0\), and \(q\sim {N}^{+}(-(\upvarepsilon +{\sigma }^{2}\alpha \lambda ), {\sigma }^{2}).\)

The mean of the composite error \(\varepsilon\) is,

$${\rm E}\left[\varepsilon \right]={\rm E}\left[u\right]=\frac{\psi \left(\alpha \right)-\psi \left(\alpha +\beta \right)}{\lambda }.$$

and the variance is,

$$Var\left[\varepsilon \right]=Var\left[v\right]+Var[u]={\sigma }^{2}+\frac{{\psi }^{\left(1\right)}\left(\alpha \right)-{\psi }^{\left(1\right)}\left(\alpha +\beta \right)}{{\lambda }^{2}}.$$

The log-likelihood function for \(N\) observation is

$$\begin{aligned}\mathrm{log}\,L\left(\alpha ,\beta ,\sigma ,\lambda \mid {\varepsilon }_{i}\right)=&\;\sum_{i=1}^{N}{\varepsilon }_{i}\alpha \lambda +\frac{N{\sigma }^{2}{\alpha }^{2}{\lambda }^{2}}{2}+N\left\{\mathrm{log}\left[\lambda \right]\right\}\\&-N\left\{\mathrm{log} \left[B\left(\alpha ,\beta \right)\right]\right\}\\&+\mathrm{log}\left\{\sum_{i = 1}^{N}\Phi \left[\frac{-\left({\varepsilon }_{i}+{\sigma }^{2}\alpha \lambda \right)}{\sigma }\right]\right\}\\&+\mathrm{log}\left\{\sum_{i = 1}^{N}{\rm E}\left[{\left(1-{e}^{-\lambda {q}_{i}}\right)}^{\beta -1}\right]\right\}.\end{aligned}$$

We can get the log-likelihood function for the special case SFMs by restricting parameter values from a log-likelihood function of a normal-beta exponential stochastic frontier model. If \(\beta =2\) and reparametrizing \(\alpha =\frac{1}{a}\) and \(\lambda =ac\), we get a closed form likelihood function of a normal-weighted exponential SFM, which is

$$\begin{aligned}\mathrm{log}\,L\left(a,c,\sigma ,\lambda \mid {\varepsilon }_{i}\right)=&\;\sum_{i=1}^{N}{\varepsilon }_{i}\lambda +\frac{N{\sigma }^{2}{c}^{2}}{2}+N\left\{\mathrm{log}\left[c\right]\right\}\\&-N\left\{\mathrm{log} \left[a+1\right]\right\}-N\{\mathrm{log}[a]\}\\&+\mathrm{log}\Bigg\{\sum_{i = 1}^{N}\Phi \left[\frac{-\left({\varepsilon }_{i}+{\sigma }^{2}c\right)}{\sigma }\right]\\&\;\;\;\;\;\;\;\;\;\;\;-\Phi \left[\frac{-\left({\varepsilon }_{i}+{\sigma }^{2}c\left(a+1\right)\right)}{\sigma }\right]{e}^{{\varepsilon }_{i}ca+.5{\sigma }^{2}{c}^{2}\left({a}^{2}+2a\right)}\Bigg\}.\end{aligned}$$

6 Maximum simulated likelihood

Since the log-likelihood function of a normal-beta exponential SFM is not in closed form we need to use a simulated log-likelihood function. The integral function within the likelihood function of a normal-beta exponential stochastic frontier model is,

$${\rm E}\left[{\left(1-{e}^{-q\lambda }\right)}^{\beta -1}\right]=\int_{0}^{\infty }\left[{\left(1-{e}^{-\lambda q}\right)}^{\beta -1}\right]f\left(q\right)dq,$$

where \(f\left(q\right)\) is the probability density function of a truncated normal distribution.

The numerical integration part is to approximate the integral function using random number drawings. That is,

$${\rm E}\left[{\left(1-{e}^{-q\lambda }\right)}^{\beta -1}\right]=\frac{1}{R}\left(\sum\limits_{r = 1}^{R}\left[{\left(1-{e}^{-\uplambda {q}_{\mathrm{r}}}\right)}^{\beta -1}\right]\right),$$

where \({q}_{\mathrm{r}}\) is \({r}^{th}\) draw from a truncated normally distributed \(q\) and \(R\) is the maximum number of random draws. Random numbers from the truncated normally distributed random variable \(\mathrm{q}\) are drawn using a Monte Carlo (MC) technique, the inverse transform method. The inverse transform method is that if \(q\) has a cumulative distribution function \(F(q)\) and Z follows a uniform distribution, then \({F}^{-1}(Z)\) follows a truncated normal distribution. Greene (2003) and Geweke et al. (1997) have a broader presentation about the use of the inverse transform methods and sampling from a truncated normal distribution.

The cumulative distribution function of \(q\) is given by,

$$F\left(q\right)=\frac{\Phi \left[\frac{q\mathop{+}\left(\varepsilon \mathop{+}{\sigma }^{2}\alpha \lambda \right)}{\sigma }\right]\mathop{-}\Phi \left(\frac{\varepsilon \mathop{+}{\sigma }^{2}\alpha \lambda }{\sigma }\right)}{\Phi \left(\mathop{-}\frac{\varepsilon \mathop{+}{\sigma }^{2}\alpha \lambda }{\sigma }\right)}.$$

And the inverse of the cumulative distribution function of \(q\) is

$$\begin{aligned}q=&-\left(\varepsilon +{\sigma }^{2}\alpha \lambda \right)+\sigma {\Phi }^{-1}\left[\Phi \left(\frac{\varepsilon +{\sigma }^{2}\alpha \lambda }{\sigma }\right)+\Phi \left(-\frac{\varepsilon +{\sigma }^{2}\alpha \lambda }{\sigma }\right)Z\right].\end{aligned}$$

Uniform random draws can be either from the pseudo-random generator or quasi-random generator. Pseudo-random numbers have the disadvantage of having high discrepancy. Quasi-random points, however, have a low discrepancy, and they are efficient for numerical integration methods. The Halton sequences attempt to place sample points nearly uniformly in space and are an example of a quasi-random number sequence. Therefore, Halton sequences are used to generate points for Monte Carlo simulations of the normal-beta exponential likelihood function.

The simulated likelihood function of the normal-beta exponential SFM is

$$\begin{aligned}\mathrm{log}\,L\left(\alpha ,\beta ,\sigma ,\lambda \mid {\varepsilon }_{i}\right)=&\sum_{i=1}^{N}{\varepsilon }_{i}\alpha \lambda +\frac{N}{2}{\sigma }^{2}{\alpha }^{2}{\lambda }^{2}+N[\mathrm{log}\left(\lambda \right)]\\&-N\left\{\mathrm{log}\left[B\left(\alpha ,\beta \right)\right]\right\}\\&+\mathrm{log}\left[\sum_{i = 1}^{N}\Phi \left(-\frac{{\varepsilon }_{i}+{\sigma }^{2}\alpha \lambda }{\sigma }\right)\right]\\&+\mathrm{log}\left(\sum_{i = 1}^{N}\left\{\frac{1}{R}\sum_{r = 1}^{R}\left[{\left(1-{e}^{-{\lambda q}_{i,\mathrm{r}}}\right)}^{\beta -1}\right]\right\}\right),\end{aligned}$$

where \({q}_{i,r}={\mu }_{i}+\sigma {\Phi }^{-1}\left[\Phi \left[\frac{-{\mu }_{i}}{\sigma }\right]+{Z}_{i,r}\left(1-\Phi \left[\frac{-{\mu }_{i}}{\sigma }\right]\right)\right]\), \(\Phi (*)\) is the standard normal cumulative distribution function, \({\mu }_{i}=-({\varepsilon }_{i}+{\sigma }^{2}\alpha \lambda )\), and \({Z}_{i,r}\) is a Halton sequence for each observation.

7 JLMS inefficiency estimator

Estimating the efficiency level of a firm or an agent is the primary reason for using a SFM. The Jondrow Lovell Materov Schmidt (JLMS) inefficiency estimator, named after the work of Jondrow et al. (1982), is widely used. The JLMS inefficiency estimator corresponds to the mean of the conditional probability density function of the inefficiency score \(u,\) evaluated at maximum simulated likelihood estimator of the SFM. The conditional probability density function of \(u\), \(f\left(u\mid \varepsilon \right)\), is defined as the ratio of the joint probability density of \(u\) and \(\varepsilon\), \(f\left(\varepsilon ,u\right),\) to the marginal probability density of \(\varepsilon\), \(f\left(\varepsilon \right)\), i.e.,

$$f\left(u\mid \varepsilon \right)=\frac{f\left(\varepsilon ,u\right)}{f\left(\varepsilon \right)}.$$

We have the joint probability density function of \(u\) and \(\varepsilon\) as,

$$f\left(\varepsilon ,u\right)=\frac{\lambda }{B\left(\alpha ,\beta \right)}{e}^{-\alpha \lambda u}(1-{{e}^{-\lambda u})}^{\beta -1}\frac{1}{\sqrt{2\pi {\sigma }^{2}}}{e}^{-\frac{{\left(\varepsilon +u\right)}^{2}}{{2\sigma }^{2}}}.$$
(7.1)

The marginal probability distribution of the composite error \(\varepsilon\) is given in Eq. (5.2). And by dividing equation Eq. (7.1) by Eq. (5.2), we get a conditional distribution of the inefficiency score,

$$f\left(u\mid \varepsilon \right)=\frac{(1-{{e}^{-\lambda u})}^{\beta -1}}{{\rm E}[(1-{{e}^{-\lambda q})}^{\beta -1}] }\frac{1}{\Phi \left(-\frac{\left(\varepsilon +{\sigma }^{2}\alpha \lambda \right)}{\sigma }\right)}\frac{{e}^{-\frac{{\left(u+\left(\varepsilon +{\sigma }^{2}\alpha \lambda \right)\right)}^{2}}{2{\sigma }^{2}} }}{\sqrt{2\pi {\sigma }^{2}}} .$$

Relabel \(\widehat{u}\) to \(\widehat{q}\) and the JLMS inefficiency estimator, \(E[\hat{u} \;| \; \hat{\varepsilon}]\), of a normal-beta exponential SFM is,

$$E[\hat{q} \;| \; \hat{\varepsilon}]=\frac{E[\widehat{q}(1-{{e}^{-\widehat{\lambda }\widehat{q}})}^{\widehat{\beta }-1}]}{{\rm E}[(1-{{e}^{-\widehat{\lambda }\widehat{q}})}^{\widehat{\beta }-1}] },$$

where \(\widehat{q}\) is a draw from a truncated normal distribution, and the mean and variance of the normal distribution before truncation is, \(-\left(\widehat{\varepsilon }+\widehat{{\sigma }^{2}}\widehat{\alpha }\widehat{\uplambda }\right)\) and \(\widehat{{\sigma }^{2}}\), respectively.

For a normal-weighted exponential SFM we can have a closed form JLMS inefficiency estimator. From a beta exponential distribution, Eq. (3.1), if we restrict the shape parameter \(\beta =2\) and reparametrizing \(\alpha =\frac{1}{a}\) and \(\lambda =ac\), then we get the conditional distribution of an inefficiency score \(u\) as,

$$f\left(u\mid \varepsilon \right)=\frac{\left(1-{e}^{-cau}\right)}{ \Omega }\frac{{e}^{-\frac{{\left(u-\mu \right)}^{2}}{2{\sigma }^{2}} }}{\sqrt{2\pi {\sigma }^{2}}} ,$$

where \(\Omega =\Phi \left(\frac{\mu }{\sigma }\right)-\)\(\Phi \left(\frac{{\mu }_{*}}{\sigma }\right){e}^{-\varepsilon ca+0.5{\sigma }^{2}{c}^{2}\left({a}^{2}+2a\right)}\), \(\mu = \varepsilon -{\sigma }^{2}c\), and \({\mu }_{*}= \varepsilon -{\sigma }^{2}c\left(a+1\right)\). The above conditional probability density function is a weighted sum of a truncated normal distribution. Thus, the JLMS inefficiency estimator of a normal-weighted exponential SFM is in closed form,

$$\begin{aligned}E[\hat{u} \;| \; \hat{\varepsilon}]=&\; \frac{\Phi \left(\frac{\widehat{\mu }}{\widehat{\sigma }}\right)}{\widehat{\Omega }}\left(\widehat{\mu }+\widehat{\sigma }\phi \left(\frac{\widehat{\mu }}{\widehat{\sigma }}\right)\right)\\&-\left(\widehat{{\mu }_{*}}+\widehat{\sigma }\phi \left(\frac{\widehat{{\mu }_{*}}}{\widehat{\sigma }}\right)\right)\frac{\Phi \left(\frac{\widehat{{\mu }_{*}}}{\widehat{\sigma }}\right)}{\widehat{\Omega }}{e}^{-\widehat{\varepsilon }\widehat{c}\widehat{a}+0.5\widehat{{\sigma }^{2}}{\widehat{c}}^{2}\left(\widehat{{a}^{2}}+2\widehat{a}\right)}.\end{aligned}$$

8 Real data application

In this section, we estimate the normal-beta exponential SFM and its special case models using a manufacturing firm data. A likelihood ratio test is used to test nested models against a normal-beta exponential SFM. Real data application is based on the same functional form of a regression function and data set used in Greene (2003). In Greene (2003), the inefficiency score is assumed to follow a gamma distribution but now we assume the inefficiency score to follow a beta exponential distribution.

The functional form of the SFM is

$$\begin{aligned}\;\mathrm{log}\left(\frac{C}{pf}\right)=&\;{b}_{1}+{b}_{2}\;\mathrm{log}\left(\frac{pl}{pf}\right)\\&+{b}_{3}\;\mathrm{log}\left(\frac{pk}{pf}\right)+{b}_{4}\;\mathrm{log}(Y)\\&+{b}_{5}{\;\mathrm{log}}^{2}(Y)+\varepsilon ,\end{aligned}$$

where \(C\) is total cost of generation, \(pf\) is unit price of fuel, \(pl\) is unit price of labor, \(pk\) is unit price of capital, and \(Y\) is output. The error component \(\varepsilon\) is a composite of the statistical error \(v\), and the inefficiency part \(u\). While the statistical error \(v\), is assumed to follow a normal distribution with zero mean and constant variance, the inefficiency part \(u\) is assumed to follow a beta exponential distribution.

Maximum simulated likelihood estimates of the normal-beta exponential SFM are presented below in Table 1. The parameters listed under the first column are the coefficients of the cost function and parameters of the composite error term. The remaining columns are estimates under various SFMs. Only with a normal-beta exponential SFM, all estimates are statistically significant at one percent level of significance. And using a likelihood ratio test we reject, at one percent level of significance, an exponential distribution, a generalized exponential distribution, a weighted exponential distribution in favor of a beta exponential distribution. Again, using a likelihood ratio test, we reject the exponentiated exponential distribution at five percent level of significance in favor of a beta-exponential distribution. For the beta standard exponential distribution, the estimate of the shape parameter is not statistically significant. Therefore, we can conclude that, a normal-beta exponential SFM fits the data better than special case SFMs.

Table 1 Normal-Beta Exponential Stochastic Frontier Models and Its Special Cases

The main objective of estimating a SFM is to get an estimated efficiency level of each firm. Descriptive statistics of JLMS estimated efficiency levels, for six SFMs, are presented in Table 2 below. Looking at the standard deviation of the estimated efficiency level, we can observe that a normal-beta exponential SFM and a normal-beta standard exponential SFM present the highest variation in the JLMS estimated efficiency levels of each firm. Moreover, it can be shown that the highest estimated efficiency level, which is 98.56%, is obtained under a normal-exponential SFM, while the lowest efficiency estimate is in a normal-beta exponential SFM, which is 75.14% efficiency.

Table 2 Descriptive statistics of JLMS estimated efficiency levels

Figure 2 shows an estimated probability density function of the JLMS estimated efficiency level of all firms, under a normal-beta exponential SFM.

Fig. 2
figure 2

Estimated efficiency level under a normal-beta exponential SFM

9 Conclusion

Beta exponential distribution is introduced as a distribution of inefficiency score in a SFM. A normal-beta exponential SFM is much more flexible compared to other SFMs and it nests many other flexible SFMs. Three of these special case SFMs are known in the literature and they are the normal-exponential SFM of Meeusen and Broeck (1977), the normal-generalized exponential SFM of Papadopoulos (2021), and the normal-beta standard exponential SFM of Tsionas (2012). And the other two new SFMs are a normal-exponentiated exponential SFM and a normal-weighted exponential SFM. We derived a simulated log-likelihood function and a JLMS inefficiency estimator of a normal-beta exponential SFM. In addition, we have also presented a closed form log-likelihood function and a closed form JLMS inefficiency estimator of a normal-weighted exponential SFM. An empirical study of a cost function is conducted using a normal-beta exponential SFM. Using a likelihood ratio test, we have compared a normal-beta exponential SFM with its nested models. The estimated result show that a normal-beta exponential SFM performs better than its special case models. The same cost function and data set used in Greene’s (2003) normal-gamma SFM is used, however a normal-beta exponential SFM estimates the parameters with a lower standard error.