Proportional incremental cost probability functions and their frontiers

Fève, Frédérique; Florens, Jean-Pierre; Simar, Léopold

doi:10.1007/s00181-023-02386-x


Published: 17 March 2023

Volume 64, pages 2721–2756, (2023)

Empirical Economics

Frédérique Fève¹, Jean-Pierre Florens¹ & Léopold Simar¹,² (ORCID: orcid.org/0000-0003-0791-8490)


Abstract

The econometric analysis of cost functions is based on the analysis of the conditional distribution of the cost Y given the level of the outputs $X\in {\mathbb {R}}_+^p$ and given a set of environmental variables $Z\in {\mathbb {R}}^d$. The model basically describes the conditional distribution of Y given $X\ge x$ and $Z=z$. In many applications, the dimension of Z is naturally large and a fully nonparametric specification of the model is limited by the curse of dimensionality. Most approaches so far are based on two-stage estimation, applicable when the frontier level does not depend on the value of Z. But even in the case of separability of the frontier, the estimation procedure suffers from several problems, mainly due to the inherent bias of the estimated efficiency scores and the poor rates of convergence of the frontier estimates. In this paper we suggest an alternative semi-parametric model which avoids the drawbacks of the two-stage methods. It is based on a class of models called Proportional Incremental Cost Functions (PICF), adapted to our setup from the Cox proportional hazard models extensively used in survival analysis for duration models. We define the PICF model, then examine its properties and propose a semi-parametric estimation procedure. With this modeling approach, we avoid the first-stage nonparametric estimation of the frontier and avoid the curse of dimensionality, keeping the parametric $\sqrt{n}$ rates of convergence for the parameters of interest. We are also able to derive a $\sqrt{n}$-consistent estimator of the conditional order-m robust frontiers (which, by contrast to the full frontier, may depend on Z), and we prove the Gaussian asymptotic properties of the resulting estimators. We illustrate the flexibility and the power of the procedure with some simulated examples and some real data sets.


Notes

Endogeneity means that the parameters of interest are not determined by the conditional distribution but by the joint distribution of Y and some variables [see Cazals et al. (2016) or Simar et al. (2016)].
Of course, in practice we use individual bandwidths $h_{j}$ for each component of Z. So, in the notation that follows, $h^{d}$ has to be understood as $\prod _{j=1}^{d} h_{j}$. By doing so, and using product kernels, we are able to detect irrelevant components in the conditioning; see, e.g., Hall et al. (2004) and Li et al. (2013) for details.
This explains some abuse of language in this literature, where the partial frontiers are sometimes considered as robust versions of the full frontier. We try to avoid this confusion.
If a is normalized such that $a(z_0, \beta (x_0))=1$ for some $(z_0,x_0)$, the baseline model represents the cost process for this particular production unit.
In practice, the separability condition is an empirical issue, even if some argue that it may be a reasonable assumption in many situations for economic or technical reasons. In practice this assumption is easy to test, as described in Daraio et al. (2018) and Simar and Wilson (2020). In all the real data examples in Sect. 4.2 below, the test was applied and the separability assumption was not rejected.
We limit our presentation to the case of no ties in the $Y_{i}$ and no censoring, which is mostly the case in our setup of cost efficiency analysis. The marginal likelihood can easily be extended to the case of ties and censored data (where only the minimum between Y and some censoring value is observed). See, e.g., Kalbfleisch and Prentice (1980).
In the simple case where $a(z,\beta (x)) = e^{\beta '(x)z}$, the expression of $\ell (\beta (x))$ simplifies [see equation (4.6) in Kalbfleisch and Prentice (1980)] and explicit expressions for the gradient and the Hessian can be derived.
Similar developments could be done for the conditional order-$\alpha $ frontiers.
As explained in the Appendix, we use the ${\widetilde{S}}$ notation for survivor functions when we condition on $X=x$, to distinguish it from S, where we condition on $X\ge x$.
Since $Q^{-1}(y,x,z)$ is specified, the value of y corresponding to a quantile $u\in [0,1]$ is given by $y=Q(u,x,z)$ and can be found numerically by solving $y=\arg \min _{y} | Q^{-1}(y,x,z) - u|$, which is easy since $Q^{-1}$ is monotone in y.

References

Aragon Y, Daouia A, Thomas-Agnan C (2005) Nonparametric frontier estimation: a conditional quantile-based approach. Econ Theory 21:358–389
Bădin L, Daraio C, Simar L (2012) How to measure the impact of environmental factors in a nonparametric production model. Eur J Oper Res 223:818–833
Cazals C, Florens JP, Simar L (2002) Nonparametric frontier estimation: a robust approach. J Econom 106(1):25
Cazals C, Fève F, Florens JP, Simar L (2016) Nonparametric instrumental variables estimation for efficiency frontier. J Econom 190:349–359
Charnes A, Cooper WW, Rhodes E (1981) Evaluating program and managerial efficiency: an application of data envelopment analysis to program follow through. Manag Sci 27:668–697
Cox DR (1972) Regression models and life tables. J R Stat Soc B 34:187–220
Daouia A, Gijbels I (2011) Robustness and inference in nonparametric partial frontier modeling. J Econom 161:147–165
Daraio C, Simar L (2005) Introducing environmental variables in nonparametric frontier models: a probabilistic approach. J Prod Anal 24(1):93–121
Daouia A, Simar L (2007a) Nonparametric efficiency analysis: a multivariate conditional quantile approach. J Econom 140:375–400
Daouia A, Simar L (2007b) Advanced robust and nonparametric methods in efficiency analysis: methodology and applications. Springer, New York
Daouia A, Florens JP, Simar L (2010) Frontier estimation and extreme values theory. Bernoulli 16(4):1039–1063
Daouia A, Florens JP, Simar L (2012) Regularization of non-parametric frontier estimators. J Econom 168:285–299
Daraio C, Simar L, Wilson PW (2018) Central limit theorems for conditional efficiency measures and tests of the “Separability” condition in nonparametric, two-stage models of production. Econ J 21:170–191
Florens JP, Simar L, Van Keilegom I (2014) Frontier estimation in nonparametric location-scale models. J Econom 178:456–470
Grambsch PM, Therneau TM (1994) Proportional hazards tests and diagnostics based on weighted residuals. Biometrika 81(3):515–526
Hall P, Racine JS, Li Q (2004) Cross-validation and the estimation of conditional probability densities. J Am Stat Assoc 99(468):1015–1026
Härdle WK, Simar L (2019) Applied multivariate statistical analysis, 5th edn. Springer, Switzerland
Jeong SO, Park BU, Simar L (2010) Nonparametric conditional efficiency measures: asymptotic properties. Ann Oper Res 173:105–122
Kalbfleisch JD, Prentice RL (1980) The statistical analysis of failure time data. Wiley, New York
Kneip A, Simar L, Wilson PW (2015) When bias kills the variance: central limit theorems for DEA and FDH efficiency scores. Econom Theory 31:394–422
Li Q, Lin J, Racine JS (2013) Optimal bandwidth selection for nonparametric conditional distribution and quantile functions. J Bus Econ Stat 31(1):57–65
Mammen E (1992) When does bootstrap work? Asymptotic results and simulations. Springer, Berlin
Park B, Simar L, Weiner Ch (2000) The FDH estimator for productivity efficiency scores: asymptotic properties. Econom Theory 16:855–877
Simar L (2003) Detecting outliers in frontiers models: a simple approach. J Prod Anal 20:391–424
Simar L, Wilson PW (2007) Estimation and inference in two-stage, semi-parametric models of production processes. J Econom 136(1):31–64
Simar L, Wilson PW (2011) Two-stage DEA: caveat emptor. J Prod Anal 36:205–218
Simar L, Wilson PW (2020) Hypothesis testing in nonparametric models of production using multiple sample splits. J Prod Anal 53:287–303
Simar L, Vanhems A, Van Keilegom I (2016) Unobserved heterogeneity and endogeneity in nonparametric frontier estimation. J Econom 190:360–373
Tibshirani R (1997) The Lasso method for variable selection in the Cox model. Stat Med 16:385–395
Tsiatis AA (1981) A large sample study of Cox’s regression model. Ann Stat 9(1):93–108
Wilson PW (1993) Detecting outliers in deterministic nonparametric frontier models with multiple outputs. J Bus Econ Stat 11:319–323


Author information

Authors and Affiliations

Toulouse School of Economics (TSE), Toulouse, France
Frédérique Fève, Jean-Pierre Florens & Léopold Simar
Institut de Statistique, Biostatistique et Sciences Actuarielles (ISBA), LIDAM, UCLouvain, Louvain-la-Neuve, Belgium
Léopold Simar


Corresponding author

Correspondence to Léopold Simar.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

F. Fève and J.P. Florens acknowledge funding from the French National Research Agency (ANR) under the Investments for the Future (Investissement d’Avenir), Grant ANR-17-EURE-0010.

A Appendix: Simulation of data

Simulating a data set $\{(X_{i}, Y_{i},Z_{i})\}_{i=1}^{n}$ should be done with care. Usually, researchers specify a model for the frontier function $\varphi _{0}(x)$, then select a model to simulate values of $X_{i}$ and $Z_{i}$, and finally generate $Y_{i}$ for $X=X_{i}$ and $Z=Z_{i}$. Here we have to generate the sample according to our PICF model, which specifies that the survival function is $S(y | X\ge x, Z=z) = \left[ S_{0}(y|X\ge x)\right] ^{a(z,\beta (x))}$ for some baseline survival function $S_{0}(y|X\ge x)$ and some given functions $a(\cdot ,\cdot )$ and $\beta (\cdot )$. So we need to recover from our model the conditional distribution of Y given $X=x$ and $Z=z$ derived from the PICF. We denote by ${\widetilde{S}}(y |X=x, Z=z)$ this conditional survival function, where we use the ${\widetilde{S}}$ notation when we condition on $X=x$, to distinguish it from $S(y| X\ge x, Z=z)$ defined above, where we condition on $X\ge x$. Consider for instance the corresponding quantile function

$$\begin{aligned} y = Q(u,x,z) = {\widetilde{S}}^{-1}(u |X=x,Z=z), \end{aligned}$$

(A.1)

where Q is monotone decreasing in u. We know that $U=Q^{-1}(Y,x,z)$ is uniform on [0, 1] and independent of X and Z, so an easy way to simulate Y given $X=x$ and $Z=z$ is to generate $U_{i}$ as uniform on [0, 1] and then define $Y_{i}=Q(U_{i},X_{i},Z_{i})$.
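The inverse-transform step described here can be sketched in Python. This is only an illustration: the helper names `invert_survival` and `simulate_y`, the bracketing interval, and the toy survival function `toy_surv` (a frontier $\varphi(x)=x$ with exponential excess cost of rate $e^{z}$) are assumptions made for the example and do not come from the paper.

```python
import math
import random

def invert_survival(surv, u, x, z, lo, hi, tol=1e-10):
    """Solve surv(y, x, z) = u for y by bisection.

    Assumes surv is non-increasing in y and that the solution lies in [lo, hi].
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if surv(mid, x, z) > u:
            lo = mid      # survival still above u: the quantile is to the right
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def toy_surv(y, x, z):
    # Hypothetical conditional survival of Y given X = x, Z = z (not the PICF model):
    # cost never falls below the frontier y = x, excess cost is Exp(rate = exp(z)).
    return 1.0 if y <= x else math.exp(-math.exp(z) * (y - x))

def simulate_y(surv, xs, zs, rng, y_max=1e3):
    """Draw U_i uniform on [0, 1) and return Y_i = Q(U_i, X_i, Z_i)."""
    ys = []
    for x, z in zip(xs, zs):
        u = rng.random()
        ys.append(invert_survival(surv, u, x, z, lo=0.0, hi=y_max))
    return ys

rng = random.Random(1)
xs = [0.5, 1.0, 2.0]
zs = [0.0, -0.5, 0.5]
ys = simulate_y(toy_surv, xs, zs, rng)
```

For this toy survival function the quantile has the closed form $y = x - \log (u)/e^{z}$, which gives a simple check of the bisection routine.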

The general form of Q(u, x, z) can be obtained as follows in order to satisfy the PICF model. Some simple algebra leads to the equation

$$\begin{aligned} \text {Prob}(Y \ge y \mid X\ge x, Z=z) =\frac{\int _{x}^{\infty } Q^{-1}(y,t,z) f_{X}(t | z) \textrm{d}t}{S_{X}(x|z)}. \end{aligned}$$

(A.2)
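For the reader's convenience, the algebra behind (A.2) can be written out as follows. This is a reconstruction from the definitions above (it uses only the fact that $Q^{-1}(y,t,z)$ is the survival function of Y given $X=t$ and $Z=z$), not a quotation of the authors' derivation:

$$\begin{aligned} \text {Prob}(Y \ge y \mid X\ge x, Z=z)&= \frac{\text {Prob}(Y \ge y, X\ge x \mid Z=z)}{\text {Prob}(X\ge x \mid Z=z)} \nonumber \\&= \frac{\int _{x}^{\infty } {\widetilde{S}}(y|X=t,Z=z) f_{X}(t | z) \textrm{d}t}{S_{X}(x|z)}, \end{aligned}$$

and replacing ${\widetilde{S}}(y|X=t,Z=z)$ by $Q^{-1}(y,t,z)$ gives the stated expression.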

So, for the PICF model, the function Q must satisfy

$$\begin{aligned} \int _{x}^{\infty } Q^{-1}(y,t,z) f_{X}(t | z) \textrm{d}t = S_{X}(x|z) \left[ S_{0}(y|X\ge x)\right] ^{a(z,\beta (x))}. \end{aligned}$$

(A.3)

Taking the derivative with respect to x (with some abuse of notation below, for $x\in {\mathbb {R}}^{p}$ the derivative $\partial _{x}^{p}$ has to be understood as $\partial ^{p}/(\partial x_{1}\ldots \partial x_{p})$) and equating both sides, we obtain

$$\begin{aligned} -Q^{-1}(y,x,z) f_{X}(x| z) ={}& - f_{X}(x| z) \left[ S_{0}(y|X\ge x)\right] ^{a(z,\beta (x))} \nonumber \\ & + S_{X}(x|z)\, \partial _{x}^{p} \left\{ \big [S_{0}(y|X\ge x)\big ]^{a(z,\beta (x))} \right\} . \end{aligned}$$

(A.4)

After some tedious but simple mathematical developments, this leads to the equation

$$\begin{aligned} Q^{-1}(y,x,z)&= {\widetilde{S}}(y|X=x,Z=z) \nonumber \\&=\left[ S_{0}(y|X\ge x)\right] ^{a(z,\beta (x))} - \frac{S_{X}(x| z)}{f_{X}(x| z)}\, \partial _{x}^{p} \left\{ \big [S_{0}(y|X\ge x)\big ]^{a(z,\beta (x))} \right\} , \end{aligned}$$

(A.5)

which allows us to define (at least numerically) its inverse Q(u, x, z) for any (u, x, z). The expression simplifies greatly if we introduce additional assumptions in the model we want to simulate.

Indeed, if we assume that the joint conditional survival function satisfies the Cox model, i.e. $S_{XY}(x,y | z)= \left[ S_{0}(x,y)\right] ^{a(z,\beta (x))}$ where $S_{0}(x,y) = S_{0}(y|X\ge x) S_{0}(x)$, we have

$$\begin{aligned} S_{0}(y|X\ge x) = \frac{\int _{x}^{\infty } {\widetilde{S}}_{0}(y|t) f_{0}(t) \textrm{d}t}{S_{0}(x)}, \end{aligned}$$

(A.6)

where again ${\widetilde{S}}_{0}(y|x)$ is ${\widetilde{S}}_{0}(y|X=x)$, the counterpart of the baseline survivor $S_{0}(y|X\ge x)$ when conditioning on $X=x$. We also have $S_{X}(x|z)= (S_{0}(x))^{a(z,\beta (x))}$. Therefore, Eq. (A.3) simplifies into

$$\begin{aligned} \int _{x}^{\infty } Q^{-1}(y,t,z) f_{X}(t | z) \textrm{d}t = \left[ \int _{x}^{\infty } {\widetilde{S}}_{0}(y|t) f_{0}(t) \textrm{d}t \right] ^{a(z,\beta (x))}. \end{aligned}$$

(A.7)

In addition, if we assume that $\beta (x)=\beta $, the derivative of both sides of (A.7) with respect to x simplifies. Note also that $f_{X}(x| z)= a(z,\beta ) f_{0}(x) (S_{0}(x))^{a(z,\beta )-1}$. After some simplifications, this leads to the equation

$$\begin{aligned} Q^{-1}(y,x,z) =&{\widetilde{S}}(y|X=x,Z=z) \nonumber \\ =&\left[ S_{0}(y|X\ge x)\right] ^{a(z,\beta )-1} \widetilde{S}_{0}(y|X=x). \end{aligned}$$

(A.8)
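The simplifications leading to (A.8) can be spelled out as follows for $p=1$, writing $a$ for $a(z,\beta )$. This is a reconstruction from (A.6), (A.7) and the expression of $f_{X}(x|z)$ given above, offered as a check rather than as the authors' own derivation. Differentiating (A.7) with respect to x gives the first line; substituting $f_{X}(x|z)= a f_{0}(x) (S_{0}(x))^{a-1}$ on the left and $\int _{x}^{\infty } {\widetilde{S}}_{0}(y|t) f_{0}(t) \textrm{d}t = S_{0}(y|X\ge x) S_{0}(x)$ from (A.6) on the right gives the second; cancelling the common factor $a f_{0}(x) (S_{0}(x))^{a-1}$ gives the third:

$$\begin{aligned} -Q^{-1}(y,x,z) f_{X}(x| z)&= -a \left[ \int _{x}^{\infty } {\widetilde{S}}_{0}(y|t) f_{0}(t) \textrm{d}t \right] ^{a-1} {\widetilde{S}}_{0}(y|x) f_{0}(x), \nonumber \\ Q^{-1}(y,x,z)\, a f_{0}(x) (S_{0}(x))^{a-1}&= a \left[ S_{0}(y|X\ge x) S_{0}(x)\right] ^{a-1} {\widetilde{S}}_{0}(y|x) f_{0}(x), \nonumber \\ Q^{-1}(y,x,z)&= \left[ S_{0}(y|X\ge x)\right] ^{a-1} {\widetilde{S}}_{0}(y|X=x). \end{aligned}$$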

So given the function $a(z,\beta )$, the survival $\widetilde{S}_{0}(y|X=x)$ and the baseline density of X, $f_{0}(x)$, we can compute $S_{0}(y|X\ge x)$ by (A.6), and then the conditional survival ${\widetilde{S}}(y|X=x,Z=z)$. By inverting (A.8), we have the quantile function $y=Q(u,x,z)$ for any u (at least numerically) and then we can simulate a value $Y_{i}$, for a given $(X_{i},Z_{i})$ according to the PICF model.
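The recipe above can be illustrated with a minimal Python sketch for $p=d=1$. All concrete choices are assumptions made for the example, not taken from the paper: a standard exponential baseline density $f_{0}(x)=e^{-x}$, a baseline survival ${\widetilde{S}}_{0}(y|X=t)=e^{-(y-t)}$ for $y\ge t$ (frontier $\varphi _{0}(t)=t$), the link $a(z,\beta )=e^{\beta z}$, Z uniform on $[-1,1]$, a truncated trapezoidal rule for the integral in (A.6), and bisection for the inversion of (A.8). The function names are hypothetical.

```python
import math
import random

def f0(t):
    # Assumed baseline density of X: standard exponential.
    return math.exp(-t)

def S0_x(x):
    # Baseline survival of X matching f0.
    return math.exp(-x)

def S0_tilde(y, t):
    # Assumed baseline survival of Y given X = t (frontier at y = t).
    return 1.0 if y <= t else math.exp(-(y - t))

def a(z, beta):
    # Assumed exponential link a(z, beta) = exp(beta * z).
    return math.exp(beta * z)

def S0_cond(y, x, width=40.0, n=2000):
    """S_0(y | X >= x) from (A.6), trapezoidal rule on [x, x + width]."""
    h = width / n
    total = 0.5 * (S0_tilde(y, x) * f0(x)
                   + S0_tilde(y, x + width) * f0(x + width))
    for k in range(1, n):
        t = x + k * h
        total += S0_tilde(y, t) * f0(t)
    return total * h / S0_x(x)

def S_tilde(y, x, z, beta):
    """Survival of Y given X = x, Z = z, as in (A.8)."""
    return S0_cond(y, x) ** (a(z, beta) - 1.0) * S0_tilde(y, x)

def Q(u, x, z, beta, iters=50):
    """Quantile y = Q(u, x, z): solve S_tilde(y, x, z) = u by bisection."""
    lo, hi = x, x + 60.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if S_tilde(mid, x, z, beta) > u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def simulate(n, beta, seed=0):
    """Draw (X_i, Y_i, Z_i); X | Z = z is Exp(a) since S_X(x|z) = S_0(x)^a."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        z = rng.uniform(-1.0, 1.0)
        x = rng.expovariate(a(z, beta))
        y = Q(rng.random(), x, z, beta)
        sample.append((x, y, z))
    return sample

data = simulate(5, beta=0.5)
```

With these particular choices, (A.6) has the closed form $S_{0}(y|X\ge x) = (1+y-x)e^{-(y-x)}$ for $y\ge x$, which can be used to check the numerical integration. Other baseline specifications would need their own check that the resulting ${\widetilde{S}}$ is a valid (non-increasing) survival function.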

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Fève, F., Florens, JP. & Simar, L. Proportional incremental cost probability functions and their frontiers. Empir Econ 64, 2721–2756 (2023). https://doi.org/10.1007/s00181-023-02386-x


Received: 19 May 2022
Accepted: 02 February 2023
Published: 17 March 2023
Issue Date: June 2023
DOI: https://doi.org/10.1007/s00181-023-02386-x

