1 Introduction

The overarching targets of the 17 Sustainable Development Goals of the 2030 Agenda for Sustainable Development (United Nations 2015) are to eradicate all forms of poverty, boost shared prosperity, build resilient societies and promote sustainable use of natural resources. With world population projected to reach 8.6 billion by 2030 and 9.8 billion by 2050 (IFPRI 2017), concerns about food security, economic growth and environmental sustainability have come to the fore of discussions and have strengthened calls for global food and environmental security.

With fresh water touching nearly every aspect of development, and environmental challenges like water scarcity threatening the global community, a common objective of many development projects is to promote water security. With expanding population and increasing incomes, the growing demand for food will intensify the use of water in agriculture. At the same time, the supply of water is becoming more unpredictable due to climate change, and competition among users in non-agricultural sectors is rising. Energy and sanitation needs in expanding cities and rising food demand will exacerbate global water stress, which can hinder economic growth and shared prosperity.

With the agricultural sector being the largest user of freshwater, sustainable agricultural policies meeting the growing demand for food by avoiding depletion of water should be at the core of any discussions about how water is used in agriculture in terms of sustainability, productivity and equity (Molden and Oweis 2007). Development projects related to sustainable agricultural water management practices have shown that shifting to more productive, water-saving technologies is the cornerstone to achieving effective use of agricultural water (IFPRI 2017; World Bank 2017; United Nations 2015; FAO 2012).

Funding projects can help countries enhance agricultural water management through the improvement of water infrastructures, changes in the policy environment, and the provision of technological innovations and training such as improved irrigation technologies along with on-farm extension services. Irrigation projects can promote development in two ways: (i) to foster output growth through sustainable patterns of production such as improved irrigation technologies, and (ii) to directly support farmers through the provision of extension services that aim to enhance managerial ability and therefore technical efficiency.

The interest in promoting food and water security through development projects has led to the need for tools that can evaluate the impact of these projects and ensure that they reach the most vulnerable and enhance the quality of life for the members of the community (Gertler 2011). Most impact evaluation studies are based on randomized or experimental designs where a control group or counterfactual is compared with a group of beneficiaries. The ideal control group shares the same characteristics as the beneficiaries, with the difference that the beneficiaries participated in the intervention while the control group did not. After collecting and analyzing data, researchers can proceed with impact evaluations of development projects and provide empirical documentation on the performance of a program and on the attainment of sustainable production practices and efficient use of water. However, one of the challenges with this framework is the construction of a credible counterfactual (Duflo et al. 2007).

For the randomized control trial framework, baseline data before the implementation of the program and data for the same individuals toward the end of the program are needed. However, there are cases where only cross-sectional endline data are available for both the control and the treatment groups. In this case, the analysis based on panel data sets of randomized designs cannot be followed. However, by ensuring randomization, impact evaluation analysis based on cross-sectional data can provide unbiased measures of impact on the performance of those who received the intervention.

There are a few applications where production frontier models have been used as the methodological framework to assess the impact of development projects that are designed to promote output growth in a sustainable way while improving farmers’ managerial abilities. Given the link between performance and technical efficiency (Mundlak 1961), production frontier frameworks have been used broadly in recent decades to measure productivity and technical efficiency in a wide range of economic activities, including agriculture, educational organizations, hospitals and banking. However, a weakness of most of these studies is that selectivity bias, which can be an important issue in development projects, is not taken into consideration. Recent literature has shown that stochastic production frontier models can be adapted to accommodate selectivity bias when the impact of development projects is evaluated (Bravo-Ureta 2014).

The contribution of this study is to add to the growing impact evaluation literature by comparing farm-level technical efficiency scores across adopters and non-adopters using stochastic production frontier models, after addressing possible self-selectivity issues. When the decision to adopt a new agricultural technology or participate in a development program is based on farmers’ own perceptions about the effectiveness and profitability of this decision, self-selectivity can affect the impact evaluation analysis. Our empirical application uses a set of 56 greenhouse vegetable producers in Crete (Greece) for the cropping years from 2009–2010 to 2012–2013. Integrating corrections for self-selectivity bias into the stochastic production frontier framework, and applying it to a data set that explores the impact of interventions related to new irrigation technologies on the allocation of scarce resources, offers an opportunity to measure farm-level economic performance accurately. This study can help policymakers understand how different technological adoption strategies can affect the performance of farmers who received the intervention (a new irrigation technology in this study) and how effective this adoption can be.

The remainder of this study is organized into five sections. The second section presents the literature review, while the third section introduces the conceptual framework followed by a description of the data employed in the empirical models. The estimation results of the empirical models are presented and discussed in section 5, while the last section offers a set of concluding remarks and policy implications.

2 Literature Review

Modern microeconomic theory tends to treat firms as efficient operators minimizing the cost of inputs used to produce a given level of output. In reality, however, producers might not always behave optimally. The measure of discrepancy between what is optimal and what is observed is attributed to inefficiency. Many empirical studies find the presence of inefficiency across many sectors, despite the fact that economic theory is based on the rationale that competition drives inefficient firms out of business. The presence of inefficiency and its intensity in a sector can be explained by a set of exogenous variables that is outside the firm’s control. In the stochastic frontier literature, efficiency is measured by scores that are assumed to follow a pre-truncated distribution with a specific mean and variance.

The impact evaluation literature on agricultural projects has progressed to investigate the factors that can capture and explain differences in the performance, and therefore technical efficiency, of adopters of new technologies. Dinar et al. (2007) is an early contribution that combines the stochastic production frontier framework with the impact evaluation literature. That paper evaluated the impact of extension services on the performance of farmers, but it did not address the selectivity bias issue, which can lead to estimated efficiency scores that are biased and inconsistent. After Dinar et al. (2007), Molina et al. (2008) and Mariano et al. (2012) compared the performance of farmers under different technological regimes using stochastic frontier models, but without considering corrections for selectivity bias.

In the literature (Table 1), the two-step Heckman (1979) sample selection model has been mainly applied to control for self-selection in stochastic production frontier studies. For example, Sipiläinen and Lansink (2005) analyzed the efficiency levels for organic and conventional farming using the Heckman sample selection model, while Bradford et al. (2001) analyzed the impact of the cost of new technologies related to health care using data from a large hospital. Solís et al. (2007) evaluated technical efficiency scores for hillside farmers in Central America using a switching regression model to correct for selectivity bias. While the studies of Kaparakis et al. (1994), in an analysis of commercial banks, and Collins and Harris (2005), in a study of UK chemical plants, refer to potential sample selectivity bias, they did not correct for it.

In addition, Kumbhakar et al. (2009) applied a modified one-step Heckman sample selection process to correct for selectivity bias using simulated likelihood estimation of the stochastic production frontier. However, this model involves high-dimensional integration for the evaluation of the likelihood function. Recently, Lai and Tsay (2016) proposed an extension of the Heckman (1979) sample selection model that accounts for fixed and random effects when a panel sample selection model needs to be estimated. Mayen et al. (2010) used propensity score matching techniques to address self-selection when comparing organic and conventional farmers.

While Heckman’s model has been used to a great extent to correct selectivity bias in the stochastic frontier framework, Greene (2010) states that Heckman’s approach “is not appropriate for nonlinear models such as stochastic frontier model” (p. 15). Bravo-Ureta et al. (2012) evaluated the impact of agricultural extension on the performance of farmers in Honduras while accounting for sample selectivity bias. They used propensity score matching techniques to capture biases stemming from observed variables (e.g. farm size, agroecological conditions), and the selectivity correction model of Greene (2010) to correct for biases from unobserved variables, such as farmers’ managerial abilities.

González-Flores et al. (2014) also implemented the method introduced by Greene (2010) using data for small-scale potato producers from Ecuador to explore the impact of a project on farmers’ efficiency. Although they found no evidence of selectivity bias in the model, beneficiaries exhibit higher yields and lower technical efficiency scores compared to control farmers. Finally, Villano et al. (2015) adopted the framework introduced by Greene (2010) to measure the impact of modern rice technologies in the Philippines, controlling for self-selectivity biases. Using cross-sectional data on 3,164 farmers, they found that the adoption of certified rice seeds has a significantly positive impact on farmers’ efficiency levels.

Table 1 Related literature

We assess the impact of adopting an alternative irrigation technology by addressing the sample selectivity issue within the stochastic production frontier model, using the one-stage Greene (2010) framework and the three-step endogenous switching regression model. Our empirical application uses data covering the agricultural sector in Crete (Greece) for the cropping period from 2009–2010 to 2012–2013. The sample consists of 56 vegetable greenhouse producers and provides detailed information on production patterns, input use (including water usage), gross revenue, farmers’ and environmental characteristics, and structural factors such as subsidies and on-farm extension visits. This study is one of the first to use a stochastic production frontier framework corrected for selectivity bias to assess the impact of adopting a new irrigation technology.

3 Conceptual Framework

To evaluate the impact of adopting a new irrigation technology on the technical efficiency of farmers, we apply the stochastic frontier model of Aigner et al. (1977) corrected for sample selection bias. Stochastic production frontier models have been used extensively to measure farm-level efficiency. However, most of the studies using stochastic production frontier models have failed to correct for self-selection biases when comparison of technical efficiency scores between adopters and non-adopters is conducted. Possible selection bias can be taken into account by applying the three-step endogenous switching regression model or the one-step Greene (2010) approach. Consistent with the stochastic frontier production framework, there is a correlation between the unobserved variables (e.g. related to the abilities of manager) in the selection equation with the noise in the stochastic frontier model.

We apply a three-stage switching regression model to capture heterogeneity in farmers’ decisions on whether to adopt overhead sprinkler irrigation. Endogenous switching regressions have been used in the literature to capture heterogeneity related to technology adoption (Solís et al. 2007; Falco et al. 2011; Wollni and Brümmer 2012). The three-stage endogenous switching regression uses a Probit model in the first stage to determine the factors affecting the decision to adopt the new irrigation technology, while in the second stage, separate regression equations are used to model the production behavior of adopters and non-adopters. Finally, in the third stage, separate stochastic production functions are estimated using a variable from the second stage that takes the selection bias into account (W). The coefficient of the W variable in the third stage indicates whether selectivity bias is present in the model.

The first stage probit model can be expressed as:

$$\begin{aligned} d_{i}=I[\alpha ^{'}*z_{i}+w_{i}>0] \end{aligned}$$
(1)

where

I(.): indicator function denoting the irrigation technology

\(z_{i}\): vector of factors that affect the farm’s choice of production technology

\(\alpha\): parameters to be estimated

\(d_{i}=1\) (if greenhouse producers adopt the new irrigation technology), 0 (otherwise) and \(w_{i}\sim\)N[0,1]. Typically, the adoption decision is a binary variable taking the value of 1 for adopters, and the value of zero for non-adopters. For the estimation of the sample selection model, the Probit model framework is used (Feder et al. 1985; Foltz 2003; Kim et al. 2005). In the case that the adoption decision process occurs sequentially and multiple technologies are adopted, count regression models can be applied for the estimation of the model (Ramírez and Shultz 2000).
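As a minimal illustration of Eq. 1, the sketch below evaluates the adoption probability \(\Phi (\alpha ^{'}z_{i})\) implied by \(w_{i}\sim\)N[0,1] and simulates the binary outcome \(d_{i}\). The coefficient values, the two explanatory variables and all function names are hypothetical and are not estimates from this study.

```python
import math
import random


def normal_cdf(x: float) -> float:
    """Standard normal CDF, Phi(x), computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def adoption_probability(alpha: list[float], z: list[float]) -> float:
    """P(d_i = 1) = Phi(alpha' z_i), which follows from w_i ~ N[0, 1]."""
    index = sum(a * zj for a, zj in zip(alpha, z))
    return normal_cdf(index)


def simulate_adoption(alpha: list[float], z: list[float], rng: random.Random) -> int:
    """Draw d_i = 1[alpha' z_i + w_i > 0] with w_i ~ N[0, 1] (Eq. 1)."""
    index = sum(a * zj for a, zj in zip(alpha, z))
    return 1 if index + rng.gauss(0.0, 1.0) > 0 else 0


# Hypothetical coefficients: intercept, extension visits, subsidies.
alpha = [-0.5, 0.2, 0.1]
# Hypothetical farm: the leading 1.0 multiplies the intercept.
z_farm = [1.0, 3.0, 2.0]

p = adoption_probability(alpha, z_farm)

# The simulated adoption share should be close to p in a large sample.
rng = random.Random(0)
draws = [simulate_adoption(alpha, z_farm, rng) for _ in range(20000)]
share = sum(draws) / len(draws)
print(f"P(adopt) = {p:.4f}, simulated share = {share:.4f}")
```

In an actual application the vector \(\alpha\) would be obtained by maximum likelihood rather than fixed by hand.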

The second stage production function models are specified as:

$$\begin{aligned}&y_{1i}=\beta _{1}^{'}*X_{1ki}+\epsilon _{1i}\nonumber \\&\quad \text {if}\ d_{i}=1{:}\ \text {adopt new irrigation technology} \end{aligned}$$
(2)
$$\begin{aligned}&y_{2i}=\beta _{2}^{'}*X_{2ki}+\epsilon _{2i}\nonumber \\&\quad \text {if}\ d_{i}=0{:}\ \text {not adopt new irrigation technology} \end{aligned}$$
(3)

The first and second stages of the switching regression model are performed simultaneously, while the estimated selection bias variable W from the second stage is added to the stochastic production function as follows:

$$\begin{aligned}&y_{1i}=\beta _{1}^{'}*X_{1ki}+\theta *W_{1i}+v_{1i}-u_{1i}\nonumber \\&\quad \text {if}\ d_{i}=1{:}\ \text {adopt new irrigation technology} \end{aligned}$$
(4)
$$\begin{aligned}&y_{2i}=\beta _{2}^{'}*X_{2ki}+\theta *W_{2i}+v_{2i}-u_{2i}\nonumber \\&\quad \text {if}\ d_{i}=0{:}\ \text {not adopt new irrigation technology} \end{aligned}$$
(5)
$$\begin{aligned}&v_{ji}\sim N[0,\sigma _{jv}^2], \forall j=1,2 \end{aligned}$$
(6)
$$\begin{aligned}&u_{ji}\sim N[\mu _{ji},\sigma _{jiu}^2], \forall j=1,2 \end{aligned}$$
(7)

where

\(y_{i}\): the natural logarithm of the observed output produced by \(i^{th}\) farm,

\(X_{ki}\): the natural logarithm of the \(k^{th}\) input used by the \(i^{th}\) farm; including land, labor, seeds, water and intermediate inputs for \(k=1,...,5\)

\(W_{1i}\): variable that captures selectivity bias,

\(\beta , \theta\): unknown parameters to be estimated,

\(v_{i}\): a two-sided, symmetric and normally distributed error term with \(E(v_{i})=0\) and \(Var(v_{i})=\sigma _{v_{i}}^2\) representing those factors that cannot be controlled by the farmers (such as weather conditions, conflicts in labor market etc) or omitted explanatory variables,

\(u_{i}\): a one-sided, non-negative error term with \(E(u_{i})=\mu _{i}\) and \(Var(u_{i})=\sigma _{u_{i}}^2\) associated with the shortfall of the produced output from the production frontier due to technical inefficiency, which can be explained by a set of managerial variables. The one-sided disturbance reflects the fact that each farm’s production should lie on or below the production frontier and represents factors under the farmer’s control. Finally, it is assumed that \(v_{i}\) and \(u_{i}\) are distributed independently of each other and of the variables in the specification of the production frontier.
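The third-stage equations (Eqs. 4 and 5) can be sketched as follows. The text does not spell out how the selection variable W is constructed, so the sketch assumes the inverse Mills ratio that is common in Heckman-type corrections; this is an assumption, not a statement of the procedure used here. The mapping \(TE=\exp (-u)\) is the standard technical efficiency measure when output is expressed in logarithms. All function names are hypothetical.

```python
import math


def normal_pdf(x: float) -> float:
    """Standard normal density, phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x: float) -> float:
    """Standard normal CDF, Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def selection_term(index: float, adopted: bool) -> float:
    """Assumed form of W: the inverse Mills ratio at the probit index.

    Adopters (d = 1):      phi(index) / Phi(index)
    Non-adopters (d = 0): -phi(index) / (1 - Phi(index))
    """
    if adopted:
        return normal_pdf(index) / normal_cdf(index)
    return -normal_pdf(index) / (1.0 - normal_cdf(index))


def log_frontier(beta: list[float], log_inputs: list[float],
                 theta: float, w: float) -> float:
    """Deterministic part of Eqs. 4-5: beta' X + theta * W."""
    return sum(b * x for b, x in zip(beta, log_inputs)) + theta * w


def technical_efficiency(u: float) -> float:
    """TE = exp(-u): observed output relative to frontier output."""
    if u < 0:
        raise ValueError("u is a non-negative inefficiency term")
    return math.exp(-u)


w_adopter = selection_term(0.0, adopted=True)       # phi(0) / 0.5
w_nonadopter = selection_term(0.0, adopted=False)   # mirror image at index 0
print(round(w_adopter, 4), round(w_nonadopter, 4))
print(round(technical_efficiency(0.2), 4))          # a farm 0.2 log points below the frontier
```

A value of \(\theta\) that is statistically different from zero would then signal selectivity bias, as described above.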

Greene (2010) proposed an internally consistent one-step approach for selectivity corrected stochastic production frontier model that is consistent with the non-linear nature of the stochastic frontier specification. The stochastic production frontier model in the sample selection framework can be expressed with the following five equations:

$$\begin{aligned} d_{i}&=1[\alpha ^{'}*z_{i}+w_{i}>0] \end{aligned}$$
(8)
$$\begin{aligned} y_{i}&=\beta ^{'}*X_{ki}+v_{i}-u_{i} \end{aligned}$$
(9)
$$\begin{aligned} u_{i}&=|{\sigma _{u}U_{i}}| \end{aligned}$$
(10)
$$\begin{aligned} v_{i}&=\sigma _{v}V_{i} \end{aligned}$$
(11)
$$\begin{aligned} (w_{i},v_{i})&\sim N_{2}[(0,1),(1,\rho \sigma _{v},\sigma _{v}^{2})] \end{aligned}$$
(12)

where \(w_{i}\sim\)N[0,1], \(U_{i}\sim\)N[0,1], and \(V_{i}\sim\)N[0,1].

The parameter \(\rho\) in Eq. 12 captures the correlation between the error term w in the sample selection equation (Eq. 8) and the error term v in the stochastic production frontier model (Eq. 9). If we find that the unobservables in the selection equation are correlated with the noise in the stochastic frontier model, which implies that \(\rho\) is statistically different from zero, then the existence of sample selection bias can be verified for the model.
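To make the role of \(\rho\) concrete, the following simulation sketch draws errors with the structure of Eqs. 10 to 12 and shows that, when \(\rho \ne 0\), the noise term v no longer has zero mean among the selected observations. The parameter values, the selection rule \(w_{i}>0\) (a probit index of zero) and the function names are illustrative assumptions only.

```python
import math
import random


def draw_errors(rho: float, sigma_v: float, sigma_u: float,
                rng: random.Random) -> tuple[float, float, float]:
    """One draw of (w, v, u) with the structure of Eqs. 10-12."""
    w = rng.gauss(0.0, 1.0)
    e = rng.gauss(0.0, 1.0)
    # v = sigma_v * V with V ~ N[0, 1] and corr(w, V) = rho
    v = sigma_v * (rho * w + math.sqrt(1.0 - rho ** 2) * e)
    # u = |sigma_u * U| with U ~ N[0, 1] (half-normal inefficiency)
    u = abs(sigma_u * rng.gauss(0.0, 1.0))
    return w, v, u


def sample_correlation(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rho, sigma_v, sigma_u = 0.6, 0.5, 0.3   # hypothetical values
rng = random.Random(42)
sample = [draw_errors(rho, sigma_v, sigma_u, rng) for _ in range(50000)]
ws = [s[0] for s in sample]
vs = [s[1] for s in sample]
us = [s[2] for s in sample]

corr_wv = sample_correlation(ws, vs)

# Keep only "selected" observations (w > 0, i.e. a probit index of zero).
selected_v = [v for w, v in zip(ws, vs) if w > 0]
mean_v_selected = sum(selected_v) / len(selected_v)

# Theoretical value: rho * sigma_v * E[w | w > 0] = rho * sigma_v * sqrt(2 / pi)
expected_shift = rho * sigma_v * math.sqrt(2.0 / math.pi)
print(f"corr(w, v) = {corr_wv:.3f}")
print(f"mean of v among selected = {mean_v_selected:.3f} (theory {expected_shift:.3f})")
```

Setting `rho = 0.0` in the sketch brings the mean of v among the selected observations back to approximately zero, which is the case in which no selectivity correction is needed.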

4 Data Description

In this study, we use the dataset found in Chatzimichael et al. (2015), with some adjustments for the sample selection. The data set consists of 56 randomly selected vegetable greenhouse producers in the Ierapetra Valley in Southern Crete, Greece, observed over four cropping years from 2009–2010 to 2012–2013. This results in a balanced panel data set of 224 observations (56 producers over four cropping years). The randomly selected sample of greenhouse producers chosen for this study is part of a survey that examined the impact of adopting a modern irrigation technology on the effectiveness of irrigation water application in greenhouse farms in the Ierapetra Valley in Crete. The Agricultural Department of the Regional Directorate in Crete financed this survey to address the water limitations that this semi-arid area of the Mediterranean basin has faced over the last decade.

The data were collected by extension agents from the Agricultural Department of the Regional Directorate, who interviewed the selected greenhouse producers using the same questionnaire. The interviews took place at the beginning of June of the last cropping year (2012–2013). The producers had to recall information about the exact time of adopting the new irrigation technology and other variables related to the production process of their greenhouses (e.g. output produced, labor and land used, water usage) for the last four cropping years. The usual cropping year for greenhouses producing vegetables in Crete starts between the middle of August and the beginning of September and lasts until the end of May.

Nearly 80% of the farms in this valley use water from the local public irrigation system, which is connected to a small dam in central Crete (Bramianos dam). Farms using their own wells for irrigation purposes were excluded from this survey. Before 2005, most of the greenhouses used drip irrigation systems, where multiple tubes are attached to the central water supply line and each of these tubes supplies one row of vegetable plants. After 2005, greenhouse producers were able to adopt an alternative irrigation technology, overhead sprinklers. In this case, the greenhouse has a set of overhead water pipes that emit a mist through a number of sprinkler heads. In general, overhead sprinkler systems in greenhouses are controlled automatically by moisture sensors and timers, and are considered to be more water effective and to minimize evapotranspiration (ET) compared to the traditional drip irrigation systems. Based on this information, the adoption of overhead sprinkler systems in the greenhouses of this semi-arid area of the Mediterranean basin can ease the water stress in greenhouse production.

Although the data set contains four years of observations for 56 farmers, forming a balanced panel of adopters and non-adopters, the model is estimated here on the pooled data treated as a cross section. Farmers could adopt the new irrigation technology in any of the four cropping years of interest in this study. Possible differences stemming from the year of adoption can be captured via cropping year dummy variables in the production frontier model. Pooling the data across cropping years can be justified by the fact that both irrigation technologies are available to the farmers in all the cropping years. Also, the application in Greene (2010) uses panel data for OECD and non-OECD countries, but the model is formulated to accommodate cross-sectional data.

Descriptions of the variables used in this study are presented in Table 2. The output of each farm is the total revenue of farm output measured in euros and it is a weighted output index that aggregates the four different vegetables produced in the greenhouses (tomatoes, cucumbers, peppers and eggplants). The inputs include labor (family and hired) measured in hours, seeds measured in euros, land measured in stremmas (1 stremma equals 0.1 ha), water measured in \(m^{3}\), and a weighted intermediate input index that aggregates other goods and materials used during the cropping year like pesticides, fertilizers and electricity, and it is measured in euros. All monetary variables were converted into 2010 constant prices.

Table 2 Variables used in the Probit and stochastic production frontier model

As the goal of this model is to correct the stochastic production frontier model for self-selectivity biases, the variables that are used in the sample selection (Probit) equation are: linear and quadratic terms of farmer’s age and education, number of extension visits, subsidies received by the farmer (measured in euros), farm size (measured in stremmas), an aridity index (defined as the ratio of the average annual temperature to the total annual precipitation in the region) that captures climate changes, and soil water holding capacity measured in cm per second. In particular, soil water holding capacity is a measure of the soil’s hydraulic conductivity and can vary across greenhouses depending on the area that greenhouse farmers choose to cultivate. The Ierapetra Valley has mainly sandy soils, which vary from pure sandy soil to sandy clay soil.

Table 3 lists the summary statistics for the variables used in this study. As shown in this table, 23 out of 56 greenhouse farmers in the sample did not adopt the new irrigation technology, while the remaining 33 farmers adopted sprinkler irrigation in one of the 4 cropping years of interest. The data also indicate that there are statistically significant differences between the means of the variables for adopters and non-adopters.

Table 3 Summary statistics for all cropping years

Finally, Table 4 presents the summary statistics of the greenhouse producers classified by geographical area, and more specifically by village. In this study, we have 8 different villages in the Ierapetra Valley in Crete, Greece. For each village, we have information on the number of greenhouse farmers who were interviewed and on how many of them decided to adopt the sprinkler irrigation technology. With the exception of the producers in Village 1, in all the other villages more than 50% of the farmers decided to adopt the new irrigation technology. The map showing the location of each village is presented in Figure 1. Spatial dummy variables are used in the production frontier model to capture differences between adopters and non-adopters due to geographical location.

Table 4 Summary statistics organized by location for all the cropping years

Fig. 1 Map of the Ierapetra Valley, Crete, Greece

5 Empirical Results and Analysis

5.1 Factors of Sprinkler Irrigation Technology Adoption

To determine whether sample selectivity is an issue, we estimate a stochastic production frontier that corrects for the sample selection that may stem from the choice to adopt sprinklers. We assume that farmers adopt the new irrigation technology with the expectation of maximizing their profits, and that the decision to adopt or not is based on a set of socio-demographic, structural and environmental factors. The sample selection model that is used can be expressed as:

$$\begin{aligned} adopt_{i}=\alpha _{0}+\sum _{j=1}^{8}\alpha _{j}z_{ji}+w_{i} \end{aligned}$$
(13)

First, we estimate the Probit model (Eq. 13), where the dependent variable is a dichotomous variable taking the value of 1 if the farmer adopts the new irrigation technology, and 0 otherwise. The explanatory variables (\(z_{i}\)) have been selected to capture the factors that affect farmers’ decisions to adopt sprinkler irrigation (Table 2).

Many studies have examined the impact of different factors on the adoption of modern agricultural technologies (Feder et al. 1985; Bravo-Ureta et al. 2012; González-Flores et al. 2014; Villano et al. 2015). Following equation (13), Table 5 displays the estimated coefficients of the Probit sample selection model and its marginal effects. The chi-squared test statistic shows joint significance of the parameters in the sample selection model at the 10% level. The majority of the estimated parameters are statistically different from zero. Among the variables describing farmers’ characteristics, age and education play a significant role in farmers’ decisions to adopt sprinkler irrigation technology. Farmers who are older and better educated are more able to search for and adopt modern agricultural technologies compared to younger and less educated farmers (Mariano et al. 2012).

Subsidies are expected to encourage farmers to adopt new agricultural technologies, and our results confirm these expectations. Subsidies are positively correlated with the probability of adopting the new irrigation technology because the financial aid received can reduce the installation costs of overhead sprinklers. In addition, the estimated coefficient for the extension visits variable is also positive and statistically significant, implying that farmers who have access to extension services are more likely to adopt sprinklers. This finding is consistent with the literature highlighting the important role of extension in the adoption of modern agricultural technologies, including irrigation practices (Jara-Rojas et al. 2012). Farm size has a positive effect on irrigation technology adoption, but the effect is not statistically significant. The sign of this effect is consistent with the literature suggesting that farmers who own larger cultivated areas are more risk averse and thus will be more likely to adopt new agricultural technologies compared to those who own smaller farms (Feder and Umali 1993).

Adoption of new irrigation practices is also affected by environmental factors apart from the demographic characteristics of the farmers. Water holding capacity, which is a proxy for the hydraulic conductivity of water in soil, is positively related to sprinkler adoption at the 10% level, implying that producers who own farms with soil exhibiting greater water holding capacity for crop use are more willing to adopt the sprinkler irrigation technology. In addition, the aridity index, which is a proxy of weather variability, has a positive but not statistically significant effect on the adoption of sprinklers. Given the definition of aridity index (ratio of the average temperature to the total precipitation), our results are consistent with the notion that higher values of aridity index lead to difficulties in irrigation and therefore farmers will be more likely to adopt a new irrigation technology that can help them to overcome issues related to changes in weather patterns.

Table 5 Parameter Estimates of the Probit Model

Table 5 also presents the marginal effect of each variable along with its standard error. The number of on-site extension visits has the largest effect on the adoption of sprinklers, followed by the age of the farmer. More specifically, a 20% increase in the number of extension visits results in a 2.58% increase in the probability of sprinkler irrigation adoption. This result indicates that one more on-site extension visit will increase the probability of overhead sprinkler adoption by 2.6%, which supports the remark of Solís et al. (2007) that investments in human capital provide the greatest returns in terms of socioeconomic development through a higher probability of adopting a modern agricultural technology. Farmer’s age also has a statistically significant impact on the probability of sprinkler adoption, and this impact diminishes with age because of the negative coefficient on the corresponding squared term. This decreasing effect implies that, for older farmers, any additional year will lead only to small changes in the information and experience acquired in relation to the adoption of new irrigation practices.
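The diminishing effect of age can be illustrated with the standard Probit marginal effect formula, \(\partial P/\partial z_{j}=\phi (\alpha ^{'}z)\alpha _{j}\), which becomes \(\phi (\alpha ^{'}z)(\alpha _{age}+2\alpha _{age^{2}}\cdot age)\) for a variable that enters with a squared term. The sketch below is a minimal example under assumed coefficients; none of the numbers are taken from Table 5, and the function names are hypothetical.

```python
import math


def normal_pdf(x: float) -> float:
    """Standard normal density, phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def marginal_effect(index: float, coef: float) -> float:
    """Probit marginal effect of a variable entering linearly."""
    return normal_pdf(index) * coef


def marginal_effect_quadratic(index: float, coef_linear: float,
                              coef_squared: float, value: float) -> float:
    """Probit marginal effect of a variable entering as value and value**2."""
    return normal_pdf(index) * (coef_linear + 2.0 * coef_squared * value)


# Hypothetical coefficients for age and age squared (not from Table 5).
a_age, a_age_sq = 0.08, -0.0008
index = 0.0  # probit index alpha'z at which the effect is evaluated (assumed)

for age in (30, 40, 50, 60):
    me = marginal_effect_quadratic(index, a_age, a_age_sq, age)
    print(f"age {age}: marginal effect on P(adopt) = {me:+.4f}")

# With these assumed values the effect of age vanishes where
# a_age + 2 * a_age_sq * age = 0.
turning_point = -a_age / (2.0 * a_age_sq)
```

With a positive linear coefficient and a negative squared coefficient, each additional year adds less to the adoption probability than the previous one, which is the pattern described above.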

5.2 Stochastic Production Frontier Estimates

Table 6 presents the estimates of the structural production model parameters and the parameters associated with the selectivity bias. The seven models estimated in this study are as follows. In Model 1, all the available data are used to estimate a pooled stochastic production function with no correction for selectivity bias. Three separate models are estimated using data for those who adopted overhead sprinklers. More specifically, Model 2 ignores selectivity bias, while Models 3 and 4 correct for it using the endogenous switching regression and one-stage Greene model specifications, respectively. In the last three columns of Table 6, three stochastic production frontier models are estimated using the non-adopters group data. Model 5 is estimated without correction for selectivity bias, while Models 6 and 7 incorporate corrections for sample selection bias based on the switching regression and Greene’s framework, respectively.

The production frontier is specified as a Cobb-Douglas functional form (see footnote 1), normalized with respect to the intermediate inputs variable, as follows:

  • Stochastic production frontier function for switching regression’s and Greene’s specification:

    $$\begin{aligned} lny_{i}= \beta _{0}+\sum _{n=1}^{4}\beta _{n}lnx_{ni} +\sum _{k=1}^{2}\delta _{1k}d_{1ki}+\sum _{l=1}^{3}\delta _{2l}d_{2li}+v_{i}-u_{i} \end{aligned}$$
    (14)

where \(y_{i}\) is the output; \(x_{i}\) represents inputs; \(d_{1}\) and \(d_{2}\) are location and time dummies respectively; \(\beta , \theta , \delta\) are the unknown parameters; u is the error term that captures the inefficiency; and v is a two-sided error term. The variables used in the estimation of the Probit model are different from the variables used in the stochastic production frontier models to meet the identification criterion. The models were estimated using NLOGIT Version 5.

As expected, all estimated input coefficients are positive and statistically significant, with magnitudes that vary across models; the only exception is the seeds input in Model 3, which is not statistically significant. In all the models, cultivated land contributes the most to greenhouse production, which is consistent with the notion that land availability is the main production constraint for small-scale farms. All the models, except for those corrected for selectivity bias using Greene’s specification, present (marginally) increasing returns to scale, which implies that output increases by a (marginally) larger proportion than the increase in inputs. Returns to scale are measured by the elasticity of scale, which is the sum of all the output elasticities of the estimated Cobb-Douglas function. For irrigation water applied, measured in \(m^{3}\), the point estimates range from 0.068 to 0.236 with a mean of 0.152, which is consistent with the production of vegetables. The parameter estimates for the location dummies, when statistically significant, indicate that specific locations can contribute positively to greenhouse production. For details on the construction and description of the location dummies, see Table 2.
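The elasticity-of-scale calculation described above can be sketched in a few lines of Python. The input names and coefficient values below are hypothetical placeholders chosen for illustration; they are not the estimates reported in Table 6.

```python
def elasticity_of_scale(elasticities):
    """Sum the Cobb-Douglas output elasticities (the beta_n coefficients)."""
    return sum(elasticities.values())


def classify_returns(scale, tol=1e-9):
    """Label returns to scale from the elasticity of scale."""
    if scale > 1 + tol:
        return "increasing"
    if scale < 1 - tol:
        return "decreasing"
    return "constant"


# Hypothetical coefficients for illustration only (not taken from Table 6).
betas = {"land": 0.50, "labor": 0.25, "water": 0.15, "seeds": 0.125}
scale = elasticity_of_scale(betas)
print(scale, classify_returns(scale))
```

An elasticity of scale only slightly above one, as in this made-up example, corresponds to the marginally increasing returns to scale discussed in the text.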

Log-Likelihood ratio (LR) tests reveal that separate estimation of technologies for adopters and non-adopters is appropriate in this data set. The estimated LR test is given by:

$$\begin{aligned} LR=-2\left[ LR_{p}-(LR_{a}+LR_{na})\right] \sim \chi _{k}^{2} \end{aligned}$$
(15)

where \(LR_{p}\) is the log-likelihood function value for the pooled model, and \(LR_{a}\) and \(LR_{na}\) are the log-likelihood values for the adopters and non-adopters sub-samples, respectively, estimated with or without correction for selectivity bias. The LR tests reject the null hypothesis in all three cases (no selection, switching regression and Greene models), which implies that the estimated production frontier parameters differ across adopters and non-adopters. This also indicates that differences in efficiency between the farmers who adopted sprinkler irrigation and the non-adopters can be estimated while correcting for selectivity bias.
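The test in Eq. (15) can be illustrated with a short Python sketch. The log-likelihood values, the degrees of freedom and the resulting critical value below are assumptions made purely for illustration; they are not the values obtained in this study.

```python
def lr_statistic(ll_pooled, ll_adopters, ll_non_adopters):
    """LR = -2 * [LL_pooled - (LL_adopters + LL_non_adopters)]."""
    return -2.0 * (ll_pooled - (ll_adopters + ll_non_adopters))


def reject_pooling(lr, critical_value):
    """Reject the null of a common frontier when LR exceeds the critical value."""
    return lr > critical_value


# Hypothetical log-likelihood values, for illustration only.
lr = lr_statistic(ll_pooled=-120.0, ll_adopters=-50.0, ll_non_adopters=-55.0)

# 18.307 is the 5% critical value of a chi-square with 10 degrees of freedom;
# the choice of k = 10 is an assumption for this example.
print(lr, reject_pooling(lr, critical_value=18.307))
```

Because the two sub-sample models together can never fit worse than the pooled model, the statistic is non-negative, and a large value favours separate frontiers.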

The impact of correcting for selectivity bias using the different model specifications is presented below. The estimated coefficients of the second stage of the endogenous switching regression models for the adopters’ and non-adopters’ production functions are not shown here. The results of the third-stage estimation of the stochastic production frontiers adjusted for self-selection are reported in Table 6. The coefficient on the selectivity variable (W) from the switching regression model is not statistically significantly different from zero in either production function, so no evidence of selection bias is detected in this study when the switching regression model is applied.

However, with Greene’s (2010) sample-selection correction method, the coefficient \(\rho\) of the selectivity variable is statistically significantly different from zero for the group of adopters, suggesting the presence of selection bias. Thus, a selectivity-corrected stochastic frontier model is needed to obtain unbiased frontier estimates, which can then be used to estimate the technical efficiency scores. This result suggests that Greene’s method may be more appropriate for non-linear models, such as stochastic frontier models, when correction for selectivity bias is required.

Table 6 MLE estimation results: Stochastic Frontier model

5.3 Technical Efficiency Estimates

Table 7 presents the descriptive statistics for the technical efficiency scores across the alternative model specifications. These results are within the range reported by Bravo-Ureta et al. (2007) in their meta-regression analysis of farm-level technical efficiency studies. The authors report that the average technical efficiency score for stochastic production frontier studies in Europe is 82.4%. For the models with no correction for selectivity bias, average technical efficiency is higher for those who adopted the new irrigation technology than for those who did not. However, when Greene’s correction method is used, the non-adopters present higher technical efficiency scores than the adopters.

Table 7 Descriptive statistics of Technical Efficiency (TE) scores

In Table 7, adopters present an average technical efficiency ranging from 76.8% (TE-A-G) to 84.7% (TE-A), while the average technical efficiency score for non-adopters changes only slightly between the no-correction and Greene (2010) models, averaging 82%; the exception is the switching regression model, where a mean TE of 85.5% is found for the non-adopters. Moreover, adopters exhibit a higher mean technical efficiency only in the case of no selection bias correction, while the models corrected for selectivity bias using the switching regression model and Greene’s specification show higher mean technical efficiency for the non-adopters. The coefficient of variation in technical efficiency scores increases from 0.098 to 0.164 for adopters and from 0.121 to 0.129 for non-adopters.

We also find that correcting for selectivity bias affects the distribution of the technical efficiency estimates. In the presence of selection bias, the correlation coefficient (\(\rho\)) between \(v_{i}\) and \(w_{i}\) is statistically significant, which affects the variances of the two error terms (\(\sigma _{w}^{2}\) and \(\sigma _{v}^{2}\)) and consequently the estimated efficiency scores and their variance (\(\sigma _{u}^{2}\)), as the estimated TE scores depend strongly on the ratio of the variance of the inefficiency term (\(\sigma _{u}^{2}\)) to the variance of the noise term (\(\sigma _{v}^{2}\)). The variance of the TE scores directly affects their distribution, as the variance measures the dispersion of the estimated scores. This may explain the difference in the distributions of the technical efficiency scores for the adopters between the models in Figure 2. In the case of statistically significant evidence of selection bias for the adopters (Greene model), the distribution of the technical efficiency scores is less skewed to the left than the distribution from the switching regression model (which did not detect any selection bias) or from the model with no selection bias correction. The dispersion of the technical efficiency scores is higher when Greene’s sample-corrected specification is used, for both adopters (Figure 2) and non-adopters (Figure 3).
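To make the dependence of the TE scores on \(\sigma _{u}^{2}\) and \(\sigma _{v}^{2}\) concrete, the sketch below computes a technical efficiency score from a composed residual using the standard conditional-mean estimator of Jondrow et al. for a normal/half-normal frontier without selectivity correction. The text does not state which point estimator was used in the study, so this choice, the function names and the numerical values are assumptions for illustration only.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def technical_efficiency(eps, sigma_u, sigma_v):
    """TE = exp(-E[u | eps]) for an assumed normal/half-normal production frontier.

    eps is the composed residual v - u of one observation.
    """
    sigma2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / sigma2
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma2)
    z = mu_star / sigma_star
    expected_u = mu_star + sigma_star * norm_pdf(z) / norm_cdf(z)
    return math.exp(-expected_u)


# Same negative residual under two hypothetical variance settings: a larger
# sigma_u relative to sigma_v attributes more of the residual to inefficiency.
te_low_ratio = technical_efficiency(eps=-0.2, sigma_u=0.1, sigma_v=0.2)
te_high_ratio = technical_efficiency(eps=-0.2, sigma_u=0.4, sigma_v=0.2)
print(round(te_low_ratio, 3), round(te_high_ratio, 3))
```

Under these assumptions, any change in the estimated variances, such as the one induced by a significant \(\rho\), shifts every farm’s score and therefore the shape of the whole TE distribution.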

Fig. 2 Technical Efficiency scores for adopters

Fig. 3 Technical Efficiency scores for non-adopters

Table 8 presents the technical efficiency mean differences between adopters and non-adopters for all the model specifications. We find a statistically significant mean difference in technical efficiency between adopters and non-adopters for the conventional model with no correction for selectivity bias and for the model using Greene’s specification to correct for sample selection bias. Table 9 and Figure 4 show the technical efficiency scores for all the models across the 4-year time frame of this study. The adopters experience higher TE than the non-adopters only when selectivity bias is not corrected. When the one-step approach of Greene is used to correct for selectivity bias, the adopters exhibit lower TE than the non-adopters. However, the TE of the adopters increases over time, while the TE of the non-adopters remains the same. This can be explained by the fact that, as with every new technology, farmers experience lower TE scores at the beginning of adoption and their performance improves over time (learning by doing). Overhead irrigation technology is an investment whose effect may appear only in the long run, as it requires farmers to acquire the appropriate knowledge and experience to properly use a new, advanced, sensor-based irrigation technology.

Table 8 Technical Efficiency mean differences across models

Fig. 4 Technical efficiency scores over time

The analysis of technical efficiency scores for the different model specifications in this study reveals three main findings. First, it shows the need to correct for selectivity bias in stochastic frontier models, because without the correction adopters’ efficiency is overstated in this case study. Second, the gap in farmers’ performance between adopters and non-adopters is underestimated when stochastic frontier models without correction for selectivity bias are applied. These two findings may be closely related, as the underestimation of the technical efficiency difference between adopters and non-adopters may be driven by the overstatement of adopters’ TE estimates when selectivity bias is not corrected. These findings are also consistent with those from other studies measuring farmers’ performance after the adoption of new agricultural technologies (Villano et al. 2015). Finally, in the case of a targeted technology intervention, it is likely that the less productive farmers are those adopting the technologies, and failure to account for this may understate the true impact of the technology.

Table 9 Technical efficiency mean differences across models, 2009-2012

6 Concluding Remarks

The impact evaluation literature on agricultural projects mostly relies on randomized control trial frameworks, which require baseline information before project implementation and randomization in the selection of project participants. However, in many cases, agricultural projects are implemented without following the design of a randomized control trial (as it can be a time-consuming and costly process), while an evaluation of the project’s impact may still be required afterwards for policymaking purposes. This is where sample selection bias can arise, and the impact evaluation of a project may lead to biased results. While Heckman’s model has been widely used, among other methodologies, to correct for selectivity bias in the stochastic frontier framework, Greene (2010) argued that Heckman’s (1979) approach may not be appropriate for nonlinear models such as the stochastic frontier model, and proposed an internally consistent one-step approach for the selectivity-corrected stochastic production frontier model.

This study integrates the stochastic production framework with impact evaluation theory in the agricultural sector, using technical efficiency as an indicator of farmers’ managerial performance under different technological regimes. Greene’s (2010) specification and the endogenous switching regression model are used to deal with possible selectivity biases stemming from farmers’ unobservable characteristics in the stochastic production frontier model. The analysis is carried out using a data set of 56 greenhouse farms producing vegetables in Greece over four cropping years, from 2009-2010 to 2012-2013, that were converting from drip to overhead sprinkler irrigation.

The analysis presented in this study shows that selection bias is present in the group of adopters when Greene’s (2010) model is used, which justifies the selectivity-corrected stochastic frontier model. Also, adopters’ technical efficiency scores are lower than those of the non-adopters when the stochastic frontier model corrected for self-selection is applied. This implies that, in the short term, the adoption of the new irrigation technology has a cost in terms of technical efficiency. This outcome is consistent with the fact that when farmers face a production change, such as the adoption of a new agricultural technology, it can initially lead to a decrease in their technical efficiency and productivity (Sipiläinen and Lansink 2005). When farmers are introduced to a new technology, learning-by-doing and the experience accumulated over time can affect their efficiency at a later stage. In addition, the adoption of new technologies can lead to less motivated farmers, which can be reflected in lower technical efficiency scores. Thus, a methodological framework that accommodates the panel feature of the data set can reveal whether technical efficiency starts to increase the longer farmers use the new agricultural technology.

Also, statistically significant differences between the technical efficiency scores of adopters and non-adopters have been found in this study. The magnitude of this difference increases when the stochastic production frontier model is corrected for selectivity bias using Greene’s (2010) approach. Finally, our results point to the important role of on-farm extension visits in the adoption of new agricultural technologies. This implies that policies targeting the adoption of new technologies in the agricultural sector should focus more on providing farmers with on-farm visits by extension personnel than on the provision of economic incentives, such as subsidies. However, under the current weather variability observed in arid regions such as Greece, new technologies that support water sustainability might be promoted in combination with support that helps farmers cope with the lower performance in the first years after technology adoption.