1 Introduction

Weersink et al. (2002) argue that optimal resource allocation is important not only because of its effects on farm income but also because of its environmental impact. Non-point source (NPS) pollution stemming from agricultural practices is seen as a major cause of the remaining water-quality problems in developed and developing countries (Shortle et al. 1998; Rossouw and Görgens 2005; Ranga Prabodanie et al. 2010; Li et al. 2014a, b). Consequently, there is increased pressure on agriculture to use resources optimally in order to reduce the negative environmental effect caused by agricultural practices (Shortle et al. 2001). In the absence of a market for reduced environmental emissions, the information generated with trade-off analysis will be critical for informed policy decision making, as it allows policy makers and the public to assess whether a given improvement in environmental quality is worth the sacrifice in agricultural production (Stoorvogel et al. 2004).

Generating economic-environmental trade-off curves is a complicated endeavor and requires quantifying the inter-relationships between sustainability indicators implied by the underlying biophysical processes and producers’ economic behavior (Ranga Prabodanie et al. 2010). Alternative abatement strategies and/or policy instruments are compared on the basis of the alternative that achieves an environmental goal with the least impact on the economic indicator. A complicating factor is that environmental emissions are inherently stochastic as a result of a variety of environmental conditions (Horan 2001; Kampas and White 2004; Kataria et al. 2010). Consequently, pollution-control strategies should be aimed at improving the distribution of outcomes rather than some scalar value (McSweeny and Shortle 1990). By implication, these control strategies will achieve environmental goals with only a measure of certainty.

A modeling alternative to incorporate the variability of environmental outcomes while quantifying economic-environmental trade-offs is chance-constrained programming (CCP) (Li et al. 2014b; Kataria et al. 2010; Kampas and White 2003). The application of CCP requires the specification of a functional form for the distribution of the environmental variable (Qiu et al. 2001). Various researchers have shown that the distributional assumptions employed in CCP models have a significant impact on the estimated trade-offs (Zhu et al. 1994; Qiu et al. 2001; Kampas and White 2003; Kataria et al. 2010) and may not hold for all situations as a result of the site-specific nature of agricultural NPS pollution (Wang et al. 2016; Qiu et al. 2001). To overcome the problem, techniques like the Environmental Target-MOTAD model (Teague et al. 1995) were developed to estimate economic-environmental trade-offs while making use of empirical distributions. Qiu et al. (1998) scrutinized the use of the Environmental Target-MOTAD model and argued that it would be difficult to apply because the scientific basis for the selection of a reasonable environmental risk level is weak. As an alternative, these researchers developed the upper partial moment (UPM) stochastic inequality that provides a stronger scientific basis for modeling economic-environmental trade-offs because the environmental risk level is given by the compliance probability.

A potential problem with the application of the UPM model (Qiu et al. 2001) in enforcing a probabilistic constraint is that the actual compliance probability is larger than the specified compliance probability. Even though specified compliance levels may be equal across alternatives, the actual compliance and the optimal management responses may differ significantly between alternatives. These differences raise questions about the fairness with which alternatives are compared. Some researchers (Atwood et al. 1988; Qiu et al. 2001) have raised concerns about the conservativeness (see Footnote 1) of the UPM, although none of them has investigated the severity of the conservativeness.

The main objective of the article was to present an alternative method to enforce a probabilistic constraint with a probability bound close to the actual compliance probability, which will result in a less biased comparison between alternatives. The method is applied to demonstrate that the UPM model is very conservative in the estimation of the upper probability bound, which results in an overestimation of abatement costs and an underestimation of the average amount of pollution above the environmental goal.

The newly developed upper frequency method (UFM) counts the number of states with deviations above the environmental goal in an effort to ensure that the deviations above the goal do not exceed the number of deviations allowed by the model. Like the UPM, the UFM uses an empirical distribution of the environmental outcome to enforce the probabilistic constraint, which overcomes the need to specify the statistical distribution of the outcome variable. The generality of the method makes it applicable to any situation where the risk of exceeding a specified target level is of concern.

2 Conservativeness of the upper partial moment

Safety-first rules are concerned with the probability of a variable falling above or below a critical or target level. Probabilistic safety-first constraints can be imposed using different chance-constraint bounds such as the distribution-free Chebyshev stochastic inequality. Imposing the probabilistic constraints through the use of Chebyshev’s inequality generates strongly conservative probability bounds (Atwood et al. 1988). Realizing the need for a tighter probability bound, Berck and Hihn (1982) introduced a semi-variance inequality that generates a tighter upper probability bound than the Chebyshev inequality. The semi-variance inequality follows Markowitz (1970) in that the mean-semivariance is a more attractive measure of risk than the mean–variance approach of the Chebyshev inequality. Atwood (1985) extended Berck and Hihn’s (1982) semi-variance inequality with a more general lower partial moment stochastic inequality to enforce constraints with a smaller upper probability limit than the Chebyshev and the semi-variance inequalities. Although the probability bound of the UPM method is tighter than the Chebyshev inequality, the bound is still conservative (Atwood et al. 1988; Qiu et al. 2001).

The probabilistic constraint of achieving a specified environmental goal is defined as follows using the UPM (see Footnote 2):

$$Pr\left[ {x \ge t + p\theta \left( t \right)} \right] \le \theta \left( t \right)/\left( {g - t} \right) \le \left( {1/p} \right)$$
(1)

where \(x\) is the pollution variable, \(t\) is a reference pollution level, \(g\) is the environmental goal, \(\theta \left( t \right)\) is the UPM measured as the expected absolute deviation above \(t\), \(cp\) is the compliance probability, and \(p = \left( {\frac{1}{1 - cp}} \right)\).
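The bound in Eq. 1 can be traced to Markov's inequality. The following is only a sketch, assuming \(g > t\) and reading \(\theta \left( t \right)\) as \(E\left[ {\max \left( {x - t,0} \right)} \right]\); it is not a derivation reproduced from the cited sources:

$$Pr\left[ {x \ge g} \right] = Pr\left[ {\max \left( {x - t,0} \right) \ge g - t} \right] \le \frac{{E\left[ {\max \left( {x - t,0} \right)} \right]}}{g - t} = \frac{\theta \left( t \right)}{g - t}$$

Setting \(g = t + p\theta \left( t \right)\) makes the right-hand side equal to \(1/p = 1 - cp\). Markov's inequality is seldom tight, which is one way of seeing why the realized probability of exceeding \(g\) can fall well below \(1 - cp\).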

Figure 1 is used to explain the application of Eq. 1 and the origins of the overestimation of the actual compliance probability when using the UPM to enforce the probabilistic constraint. The stylized example that was developed portrays a situation where the environmental goal, \(g\), must be maintained at least 75% of the time. The dotted line represents the cumulative probability distribution of \(x\). Enforcing the probabilistic constraint within an optimization framework requires that a reference pollution level, \(t\), be determined during the optimization so that the UPM, \(\theta \left( t \right)\), expressed as a portion of the difference between \(g\) and \(t\), is equal to \(1 - cp\). Graphically the difference between \(g\) and \(t\) is represented by the summation of the areas labeled from 1 to 4, which are equal in size. The shaded area indicating \(\theta \left( t \right)\) extends beyond \(g\). However, the area of the shaded triangle that goes beyond \(g\) is exactly the same size as the area of block 1 that is not shaded. Therefore, \(\theta \left( t \right)\) is equivalent to the area of block 1. Thus, \(\frac{{\uptheta\left( t \right)}}{g - t}\) is 25%, even though some pollution levels above \(g\) are possible. Specifying a value of \(p = 4\) will ensure that the proportion is 25% because \(t + p\theta \left( t \right) = g\). As a result, \(t\) will be achieved with the specified \(cp\) while \(g\) will be achieved with a higher \(cp\), which gives rise to the overestimation of the actual compliance probability when using the UPM inequality to enforce probabilistic constraints.

Fig. 1 A stylized graphical illustration of the upper partial moment (UPM) and the upper frequency method (UFM)

The only known input parameters to the optimization problem are \(g\), \(cp\), and, therefore, \(p\). The distribution of nitrate losses is conditional on the choice of production practices that will maximize producers’ profit margin, given that nitrate losses exceed \(g\) no more than \(1 - cp\) of the time. The choice of \(t\) and therefore the size of \(\theta \left( t \right)\) are significantly affected by the endogenously determined distribution of nitrate losses. Thus, the actual probability that \(g\) will be achieved cannot be predicted beforehand, apart from knowing that the bound will be tighter than the \(cp\) with which \(t\) is satisfied.

From an environmental point of view, a tighter probability bound is beneficial. However, from a polluter’s point of view, a tighter bound implies overregulation, which may cause a considerable loss of profits. Alternatives for reducing environmental pollution can only be compared correctly with methods that generate small differences between the specified and the actual \(cp\).

The dashed line represents the distribution of nitrate losses that will achieve \(g\) at the given \(cp\). Such an environmental outcome could be achieved by determining states of nature with deviations above \(g\) and then restricting the number of states to 25% of the number of total states of nature. Teague et al. (1995) have demonstrated that states with deviations above \(g\) could easily be identified using an Environmental Target-MOTAD framework.

Several indicators could be used to determine the conservativeness of the UPM. The most obvious indicator is the difference between the specified compliance probability used in the UPM and the actual compliance probability. The UFM allows for at least two further measures. Firstly, the average pollution levels above the environmental goal for the UPM and the UFM (see Footnote 3) could be compared to obtain an indication of the environmental impact. Secondly, the objective function values of the UPM and the UFM could be compared to estimate the cost to the polluter.
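The three indicators can be sketched in a few lines of Python. The dictionary keys, the function names and all numbers below are hypothetical and serve only as illustration; in particular, averaging the excess over all outcomes (rather than only over the outcomes that exceed the goal) is an assumption about how the average pollution above the goal is measured.

```python
def compliance_probability(outcomes, goal):
    """Share of equally likely outcomes that do not exceed the goal."""
    return sum(1 for x in outcomes if x <= goal) / len(outcomes)


def mean_excess(outcomes, goal):
    """Average pollution above the goal, taken over all outcomes (assumption)."""
    return sum(max(x - goal, 0.0) for x in outcomes) / len(outcomes)


def conservativeness(upm, ufm, goal, specified_cp):
    """Compare a UPM solution with a UFM solution.

    `upm` and `ufm` are dicts with the hypothetical keys "losses"
    (nitrate-loss outcomes) and "gross_margin" (objective value).
    """
    return {
        # Actual minus specified compliance probability of the UPM.
        "cp_gap": compliance_probability(upm["losses"], goal) - specified_cp,
        # Extra average pollution above the goal that the UFM permits.
        "excess_gap": mean_excess(ufm["losses"], goal) - mean_excess(upm["losses"], goal),
        # Gross margin forgone by the polluter under the UPM.
        "margin_gap": ufm["gross_margin"] - upm["gross_margin"],
    }


# Made-up solutions, for illustration only.
upm_solution = {"losses": [10, 14, 18, 22, 24, 26, 27, 28, 28, 30], "gross_margin": 3000.0}
ufm_solution = {"losses": [12, 16, 20, 24, 26, 27, 28, 28, 31, 34], "gross_margin": 4200.0}
report = conservativeness(upm_solution, ufm_solution, goal=28, specified_cp=0.75)
print(report)
```

In this made-up case the UPM solution complies 90% of the time against a specified 75%, while the UFM solution complies 80% of the time and earns a higher gross margin.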

3 Data and procedures

3.1 Data simulation

Crop growth modeling provides a powerful means of generating yield response and environmental indicators for alternative management practices when field measurements are lacking (Weersink et al. 2004; Samarawickrema and Belcher 2005). Quasi-experimental data on yield response and nitrate losses were simulated with a mechanistic, generic crop growth model originally developed for irrigation scheduling (Annandale et al. 1999). The Soil Water Balance (SWB) model was extended by Van der Laan (2009) through the addition of nitrogen and phosphorus simulation routines and algorithms to simulate above-ground nitrogen mass, grain nitrogen mass, soil water content and the fate of nitrogen. Van der Laan (2009) tested and validated SWB using historical datasets collected in the Netherlands, Kenya and South Africa.

The SWB model was used to simulate crop production and an environmental indicator consisting of nitrate losses (runoff and leaching) for the production of late monoculture maize (planting date 15 December) under irrigation on two soil types at Glen, South Africa. Maize production was simulated for a sandy clay loam (SCL) and sandy clay (SC) soil using 19 years of weather data while assuming an initial soil nitrogen level of 33 kg. Nine levels of fertilizer could be applied in either a single or a split application. When using split applications two-thirds of the desired nitrogen level were applied on the day of planting, while the remaining third was applied seven weeks later. Only applications above 70 kg/ha were applied in a split application.

3.2 Quantifying environmental risk

Unique production conditions during a specific production year cause the response of nitrate loss to increasing fertilizer application rates to differ between production years. As a result, the procedure that is adopted in this research deviates from the norm where a single response function is fitted using all the data points and risk is characterized as deviations from the fitted response function. Instead, our methodology estimates a response function for each production year. Any unexplained variability not captured by the year-specific response function is treated as the risk of not being able to predict nitrate loss exactly as a function of nitrogen application rates within a specific year. Using all the year-specific stochastic nitrate loss response functions simultaneously characterizes both the risk of not knowing which year will occur and the risk of not being able to predict nitrate loss exactly even when the year is known. The benefit of estimating year-specific response functions is that the procedure automatically models the heteroscedasticity of nitrate losses embedded in the data.

Next, the procedure that was used to construct the empirical distribution of the environmental risk indicator is discussed in more detail. According to Richardson et al. (2000), the first step is to determine the non-random (predictable) component using regression analysis. The following equation was estimated for each production year using ordinary least squares (OLS):

$$\hat{E}_{s} \left( {N_{f} } \right) = e_{1s} + e_{2s} N_{f} + e_{3s} N_{f}^{2} + \tau_{sf}$$
(2)

where \(\hat{E}_{s} \left( {N_{f} } \right)\) represents the predicted nitrate losses in production year \(s\) as a function of the simulated nitrogen application rates (\(N_{f}\)) (kg/ha), \(e_{is}\) is the ith estimated coefficient for the nitrate loss function in year \(s,\) and \(\tau_{sf}\) is the estimation error for the regression of year \(s\) given nitrogen application rate \(f\). In total 19 different regression equations were estimated using the nitrate losses simulated for nine distinct fertilizer application rates (\(N_{f}\) = 20, 45, 70, 95, 120, 145, 170, 195, 220). The random component associated with nitrate loss response in each year is represented by the regression residual, which was calculated as:

$$\tau_{sf} = E_{sf} - \hat{E}_{s} \left( {N_{f} } \right)$$
(3)

where \(E_{sf}\) represents simulated nitrate losses in year \(s\) for nitrogen application rate \(f\). The empirical outcomes that characterize the variability of nitrate losses for any given level of nitrogen fertilizer application rate are calculated by combining the predictable and random components as follows:

$$\tilde{E}_{sf} \left( N \right) = \hat{E}_{s} \left( N \right) + \tau_{sf}$$
(4)

where \(\tilde{E}_{sf} \left( N \right)\) is the empirically distributed nitrate losses as a function of nitrogen application rate. Important to note is that \(\tilde{E}_{sf} \left( N \right)\) is a continuous function that is not restricted to the nine levels of \(N\) used during the simulation process. Equation (4) shows that the empirical distribution of nitrate loss is represented by outcomes for every production year (\(s\)) and the error associated with every simulated fertilizer application rate (\(f\)). Therefore, 171 (\(s \times f\)) outcomes characterize the risk of nitrate losses.
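The construction in Eqs. 2 to 4 can be sketched as follows. This is a minimal pure-Python illustration, not the estimation code used for the paper: the function names are hypothetical, the two "years" of data are made up, and the quadratic is fitted through the normal equations after rescaling \(N\) purely for numerical convenience.

```python
def fit_quadratic(rates, values):
    """OLS fit of values = e1 + e2*N + e3*N**2; returns (e1, e2, e3)."""
    scale = 100.0  # rescale N to keep the normal equations well conditioned
    rows = [[1.0, n / scale, (n / scale) ** 2] for n in rates]
    # Augmented normal equations [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * y for r, y in zip(rows, values))] for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(a[k][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for k in range(col + 1, 3):
            factor = a[k][col] / a[col][col]
            a[k] = [v - factor * w for v, w in zip(a[k], a[col])]
    # Back substitution.
    c = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        c[i] = (a[i][3] - sum(a[i][j] * c[j] for j in range(i + 1, 3))) / a[i][i]
    # Undo the rescaling of N.
    return c[0], c[1] / scale, c[2] / scale ** 2


def predict(coef, n):
    """Predictable component of the response at application rate n."""
    e1, e2, e3 = coef
    return e1 + e2 * n + e3 * n * n


def empirical_outcomes(rates, losses_by_year, n):
    """Eq. 4: prediction at rate n plus every residual of the same year."""
    outcomes = []
    for losses in losses_by_year:
        coef = fit_quadratic(rates, losses)                                # Eq. 2
        residuals = [y - predict(coef, r) for r, y in zip(rates, losses)]  # Eq. 3
        outcomes.extend(predict(coef, n) + tau for tau in residuals)       # Eq. 4
    return outcomes


RATES = [20, 45, 70, 95, 120, 145, 170, 195, 220]
# Two made-up years: an exact quadratic and a dry year without losses.
year_a = [1 + 0.1 * r + 0.001 * r * r for r in RATES]
year_b = [0.0] * len(RATES)
outcomes = empirical_outcomes(RATES, [year_a, year_b], n=100)
print(len(outcomes))  # years x rates
```

With two years and nine rates the sketch returns 18 outcomes; the same construction with 19 years gives the 171 outcomes mentioned above. Because `n` need not be one of the simulated rates, the outcomes can be evaluated at any application level.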

The nitrate loss response functions estimated using Eq. 2 are presented in Appendix 2. The results show that the nitrate loss response is unique in every production year. For production year S12, no relationship could be identified between nitrate losses and fertilizer use. Investigation of the data showed that no nitrate losses were simulated for that year because it was very dry. The bulk of the estimations explain a great deal of the variation in the simulated data, with a high \(R^{2}\). However, not all of the estimations show a high \(R^{2}\), indicating that not all of the variation in nitrate losses is due to the amount of nitrogen fertilizer applied. A detailed discussion of the estimated response functions is available in Matthews (2014).

3.3 Gross margin estimation

In our application, modeling economic-environmental trade-offs requires a continuous function that relates average gross margins to any nitrogen application level. The use of continuous response functions overcomes the problem of input diversification. Using discrete (non-continuous) activities for nitrogen application levels, gross margins and nitrate loss levels could result in input diversification by the solution procedure, producing solutions that are nearly impossible to achieve in practice. The procedure that was used to construct the empirical distribution of nitrate losses was also used to construct the variation in gross margins as a function of fertilizer application. The gross margin outcomes were then averaged to yield the economic indicator. Specifically, expected gross margins were estimated using the following equation:

$$\overline{GM} \left( N \right) = \mathop \sum \limits_{sf}^{{}} p_{sf} \left( {\tilde{Y}_{sf} \left( N \right)P_{Y} - NP_{N} - \tilde{W}_{sf} \left( N \right)P_{W} - C_{a} - C_{Y} \tilde{Y}_{sf} \left( N \right)} \right)$$
(5)

where \(\overline{GM} \left( N \right)\) is the expected gross margin as a function of applied nitrogen (ZAR/ha) (see Footnote 4), \(\tilde{Y}_{sf} \left( N \right)\) is the empirical distribution of crop yield (ton/ha) as a function of applied nitrogen (\(N\)), \(\tilde{W}_{sf} \left( N \right)\) is the empirical distribution of water applications (mm) as a function of applied nitrogen, and \(N\) is the amount of nitrogen fertilizer (kg/ha) applied. \(P_{Y}\) is the price of maize (ZAR/ton), \(P_{N}\) is the price of nitrogen fertilizer (ZAR/kg), and \(P_{W}\) is the cost of applying irrigation water (ZAR/mm). \(C_{a}\) is the area-dependent cultivation cost (ZAR/ha), \(C_{Y}\) is the yield-dependent harvesting cost (ZAR/ton), and \(p_{sf}\) is the probability that outcome \(sf\) will occur, which is equal to \(\frac{1}{s \times f}\).
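Eq. 5 reduces to a simple average when every outcome is equally likely. The sketch below assumes that the yield and water outcomes at a given \(N\) are already available as lists; the function name, the prices and the costs are placeholders and are not the values reported in Table 1.

```python
def expected_gross_margin(yields, water, n, p_y, p_n, p_w, c_a, c_y):
    """Eq. 5 with equal probabilities p_sf = 1 / (s * f).

    yields, water: outcome lists (ton/ha, mm) evaluated at application rate n.
    p_y, p_n, p_w: maize price, nitrogen price, irrigation cost.
    c_a, c_y: area-dependent and yield-dependent costs.
    """
    margins = [
        y * p_y - n * p_n - w * p_w - c_a - c_y * y
        for y, w in zip(yields, water)
    ]
    return sum(margins) / len(margins)


# Placeholder outcomes and prices, for illustration only.
gm = expected_gross_margin(
    yields=[10, 12], water=[500, 600], n=100,
    p_y=2000, p_n=10, p_w=5, c_a=3000, c_y=100,
)
print(gm)
```

The two placeholder outcomes give margins of 12 500 and 15 800, so the expected gross margin in this made-up case is 14 150.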

The empirical distributions of crop yield (\(\tilde{Y}_{sf} \left( N \right)\)) and applied irrigation water (\(\tilde{W}_{sf} \left( N \right)\)) were calculated with Eqs. 6 to 8 and Eqs. 9 to 11 respectively.

$$\hat{Y}_{s} \left( {N_{f} } \right) = \beta_{1s} + \beta_{2s} N_{f} + \beta_{3s} N_{f}^{2} + \varepsilon_{sf}$$
(6)
$$\varepsilon_{sf} = Y_{sf} - \hat{Y}_{s} \left( {N_{f} } \right)$$
(7)
$$\tilde{Y}_{sf} \left( N \right) = \hat{Y}_{s} \left( N \right) + \varepsilon_{sf}$$
(8)
$$\hat{W}_{s} \left( {N_{f} } \right) = \omega_{1s} + \omega_{2s} N_{f} + \omega_{3s} N_{f}^{2} + \mu_{sf}$$
(9)
$$\mu_{sf} = W_{sf} - \hat{W}_{s} \left( {N_{f} } \right)$$
(10)
$$\tilde{W}_{sf} \left( N \right) = \hat{W}_{s} \left( N \right) + \mu_{sf}$$
(11)

\(\beta_{is}\) and \(\omega_{is}\) represent the ith OLS-estimated coefficients of the yield response function and the irrigation water response function respectively in the regression for year \(s\), while \(\varepsilon_{sf}\) and \(\mu_{sf}\) represent the estimation errors of the yield response and irrigation water response functions respectively.

Account should be taken of the fact that crop yield was only estimated as a function of nitrogen applications and seemingly no relationship exists between water applications and crop yield. No relationship was modeled because the auto irrigation strategy that was used to determine the timing and number of water applications during the data-simulation process was set up in such a manner that water was never limiting to crop development. Inspection of the simulated data, however, revealed that water applications were lower when crop yield was reduced because of nitrate deficiencies. SWB reduces the leaf area index when nitrate deficiencies occur and consequently crop transpiration was reduced and resulted in less irrigation water being applied. Thus, crop yield was modeled as a function of nitrogen applications because water never limited crop production while changes in water applications were modeled as a function of nitrogen applications because an underdeveloped crop requires less irrigation water.

Production cost data and input prices for 2014 are from Griekwaland-Wes Cooperation (GWK Ltd), South Africa. Table 1 presents the crop price and the input costs used in this paper.

Table 1 Crop price and input costs for maize production at Glen, South Africa

3.4 Economic-environmental compliance models

Data parameters for average gross margins and empirical distributions of nitrate losses are estimated for 220 different fertilizer application rates (see Footnote 5) with the use of the procedures outlined above. The generated data parameters are incorporated into a UPM model and a UFM model to estimate the conservativeness of the UPM. Both compliance models share a set of generic equations and add equations that are specific to the method used to model compliance. The optimization model was developed in GAMS (GAMS Development Corporation 2007a) and solved using the CPLEX solver (GAMS Development Corporation 2007b). Next, the generic model will be discussed, followed by the specific equations necessary to model compliance with the UPM model and the UFM model.

3.4.1 Generic model

The generic model specification includes the objective function as well as constraints to limit intensive and extensive margin responses. The following equations are generic to both compliance models:

$$Maximise \;TGM = \left( {\overline{GM} \left( N \right)} \right)*HA$$
(12)

s.t.

$$N \le 220$$
(13)
$$HA \le 1$$
(14)

where \(TGM\) is the total gross margin as a function of applied nitrogen (ZAR) and the area cultivated, \({\text{HA}}\) (measured in ha). The area cultivated can be interpreted as the absolute area cultivated or as a fraction of the area available for cultivation.

The decision variables are the fertilizer application rate and the irrigated area that will maximize the total gross margin. Fertilizer applications were limited to a maximum of 220 kg/ha while the area planted was constrained to be no more than one hectare.

3.4.2 Environmental compliance with the upper partial moment (UPM)

The compliance models require additional equations to model compliance with the user-specified environmental goal of 28 kg of nitrate. The generic model was used to determine baseline levels of nitrate losses for production on both soil types and with both fertilizer application methods. The assumption was made that policy makers would want to reduce the probability of exceeding an average amount of nitrate loss. Therefore, the nitrate losses for all four alternatives were averaged to determine a homogeneous nitrate loss goal of 28 kg.

The equations that are added to the generic model to complete the UPM model are given below:

$$t - \left( {\tilde{E}_{sf} \left( N \right)} \right)HA - d_{sf} \ge 0$$
(15)
$$\mathop \sum \limits_{sf} p_{sf} d_{sf} - \theta \left( t \right) = 0$$
(16)
$$t + p^{*} \theta \left( t \right) \le g$$
(17)

where \(t\) is the endogenously determined reference level for the environmental variable, \(d_{sf}\) is the deviation of pollution emissions above the pollution reference level \(t\) for outcome \(sf\), and \(g\) is the environmental goal set by the environmental regulator. \(\theta \left( t \right)\), where \(\theta \left( t \right) = \theta \left( {1,t} \right) = \rho \left( {1,t} \right)\), represents the endogenously determined environmental risk level or the expected deviation above the reference level \(t\). Furthermore, \(p^{*}\) [\(p^{*} = \left( {\frac{1}{1 - cp}} \right)\)] is the inverse of one minus the compliance probability with respect to \(g\), that is, the same quantity denoted \(p\) in Eq. 1.

As mentioned earlier in this article, the probabilistic constraint of the UPM in Eq. 17 is enforced by choosing a reference pollution level, \(t\), so that the UPM, \(\theta \left( t \right)\), expressed as a portion of the difference between \(g\) and \(t,\) is equal to the acceptable probability \(\left( {1 - cp} \right)\) of the pollution level being greater than the goal. The deviation of pollution emissions (\(d_{sf}\)) above the endogenously determined reference pollution level (\(t\)) is estimated with Eq. 15. In Eq. 16 these deviations are multiplied by their occurrence probability to estimate the UPM, \(\theta \left( t \right)\), as absolute deviations from the reference pollution level.
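The effect of the UPM constraint on a given distribution can be illustrated with a short Python sketch. It holds the fertilizer rate and the area fixed, so the outcomes are simply a list of nitrate losses, and it checks whether any reference level \(t\) satisfies Eq. 17. The function names and both samples are made up for illustration; in the actual model \(N\), \(HA\) and \(t\) are all chosen by the optimizer.

```python
def upm(outcomes, t):
    """Expected deviation above t (Eq. 16) with equal probabilities."""
    return sum(max(x - t, 0.0) for x in outcomes) / len(outcomes)


def upm_feasible(outcomes, goal, cp):
    """True if some reference level t satisfies t + p * theta(t) <= g (Eq. 17)."""
    p = 1.0 / (1.0 - cp)
    # t + p * theta(t) is piecewise linear and convex in t with kinks at the
    # outcomes, so its minimum is attained at one of them.
    return min(t + p * upm(outcomes, t) for t in outcomes) <= goal


def actual_compliance(outcomes, goal):
    """Share of outcomes that do not exceed the goal."""
    return sum(1 for x in outcomes if x <= goal) / len(outcomes)


# Two made-up loss distributions (kg), goal of 28 kg, specified cp of 75%.
sample_a = [10, 12, 15, 18, 20, 22, 25, 27, 30, 40]
sample_b = [8, 10, 12, 14, 16, 18, 20, 22, 24, 32]

for name, sample in (("A", sample_a), ("B", sample_b)):
    print(name, upm_feasible(sample, 28, 0.75), actual_compliance(sample, 28))
```

In this made-up example, sample A meets the goal in 80% of the outcomes yet fails the UPM condition at a specified \(cp\) of 75%, whereas sample B passes the condition and meets the goal in 90% of the outcomes. Both cases show the gap between specified and actual compliance that the text describes.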

3.4.3 Environmental compliance with the upper frequency method (UFM)

The UFM of enforcing probabilistic environmental compliance is based on the premise that any compliance probability can be expressed for the discrete case as the frequency with which a goal may be exceeded. Restricting the number of states in which the environmental goal might be exceeded guarantees compliance. The UFM utilizes the Environmental Target-MOTAD model specification to identify states of nature in which the environmental goal is exceeded and uses binary variables to restrict the number of times the goal is exceeded. The following equations were used to ensure compliance:

$$g - \left( {\tilde{E}_{sf} \left( N \right)} \right)HA - d_{sf} \ge 0$$
(18)
$$- IB_{sf} + d_{sf} \le 0$$
(19)
$$\mathop \sum \limits_{sf} B_{sf} \le uf$$
(20)

where \(B_{sf}\) is a binary variable indicating whether the environmental goal is exceeded by outcome \(sf\), \(uf\) is the upper frequency indicating the number of times the goal may be exceeded while still complying, and \(I\) is a large number that permits outcome \(sf\) to exceed the goal when \(B_{sf}\) has a value of one.

Absolute deviations (\(d_{sf}\)) are estimated in Eq. 18 as the deviation in nitrate loss (\(\tilde{E}_{sf} \left( N \right)\)) from the environmental goal (\(g\)). Equation 18 is the same as for the UPM (Eq. 15), with the exception that the deviations are calculated from \(g\) and not \(t\) as in the UPM. The UFM, therefore, overcomes the conservativeness of the UPM in maintaining the true environmental goal and not an endogenously determined reference pollution level that is dependent on the distribution of the environmental variable. Equation 19 uses a binary variable to identify whether a specific outcome exceeds the environmental goal. Every time \(\tilde{E}_{sf} \left( N \right)\) exceeds \(g\), \(B_{sf}\) takes a value of one. The \(B_{sf}\)s are counted to determine the frequency with which the environmental goal is exceeded. The probabilistic constraint is enforced by Eq. 20, which restricts the number of times \(g\) is exceeded to \(uf\). The value of \(uf\) is calculated as \(\left( {1 - cp} \right)sf\), where \(sf\) is the total number of outcomes. The choice of \(uf\) is an integer value that corresponds with a value closest to the estimated discrete compliance probability without exceeding the compliance probability. Therefore, the UFM can also be conservative in the estimation of the trade-offs if the number of discrete states is small. However, the UFM will never be as conservative as the UPM.
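The counting logic of the UFM can be sketched for a fixed fertilizer rate and area, where the binary variables reduce to a direct count. This is an illustration only: the function names and the sample are hypothetical, and the actual model solves for \(N\), \(HA\) and \(B_{sf}\) simultaneously as a mixed-integer program.

```python
import math


def upper_frequency(cp, n_outcomes):
    """Largest whole number of exceedances consistent with cp (uf in Eq. 20)."""
    # The small tolerance guards against floating-point error,
    # for example (1 - 0.9) * 10 evaluating to slightly less than 1.
    return math.floor((1.0 - cp) * n_outcomes + 1e-9)


def ufm_feasible(outcomes, goal, cp):
    """Eqs. 18 to 20 for fixed N and HA: count exceedances, compare with uf."""
    # B_sf takes a value of one whenever outcome sf exceeds the goal.
    b = [1 if x > goal else 0 for x in outcomes]
    return sum(b) <= upper_frequency(cp, len(outcomes))


# Made-up loss distribution (kg) with two outcomes above a goal of 28 kg.
sample = [10, 12, 15, 18, 20, 22, 25, 27, 30, 40]
print(upper_frequency(0.75, len(sample)), ufm_feasible(sample, 28, 0.75))
print(upper_frequency(0.85, len(sample)), ufm_feasible(sample, 28, 0.85))
```

With ten outcomes, a specified \(cp\) of 75% allows two exceedances, so the sample complies; at 85% only one exceedance is allowed and the sample no longer complies. Rounding \(uf\) down is what makes the UFM slightly conservative when the number of states is small; with 171 outcomes and a \(cp\) of 75%, for instance, \(uf\) would be 42 rather than 42.75.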

4 Results

4.1 UPM economic-environmental trade-offs

The UPM-generated economic-environmental trade-offs of maintaining a nitrate loss goal of 28 kg at increasing levels of compliance for two soils (SCL and SC) and two fertilizer application methods (Single and Split) are shown in the lower section of Fig. 2.

Fig. 2 Gross margins (GM measured in ZAR) for the upper partial moment (UPM) and the upper frequency method (UFM) at increased specified compliance probability levels for two soils (SCL and SC) and two fertilizer application methods (single and split)

The UPM trade-off curves show that the gross margins for the SCL soils are consistently higher when compared to SC soil and that a single fertilizer application is preferred to a split application when a specific soil is being considered. Total gross margins are decreasing at an increasing rate with increasing levels of specified compliance probability with the exception of increases in the \(cp\) beyond 90% for the SCL soil. The reduction in total gross margins from the lowest to the highest specified \(cp\) is on average 48% for the SCL soil and 63% for the SC soil, with little difference between fertilizer-application methods for a specific soil type.

4.2 Compliance probability conservativeness

The UPM is said to be conservative with respect to the actual compliance that is achieved when probabilistic constraints are enforced. The compliance probability conservativeness is evaluated by comparing the specified compliance probability with the actual probability with which the environmental goal is achieved. The actual compliance probability of the UPM model is computed ex post, after the optimization, using the optimized distribution of the environmental variable. The comparison between specified and actual compliance is shown in Fig. 3. A 45° line is also shown to indicate perfect correspondence between the specified and the actual compliance probabilities.

Fig. 3 Actual compliance for increased specified compliance probability levels for the upper partial moment (UPM) for two soils (SCL and SC) and two fertilizer application methods (single and split)

Figure 3 reflects large discrepancies between the specified and the actual compliance probabilities. In all cases, the actual compliance level is much higher than the specified compliance level, especially at low levels of specified compliance. Furthermore, the actual \(cp\) achieved on the SC soil is higher than that on the SCL soil for a specific level of compliance. At the lowest level of \(cp\), the difference is 24.6 percentage points for the SCL soil and about 28.4 percentage points for the SC soil. As the specified compliance level increases, the actual compliance probabilities change very little up to a point, after which they rise toward the highest level of specified compliance. At the highest level of \(cp\), the differences in compliance probabilities decrease to 2.4 percentage points and 3.5 percentage points respectively for the SCL and SC soils.

Users of the UPM may justify the use of the method by arguing that the difference between the specified and the actual compliance levels becomes very small at high levels of specified compliance and, therefore, the UPM can be used if the specified \(cp\) is high. However, the way in which the UPM enforces compliance should be borne in mind. With the UPM model, the intensive and extensive margin responses for achieving the environmental goal are optimized in such a way that the pollution reference level (\(t\)) is achieved with the specified \(cp\). The UFM estimates non-compliance directly from the environmental goal (\(g\)), which will result in significant changes in the intensive and extensive margin responses and affect the total gross margin and the resulting distribution of nitrate emissions. Evaluating the conservativeness of the UPM in terms of \(cp\) alone does not provide any indication of the impact of the conservative estimates of the UPM on the economic indicator or nitrate losses to the environment. The specified compliance of the UPM model was incorporated into the UFM by expressing the \(cp\) as the number of observations with which the goal may be exceeded. Consequently, the specified and actual compliance levels are the same for the UFM model results. Thus, comparing the results of the UPM with those of the UFM allows for a better evaluation of the conservativeness of the UPM because the impact on the gross margins of the polluter and the environmental consequences are considered.

4.3 Economic indicator conservativeness

The upper section of Fig. 2 shows the economic-environmental trade-offs generated with the UFM model.

The optimized gross margins of the UFM model are much higher than those of the UPM model. The difference in the optimized gross margins between the two compliance models measures the impact of the conservativeness of the UPM on the polluters’ profitability. From the graph, it is clear that the underestimation of gross margin is not constant across the range of specified compliance probabilities, since the trade-off curves of the UFM cross each other, which is not the case with the UPM model. Consequently, with the UFM model, choices between fertilizer application methods on a specific soil type are not as consistent across increasing levels of environmental compliance as they are with the UPM. However, SCL is still the preferred soil type. At lower levels of specified compliance, a single fertilizer application is preferred, while split applications are preferred at higher levels of specified compliance. Notably, the gross margins tend to converge to R4 342 at the highest level of specified compliance.

Even though the differences in gross margins between the two model specifications narrow for all strategies as the level of compliance increases, they remain large. At the highest \(cp\), the average gross margin differences between the compliance models are R1 372 for the SCL soil and R2 737 for the SC soil, with the fertilizer application strategies inducing differences of R78 and R122 for the SCL and SC soils respectively. On average, these differences constitute a 32% and a 62% underestimation of gross margins with the UPM model for the SCL and SC soils respectively.

4.4 Environmental conservativeness

The average nitrate losses above the environmental goal are calculated for each model specification and compared to identify the impact on the environment of using the UFM model, whose probability bound is close to the actual \(cp\). The model comparisons are shown in Fig. 4.

Fig. 4 Average nitrate losses above the goal (kg) for the upper partial moment (UPM) and the upper frequency method (UFM) at increased specified compliance probability levels for two soils (SCL and SC) and two fertilizer application methods (single and split)

Figure 4 shows that, for the UPM model, the combinations of soil and fertilizer-application method with lower profitability consistently generated the highest average nitrate losses above the goal of 28 kg. The magnitude of the losses decreases to almost zero for all the strategies when the specified compliance probability is increased to 94.7%. Of the two soils, the SC soil realized the higher average nitrate losses. Fertilizer-application method does not greatly influence the magnitude of the losses. The results of the UFM model are not as clear cut as those of the UPM. Nevertheless, the combinations of soil and fertilizer-application method with lower profitability again generate the highest average pollution level above the environmental goal, and the magnitude of the average nitrate losses also decreases with increasing compliance probability. However, the average losses for the UFM do not converge to almost zero, as is the case with the UPM model. Instead, the average nitrate losses to the environment are about 2.5 kg for the SC soil, and 0.98 kg and 1.59 kg for a single and a split fertilizer application respectively on the SCL soil. In percentage terms, the average nitrate losses above the environmental goal across all compliance probability levels are 80% and 77% higher than with the UPM model for the SCL and SC soil types respectively.
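The indicator discussed above, the average nitrate loss above the goal, can be computed from a set of simulated losses as in the following hedged sketch. Whether the average is taken over all observations or only over those that exceed the goal is not stated in this section, so both variants are shown; the function name, its flag, and the sample values are assumptions for illustration only.

```python
def average_loss_above_goal(emissions, goal, exceedances_only=False):
    """Mean of the positive deviations of emissions above the goal.

    exceedances_only=False averages over all observations (deviations of
    compliant observations count as zero); exceedances_only=True averages
    only over observations that exceed the goal. Which convention the
    study uses is an assumption left open here.
    """
    excess = [max(0.0, e - goal) for e in emissions]
    if exceedances_only:
        positive = [x for x in excess if x > 0]
        return sum(positive) / len(positive) if positive else 0.0
    return sum(excess) / len(excess)

# Hypothetical nitrate losses (kg) against the goal of 28 kg.
sample = [20, 30, 34, 28]
print(average_loss_above_goal(sample, 28))
print(average_loss_above_goal(sample, 28, exceedances_only=True))
```

Under either convention the indicator measures the degree of non-compliance, not its frequency, which is why it complements the compliance probability when the two models are compared.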

4.5 Changes in the intensive and extensive margin

To ensure compliance with the environmental goal (\(g\)), both the UPM and UFM models change the intensive and extensive margin. The baseline amount of fertilizer applied (kg/ha) and the area planted (ha), together with the optimal amount for the UPM and UFM, are given in Table 2. The baseline amount of fertilizer applied and the area planted reveal the producers’ production decision for the generic optimization model. The producer is therefore not faced with an environmental constraint and can make production decisions for optimal gross margins without considering his or her environmental impact.

Table 2 Estimated results for extensive (ha) and intensive (kg/ha) margin changes using an upper partial moment (UPM) and upper frequency method (UFM) for a sandy clay loam (SCL) and sandy clay (SC) soil using a single- and split fertilizer application (kg/ha)

When there is no need to comply with an environmental nitrate loss goal, profit-maximizing nitrogen input levels vary by 7 and 2 kg/ha between fertilizer-application methods on the SCL and SC soil types respectively. On average, the optimal fertilizer application rate in the absence of environmental compliance is 141 kg/ha on the SCL soil and 125 kg/ha on the SC soil. The UPM results show that, to comply with the environmental nitrate loss goal, nitrogen fertilizer application rates are reduced further from the optimal level on the SCL soil than on the SC soil. However, larger areas are irrigated on the SCL soil irrespective of the fertilizer-application method. Irrigation areas decrease with increasing environmental compliance probabilities for all combinations of soil and fertilizer-application method. Interestingly, per hectare fertilizer application rates decrease with increasing compliance probabilities only for the SC soil; on the SCL soil, application rates are highest at the highest compliance probability.

Large differences are observed when comparing the results of the UPM and UFM models. The areas irrigated are much larger for the UFM model, which is the main reason for the higher total gross margins optimized with that model. Fertilizer-application rates on the SC soil are also higher for the UFM model, with the exception of the 106 kg/ha applied in a single application at a compliance probability of 0.696. On the SCL soil, the UFM fertilizer-application rates are higher than those of the UPM model at low levels of compliance (0.795 and below) and lower at high levels of compliance (0.848 and above), regardless of fertilizer-application method.

5 Conclusions

The main conclusion from this research is that the UPM method of enforcing probabilistic constraints is very conservative, as is evident from the comparison with the newly developed UFM. The UPM method underestimates the actual probability with which the environmental goal is achieved, as indicated in the objective function value of the model and the degree (average pollution above the goal) by which the environmental goal is exceeded. Even more important is the fact that the intensive and extensive margin responses necessary to satisfy the probabilistic constraint in the UPM model are very different from the optimal responses obtained with the UFM model. Thus, use of the UPM may lead to the misidentification of appropriate management practices to combat pollution. The UFM generates solutions that are close to the probability bound, and the responses seem more realistic when compared to those from the UPM.

The UFM is easy to use and requires no assumptions regarding the distribution of the environmental variable because the empirical data are used directly. The UFM behaved well during the optimization process and is much less conservative in the estimation of the trade-offs because its probability limit is closer to the actual probability limit displayed by the data. Although the UFM provides a stricter probability bound than the UPM, there are some concerns regarding its application. The UFM ensures compliance by requiring that the number of deviations above the goal does not exceed the number of deviations allowed; for this reason, a fairly large number of observations is necessary to ensure probability limits close to the actual probability. Further research is necessary to determine the sensitivity of the UFM to sample size and mining of statistical outliers.
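The sample-size concern can be made concrete with a small sketch. With \(n\) observations, a count-based constraint can only enforce compliance levels of the form \(1 - k/n\), so a specified \(cp\) is effectively rounded up to the nearest such level. The function below and the sample sizes are illustrative assumptions, not part of the reported model; only the 94.7% level is taken from the results above.

```python
import math

def enforced_compliance(cp, n_obs):
    """Compliance level actually enforced when the specified cp is
    converted to a whole number of allowed exceedances (rounded down).
    The rounding rule is an assumption made for this illustration."""
    k = math.floor((1.0 - cp) * n_obs + 1e-9)
    return 1.0 - k / n_obs

# Gap between the enforced and the specified level for several
# hypothetical sample sizes at a specified cp of 0.947.
for n in (10, 20, 100, 1000):
    level = enforced_compliance(0.947, n)
    print(f"n = {n:4d}: enforced = {level:.3f}, gap = {level - 0.947:.3f}")
```

The gap shrinks as the number of observations grows, which is consistent with the call for further research on the sensitivity of the UFM to sample size.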