Research context: nitrogen management in California’s Central Valley
California ranks as the most economically valuable agricultural state in the United States by annual crop cash sales. The state produces more than 400 commodity crops on 77,000 farms and ranches covering 25 million acres of land, spread along a 500-mile longitudinal gradient (California Department of Food and Agriculture 2018). The Mediterranean climate is ideal for perennial and annual crops in most areas of the state, yet creates reliance on irrigation and a highly engineered water system. Top commodities include dairy, grapes, almonds, berries, livestock, lettuce, walnuts, tomatoes, pistachios and citrus. Farms also vary widely in scale and structure, from small and mid-sized family-owned operations to very large, multi-commodity international corporations (California Department of Food and Agriculture 2018).
Importantly for our focus on N management, California is one of the first states in the U.S. to implement an agricultural non-point source pollution regulatory program, the Irrigated Lands Regulatory Program (ILRP). The ILRP is implemented through local entities known as “Water Quality Coalitions” and includes mandatory elements around reporting use of best management practices and an N budget, as well as attendance at one educational meeting per year, held in each Water Quality Coalition (Central Valley Regional Water Quality Control Board 2020; for more information, see Online Appendix). The N management practices we study in this paper are consistent with those tracked as part of the ILRP mandatory reporting. This policy landscape offers a unique context to study the potential effects of governance on farmer decision-making, compared to well-documented studies evaluating practice adoption under voluntary policy settings (Reimer et al. 2018; Hillis et al. 2018).
Survey and data collection
This paper employs data collected through a mail survey conducted in 2018 across the Central Valley of California. The project integrated stakeholder feedback throughout the research process and included multiple phases of interviews, focus groups, and preliminary survey data collection, which informed our survey design and dissemination strategy and helped us interpret results. An external advisory committee, representing policymakers, farmers, directors of the Water Quality Coalitions, and nationwide researchers and extension specialists, also reviewed the survey. Institutional Review Board approval for the study was obtained through the University of California, Davis.
The survey was distributed to farmer members from three Water Quality Coalitions: the Colusa Glenn Subwatershed Program (CGSP), the San Joaquin County and Delta Water Quality Coalition (SJDWQC), and the East San Joaquin Water Quality Coalition (ESJWQC) (see Fig. 1). Together, these Coalitions covered over 900,000 acres of irrigated cropland and approximately 7500 individual farming operations in 2017. These regions represent a longitudinal transect of the Central Valley that captures a range of agricultural, ecological and socio-political dimensions. The most important crop types in these regions include almonds, walnuts, grapes, tomatoes, sunflowers, pistachios, alfalfa, corn and wheat. Rice is also a top production crop in the CGSP region, but was removed from our sampling frame since our management practices focus on non-flooded cropping systems.
CGSP provided mailing addresses for its members, and the survey was sent to all of them (n = 1471). In SJDWQC and ESJWQC, mailing addresses were obtained from county Agricultural Commissioner offices, which maintain publicly available databases of all commercial farming operations in compliance with Pesticide Use Reporting requirements. In these regions, organic farmer addresses were also obtained through the U.S. Department of Agriculture Organic INTEGRITY Database (United States Department of Agriculture 2018). We removed all obvious non-agricultural entries (e.g. golf courses or public lands using pesticides) from the mailing lists. The resulting lists contained our best estimate of all eligible farmers who would report to SJDWQC or ESJWQC under the ILRP. The survey was sent to all farmers in the SJDWQC region (n = 2322) and, because of the size of the Coalition, to a random sample of 33% of farmers in the ESJWQC region (n = 1243). In aggregate, this totaled 4994 surveys mailed across all three regions.
We followed a four-wave mailing process using a modified Tailored Design Method, which included a cover letter and survey, followed by a reminder postcard, then a second letter and survey, and a final reminder postcard (Dillman et al. 2008). In CGSP, the Coalition permitted us to join our survey response data to its anonymized mandatory reporting data on management practices adopted on each field. These data allowed us to compare survey-reported practice adoption rates with adoption rates reported on mandatory paperwork by farmers who did not respond to the survey, offering an opportunity to evaluate the self-selection bias that is prevalent in survey-based research. We found that adoption rates did not differ between survey respondents and non-respondents, indicating that our survey respondents were representative of adoption behavior occurring across the watershed (see Online Appendix Table A1).
Variable measurement
The survey questionnaire included 30 questions covering a range of topics related to farmers’ views on N management, including their adoption of eight different N management practices on their largest parcel of their most important crop in the 2017 crop year, measured as a binary variable. In this paper, we treat these eight N management practices as the simultaneous adoption outcome variables in a multivariate probit regression model.
The binary measurement of these practices is a data limitation of this paper, especially for practices that are applied in a temporally (e.g. multiple times/year versus every other year) or spatially (e.g. full operation versus particular fields) heterogeneous fashion. Given the known heterogeneity of operations on our mailing lists (e.g. operations varied from a single crop up to 14 unique crops, and from < 1 acre to > 20,000 acres), we were constrained to developing a survey tool that was general enough to fit every possible respondent. Furthermore, the ILRP collects practice adoption data in a binary fashion as well. Thus, aligning our data structure with that of the regulatory program allows us the best opportunity for data comparisons and to draw policy-relevant conclusions.
The survey measured four farm operation characteristics of interest: most important crop type (aggregated into a binary variable: perennial versus annual crop), farm size (log-transformed), primary irrigation type (aggregated into a binary variable: pressurized systems, i.e. drip and micro-sprinkler, versus gravity-fed systems, i.e. flood, furrow and border strip) and water source (aggregated into two binary variables: access to surface water versus groundwater, and access to both water sources versus a single source).
We also measured a number of behavioral variables typically used in agricultural adoption research to include as controls (Prokopy et al. 2019). These included information-related variables measuring access to information from three perspectives: a tally of the total number of N management information sources, a binary variable for the use of Certified Crop Advisers to create N budgets (“consultants”), and a binary variable for the completion of a Self-Certification course, a voluntary educational component of the ILRP that allows farmers to self-certify their own N budgets. Socio-behavioral concepts included problem awareness (“acceptance of agricultural N sources”), environmental values (“conservation motivation”), and perceived behavioral control (“self-efficacy”) (Reimer et al. 2012b). These latent variables were constructed using exploratory factor analysis to combine multiple survey question items measured on five-point Likert scales (Costello and Osborne 2005), which improves reliability (McIver and Carmines 1981; DeVellis 2003; Santos 1999). Cronbach's alpha scores were used to verify internal consistency between the items combined in each composite variable; all alpha scores were > 0.70, a widely accepted cut-off for acceptable internal consistency (Santos 1999). See Online Appendix Table A2 for information on survey questions and composite variables.
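As an illustration of the internal-consistency check described above, Cronbach's alpha for one composite variable can be sketched in a few lines of Python. This is a minimal sketch and not the code used in the study: the function name `cronbach_alpha` and the six rows of three-item Likert responses are hypothetical and serve only to show the calculation.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(items[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in items]) for j in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in items])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical five-point Likert responses (rows = respondents, columns = items).
responses = [
    [5, 4, 5],
    [4, 4, 4],
    [2, 3, 2],
    [3, 3, 4],
    [1, 2, 1],
    [4, 5, 4],
]

alpha = cronbach_alpha(responses)
passes_cutoff = alpha > 0.70  # the cut-off applied in this paper
print(f"alpha = {alpha:.2f}, passes 0.70 cut-off: {passes_cutoff}")
```

Items that move together across respondents push the variance of the summed score well above the sum of the item variances, which is what drives alpha toward 1.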
Finally, farmer demographic variables included a binary variable for college education, a categorical variable for income class, and a continuous variable for years in agriculture. Binary variables were included to distinguish the three Water Quality Coalitions, with the baseline being farmers who did not identify with any of the three Coalitions of focus or who left the question unanswered.
Survey respondents
We received a total of 966 partial and full survey responses (CGSP: n = 377, SJDWQC: n = 312, ESJWQC: n = 183), constituting an average response rate of 20% (CGSP: 30.7%, SJDWQC: 14.4%, ESJWQC: 15.4%). We removed 101 responses from farmers reporting on irrigated pasture, bringing our usable number of respondents to 865. Response rates were adjusted for the possible non-eligible addresses included in our original mailing lists (American Association for Public Opinion Research 2016). Our response rate is on par with recent surveys using similar designs and regarding similar topics (Denny et al. 2019; Wilson et al. 2014; Arbuckle and Rosman 2014). All survey data were digitized for analysis.
Respondents are fairly representative of the full farming populations in the surveyed regions, when compared to USDA 2012 Census of Agriculture data (see Online Appendix Tables A3–A5). The average farm size of our respondents is 355 acres (minimum < 1 acre, maximum ~ 12,000 acres). In aggregate, our survey respondents manage 329,800 acres of land across the Central Valley, approximately 35% of the acreage of the study area. Seventy-nine percent of respondents own their land; 80% of respondents are male; 84% of respondents identify as White or Caucasian, 4% as Hispanic or Latino and 3% as Asian or Asian American. Sixty-one percent of respondents have at least some college education. On average, respondents have 35 years of farming experience, and the median gross farm income bracket is $100,000–$200,000.
Respondents listed all crops and acreage they cultivate, though we only asked about practice adoption on their most important crop, as self-identified on the survey. Sixty-four percent of farmers report only growing one crop, 27% two to four crops, and 4% have five or more crops. Eighty-five percent of respondents indicate a perennial crop as their most important. Seventy-four percent of respondents have pressurized irrigation systems on their most important parcels. Forty percent of respondents rely on groundwater only, 44% have access to surface water (only) through riparian rights or irrigation district delivery water, and 16% have both surface and groundwater access (See Online Appendix Table A6 for all descriptive statistics).
Analysis approach: multivariate probit for estimating interdependencies
Much of the existing adoption literature uses a standard quantitative approach of estimating some type of linear model with an individual practice, or count of practices, as the dependent variable, and multiple predictor variables to test hypotheses about drivers of adoption (Prokopy et al. 2019). Here, we need an empirical model that simultaneously estimates farm and farmer variable influences on the adoption of multiple practices, and how those practices are related to each other. To accomplish this goal, we employ a multivariate probit (MVP) model that allows estimation of multiple binary probit regression models (in our case 8 models) simultaneously, while analyzing correlation between errors in the different models. Failure to account for these correlated error terms can result in inefficient coefficient estimates and biased error terms (Cappellari and Jenkins 2003). This approach has been applied in other studies looking at simultaneous adoption in developing agricultural settings (Koppmair et al. 2017; Kassie et al. 2015; Kara et al. 2008; Teklewold et al. 2013; Jara-Rojas et al. 2013).
Considering all N management practices, each equation in the system can be written as:
$$\begin{aligned} {\varvec{Y}}^{*}_{\varvec{i}} & = {\varvec{B}}_{{1\varvec{i}}} {\varvec{X}}_{\varvec{a}} + {\varvec{B}}_{{2{\varvec{i}}}} {\varvec{X}}_{\varvec{b}} + \ldots + {\varvec{B}}_{\varvec{ni}} {\varvec{X}}_{\varvec{n}} + {\varvec{e}}_{\varvec{i}},\\ (i& = LT,\;ST,\;CC,\;IN,\;MP,\;SA,\;PB,\;ET), \end{aligned}$$
where Y*i is the latent adoption outcome for practice i (LT = Leaf Testing, ST = Soil Testing, CC = Cover Crops, IN = Irrigation N Testing, MP = Moisture Probe, SA = Split Application, PB = Pressure Bomb, ET = Evapotranspiration-based scheduling) and Xa … Xn are the predictor variables of interest. Our unit of analysis is an individual farmer. For farmers who operate across multiple fields, we evaluate their practice adoption only on the largest field of their most important crop, thus including only one observation per farmer. This yields a matrix of estimated model coefficients, with the coefficient for each covariate (B1i … Bni) estimated for each of the eight practices. The MVP assumes that the error terms for each practice (eLT, eST, eCC…) jointly follow a multivariate normal distribution with a mean of 0 and variance of 1. The model also generates a variance–covariance matrix that provides the correlation coefficients (rho) between the error terms of all pairs of equations. These correlations can offer insight into the complementary (i.e. positive correlations) or substitutable (i.e. negative correlations) nature of pairs of practices.
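To make the role of the error correlation concrete, the data-generating process assumed by the model can be simulated for two practices. The sketch below is illustrative only: it simulates adoption rather than estimating the model, and the single standardized covariate, the coefficient values of 0.5, the rho values of ±0.6, and all function names are assumptions rather than quantities from this study.

```python
import math
import random


def simulate_adoption(n, beta, rho, seed=0):
    """Simulate n farmers from a two-practice probit system with error correlation rho."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)  # one hypothetical standardized covariate
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # Errors with unit variance and correlation rho.
        e1 = z1
        e2 = rho * z1 + math.sqrt(1 - rho**2) * z2
        # A practice is adopted when its latent outcome is positive.
        y1 = int(beta[0] * x + e1 > 0)
        y2 = int(beta[1] * x + e2 > 0)
        rows.append((y1, y2))
    return rows


def joint_rate(rows):
    """Share of simulated farmers adopting both practices."""
    return sum(1 for a, b in rows if a and b) / len(rows)


beta = (0.5, 0.5)  # hypothetical coefficients
independent = joint_rate(simulate_adoption(20000, beta, rho=0.0))
complementary = joint_rate(simulate_adoption(20000, beta, rho=0.6))
substitutes = joint_rate(simulate_adoption(20000, beta, rho=-0.6))
print(substitutes, independent, complementary)
```

With the covariate effects held fixed, joint adoption is more frequent under a positive error correlation and less frequent under a negative one, which is the pattern read as complementarity or substitutability above.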
Simulated maximum likelihood techniques are used to estimate the model, and following Cappellari and Jenkins (2003), our MVP models are estimated using the Geweke–Hajivassiliou–Keane (GHK) simulator in Stata 16. Multivariate normal probabilities are calculated at each iteration of the simulation. Simulation bias is minimized by increasing the number of random draws from the simulator to at least the square root of the sample size; we ran the model with 35 random draws (Cappellari and Jenkins 2003). We also tested for any ordering effects in the dependent variables by running the model with multiple different orders for the practice dependent variables; results were consistent across all runs. As an additional robustness check, we fit individual univariate probit regression models for each of the eight practices, which produced very similar coefficient estimates (see Online Appendix for additional discussion on robustness and Table A7 for univariate probit results).
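For readers unfamiliar with the GHK simulator, its core step can be sketched in Python. This is a generic textbook-style sketch, not the Stata routine used for the estimates in this paper: it simulates the probability that every error falls below an upper bound, and the function names, the two-equation example with a correlation of 0.5, the seed, and the 5000 draws used for the check are all assumptions. The last lines apply the square-root rule for the number of draws to the 865 usable responses.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def cholesky(sigma):
    """Lower-triangular factor L of a positive definite matrix, with L L' = sigma."""
    m = len(sigma)
    L = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L


def ghk_probability(upper, sigma, draws, seed=0):
    """Simulate P(e_1 < upper_1, ..., e_m < upper_m) for e ~ N(0, sigma)."""
    rng = random.Random(seed)
    L = cholesky(sigma)
    m = len(upper)
    total = 0.0
    for _ in range(draws):
        eta = []
        weight = 1.0
        for j in range(m):
            # Bound on the j-th independent shock, given the shocks drawn so far.
            shift = sum(L[j][k] * eta[k] for k in range(j))
            bound = (upper[j] - shift) / L[j][j]
            p = STD_NORMAL.cdf(bound)
            weight *= p
            # Draw the shock from a standard normal truncated above at `bound`.
            u = min(max(rng.random() * p, 1e-12), 1 - 1e-12)
            eta.append(STD_NORMAL.inv_cdf(u))
        total += weight
    return total / draws


# Check against the closed form for two standard normal errors with correlation 0.5.
sigma = [[1.0, 0.5], [0.5, 1.0]]
estimate = ghk_probability([0.0, 0.0], sigma, draws=5000)
exact = 0.25 + math.asin(0.5) / (2 * math.pi)

# Square-root rule for the minimum number of draws with 865 usable responses.
min_draws = math.ceil(math.sqrt(865))
enough_draws = 35 >= min_draws
```

In a probit likelihood the bounds would come from each farmer's linear index and observed adoption pattern; the example keeps them at zero so that the simulated value can be compared with a known answer.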
To test H1, we evaluate the MVP variance–covariance matrix alongside a co-occurrence matrix. The co-occurrence matrix uses observed adoption data and calculates the proportion of all farmers who jointly adopt any two practices, evaluating all possible dyads of practices, a method that has been applied widely in ecology to evaluate species co-occurrence (Hines and Keil 2020). Both relatedness matrices are visualized as undirected weighted networks with the edge weights between every pair of practices reflecting the relatedness of those two practices. We use Quadratic Assignment Procedure (QAP) matrix correlation to assess which practices frequently occur together and which practices have highly correlated errors, indicating a potential underlying dimension influencing their adoption.
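The two steps described above, building a co-occurrence matrix and comparing it with the error-correlation matrix by QAP, can be sketched as follows. This is a minimal sketch under stated assumptions, not the study's code: the eight rows of adoption data for four practices and the `rho_hat` matrix are invented for illustration, and the one-sided permutation p-value is one common way to implement QAP.

```python
import math
import random
from itertools import combinations


def cooccurrence(adoption):
    """Share of farmers jointly adopting each pair of practices (rows = farmers)."""
    n, k = len(adoption), len(adoption[0])
    mat = [[0.0] * k for _ in range(k)]
    for i, j in combinations(range(k), 2):
        share = sum(1 for row in adoption if row[i] and row[j]) / n
        mat[i][j] = mat[j][i] = share
    return mat


def pearson(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def dyads(mat, order):
    """Off-diagonal entries of a symmetric matrix after relabeling practices by `order`."""
    return [mat[order[i]][order[j]] for i, j in combinations(range(len(order)), 2)]


def qap_correlation(a, b, permutations=2000, seed=0):
    """Correlation between two matrices, with a permutation p-value from relabeling a."""
    rng = random.Random(seed)
    identity = list(range(len(a)))
    b_vals = dyads(b, identity)
    observed = pearson(dyads(a, identity), b_vals)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns together
        if pearson(dyads(a, order), b_vals) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical adoption data: 8 farmers, 4 practices.
adoption = [
    [1, 1, 0, 0], [1, 1, 0, 1], [1, 1, 1, 0], [0, 0, 1, 1],
    [0, 0, 1, 1], [1, 0, 0, 0], [0, 1, 1, 1], [1, 1, 0, 0],
]
# Hypothetical error-correlation matrix for the same 4 practices.
rho_hat = [
    [1.0, 0.6, -0.2, -0.1],
    [0.6, 1.0, 0.1, 0.0],
    [-0.2, 0.1, 1.0, 0.5],
    [-0.1, 0.0, 0.5, 1.0],
]

cooc = cooccurrence(adoption)
observed, p_value = qap_correlation(cooc, rho_hat)
```

Permuting rows and columns together keeps each practice's set of ties intact, which is why QAP is preferred to an ordinary correlation test when the observations are dyads that share nodes.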
To test H2, we draw on descriptive statistical analyses including Pearson’s chi-squared tests to investigate differences in individual practice adoption rates between farm types and evaluate our MVP coefficient estimates to understand the predictive power of key farm operation characteristics of interest (crop type, farm size, irrigation system and water source), while controlling for all other farmer behavior and demographic variables, as well as interdependency across practices. We then qualitatively evaluate differences in practice portfolios across different operation types by looking at differences in the co-occurrence practice networks. We highlight results for practice portfolio differences across farm types with different irrigation systems, but additional side-by-side practice portfolio comparisons between other operation types are included in the Online Appendix Figures A3–A4.
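As an illustration of the descriptive comparison used for H2, a Pearson's chi-squared test on a 2 × 2 table of adoption by farm type can be computed directly. The counts below are hypothetical and the function name is an assumption; the sketch applies no continuity correction and uses the one-degree-of-freedom case, for which the p-value has a closed form.

```python
import math


def chi_squared_2x2(table):
    """Pearson's chi-squared statistic and p-value (df = 1) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # Survival function of a chi-squared variable with one degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: rows = pressurized, gravity-fed; columns = adopters, non-adopters.
table = [[60, 40], [30, 70]]
stat, p_value = chi_squared_2x2(table)
print(f"chi-squared = {stat:.2f}, p = {p_value:.5f}")
```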
Descriptive statistics and data visualization were carried out in R Statistical Software Version 3.5.3; multivariate modelling was conducted in Stata 16. All model code is linked in the Online Appendix.