Introduction

Enhancing farm productivity and transitioning to market-oriented production require access to markets. However, resource-poor smallholder farm households in developing countries face numerous market failures and barriers, especially restricted market access due to high transaction costs (Ouma et al. 2010), resulting in low market participation (de Janvry et al. 1991). These constraints are exacerbated in low-income countries by inadequate infrastructure, limited access to credit and information, market distortions and weak institutions that increase transaction costs and limit production and marketing decisions of farm households (Barrett 2008; Okoye et al. 2016; Ouma et al. 2010). Key et al. (2000) identify high transaction costs related to search and information, transportation, bargaining, monitoring, and contract enforcement as major constraints to engaging in market activities.

Farm households in Afghanistan face enormous challenges in accessing markets (World Bank 2014). Most Afghan farm households operate small-scale subsistence farming, lack adequate institutional and physical infrastructure to support market participation, and are vulnerable to environmental factors like droughts, floods and natural disasters. While agriculture is the major sector for employment and GDP, productivity and market participation are low (World Bank 2014; Zanello et al. 2019; Ahmadzai 2022; Safi 2023). Agricultural policy in the 2000s recognized the role of market-led development to improve rural incomes and livelihoods (World Bank 2014; Government of Islamic Republic of Afghanistan 2016). Economic performance was encouraging for two decades from 2000, with military spending and international aid providing a large boost to the economy and supporting poverty reduction (Floreani et al. 2021), although improvements in agriculture were modest. The country faces huge challenges following the Taliban takeover in August 2021, with severe economic contraction, food insecurity and a humanitarian crisis.

By considering the experience of Afghanistan in the 2000s, this paper contributes to the literature on improving market access for smallholder farmers in poor, fragile environments by providing evidence on household behaviour in relation to market participation, focussing on how reducing transaction costs can boost agricultural productivity, increase market efficiency, and ultimately contribute to economic growth and poverty alleviation. Central to the analysis is testing for separability of household production and consumption decisions and investigating if reducing transaction costs improves access to markets and increases market participation. Previous studies that focused on transaction costs (Winter-Nelson and Temu 2005; Alene et al. 2008; Liverpool-Tasie 2014) have not addressed the potential endogeneity problem due to unobserved factors that are simultaneously associated with access to transportation equipment and information and communication technology (used as proxies for transaction costs) and household marketing decisions. A contribution of the current study is to allow for endogeneity in transaction costs.

Examining household behaviour under imperfect market conditions offers insights into the diverse strategies households employ to mitigate the costs arising from market failures (Vakis et al. 2004). Drawing on Benjamin (1992); LaFave and Thomas (2016); and Dillon and Barrett (2017), the analysis tests the hypothesis of separability. Assuming that all current and future markets are complete, with households taking prices as given, household production and consumption decisions are separable if production choices are independent of household consumption preferences. The separability assumption is tested by the joint significance of household size and composition in the household labour demand model. If significant, separability is rejected and this can be interpreted as evidence of market failures (Benjamin 1992; Dillon and Barrett 2017).

Utilizing three waves of repeated cross-sectional data from the Afghanistan Living Conditions Survey (ALCS), the analysis incorporates time-fixed effects through wave dummy variables and location-fixed effects through district dummies. The potential endogeneity problem in transaction cost variables is addressed by employing instrumental variables. Results suggest that ownership of or access to information and communication technology (ICT) and transport equipment, acting as proxies for transaction costs, facilitates increased market participation by reducing search, information, transportation, bargaining, monitoring, and contract enforcement costs.

Literature review

In smallholder agriculture, decisions are made in a complex environment where production is carried out by households that both demand and supply labour. Under complete and competitive markets, these households exchange (hire in and hire out) the required labour freely to maximize profits. In this case, households are profit-maximizers and labour employed is independent from consumption decisions. Household preferences and labour endowment should not affect hired labour for production. Separability implies that households first make production and hiring choices, then make consumption choices conditional on income from production (Benjamin 1992; Bowlus and Sicular 2003; De Janvry and Sadoulet 2006; Le 2010; LaFave and Thomas 2016; Dillon and Barrett 2017). Non-separability implies production and consumption decisions are not independent, suggesting imperfect markets and households allocate resources to make up for market failures.

Market failures result from legal restrictions, weak enforcement of contracts, high transaction costs, and poor access to infrastructure, and can be distinguished in three types (de Janvry et al. 1991; Dillon and Barrett 2017). The first is when exchange of goods is legally prohibited or rendered infeasible by some non-market force; markets are completely missing. The second arises when markets are functional but imperfect; exchange takes place at non-competitive (or not market clearing) prices, potentially due to high transaction costs. The third situation may occur when markets exist and operate at the competitive and market-clearing prices, but welfare outcomes for households are sub-optimal, so intervention is required.

Lack of access to market information and high transaction costs may be linked to potential market failures or dysfunctional markets. Holloway et al. (2000) distinguish transaction costs between tangible (transportation, communication, legal) and intangible (uncertainty, moral hazard, etc.) costs. Key et al. (2000) broadly categorize transaction costs into Fixed Transaction Costs (FTCs) and proportional or Variable Transaction Costs (VTCs); FTCs are invariant to the quantity of an input purchased (such as screening and search or information costs), while VTCs vary with the volume of inputs traded (such as the cost of transportation).

The variable nature of transaction costs presents a challenge for measurement and for assessing their impact on household decisions. Transactions cannot be observed when costs are so high as to prevent exchange from occurring (Alene et al. 2008). Information on transaction costs is hard to collect in a survey, especially if farmers have no access to transportation and ICT equipment, as there would be no paid costs to observe (Key et al. 2000; Alene et al. 2008). It is also difficult to observe actual transaction costs when farm households use their own means of transport (Alene et al. 2008). Thus, the literature employs observable factors that proxy for transaction costs, such as ownership of transport and ICT equipment and distance to/from roads and markets (Winter-Nelson and Temu 2005; Alene et al. 2008; Ouma et al. 2010). Following Winter-Nelson and Temu (2005); Alene et al. (2008); and Liverpool-Tasie (2014), the analysis here uses ownership of transport (bike, motorbike, or vehicles) and communication equipment (radio, TV, mobile phone, and internet services) as measures for FTCs and distance to roads and nearest permanent market as proxies for VTCs.

Conceptual and theoretical framework

The hypothesis of separation in the agriculture household model

Household utility is based on the standard time allocation model to reflect the separation hypothesis, grounded in the generic household model (Singh et al. 1986) as articulated by Benjamin (1992); and later applied by Bowlus and Sicular (2003); Le (2010); LaFave and Thomas (2016); and Dillon and Barrett (2017). Consider a farm household that aims to maximize utility represented by a strictly increasing and concave utility function (1). Utility is derived from preferences over consumption (\(C\)) and leisure (\({L}^{\text{l}}),\) conditional on household preference shifters (\(Z\)) such as household endowments. The household is endowed with a fixed amount of labour (\(\overline{L })\) supplied to farm work (\({L}^{\text{f}})\), to produce output that can be consumed by the household or sold to the market at the market price (\(p\)), and off-farm work (\({L}^{\text{m}}\)) to receive market wages (\(w\)). Households can hire labour from the market, denoted by (\({L}^{\text{h}}\)), at the market wage \((w)\) and purchase non-labour inputs (\(X\)) such as seeds, fertilizer, and pesticides at the market price of \({p}_{x}\). Total land is denoted by \(A\), which consists of household own land (\(\overline{A }\)) and land rented in (\({A}^{\text{r}}\)) at the rental rate of \(({p}_{a})\).

$${\text{MAX}}_{{C,A,L^{{\text{m}}} ,L^{{\text{h}}} ,X}} U\left( {C,L^{{\text{l}}} |Z} \right)$$
(1)
$$pC \le pF\left( {A,L,X} \right) + wL^{{\text{m}}} - p_{a} A^{{\text{r}}} - wL^{{\text{h}}} - p_{x} X$$
(2)
$$0 \le L^{{\text{m}}} \le L^{{\text{M}}}$$
(3)
$$L \equiv L^{{\text{f}}} + L^{{\text{h}}}$$
(4)
$$\overline{L} \ge L^{{\text{f}}} + L^{{\text{m}}} + L^{{\text{l}}}$$
(5)
$$A = \overline{A} + A^{{\text{r}}}$$
(6)
$$L^{{\text{l}}} ,L^{{\text{f}}} ,L^{{\text{h}}} ,L^{{\text{m}}} ,C,A,X \ge 0$$
(7)

Market imperfections are introduced as upper and lower constraints on market labour: \(0\le {L}^{\text{m}}\le {L}^{\text{M}}\) where \({L}^{\text{M}}\) is the maximum number of hours a farm household can work in the labour market to earn wages. The farm household faces imperfections if either the lower constraint or the upper constraint is binding (\({L}^{\text{m}}=0\) or \({L}^{\text{m}}={L}^{\text{M}}\)). Then the hypothesis of separation fails. However, the farmer faces no imperfection if neither constraint is binding (\(0<{L}^{\text{m}}<{L}^{\text{M}}\)), and household behaviour is consistent with separation (Le 2010).

Based on the Lagrangian function, the first-order condition (FOC) for labour (\(L\)) can be calculated as in Eq. (8) and FOC for \({L}^{\text{m}}\) can be derived as in (9 and 10) depending on the market conditions and separation:

$$w^{*} = pF_{{\text{L}}} \left( {A,L,X} \right)$$
(8)
$$w^{*} = w \;\;{\text{if}}\; 0 < L^{{\text{m}}} < L^{{\text{M}}}$$
(9)
$$w^{*} \ne w \;\;{\text{if}}\;L^{{\text{m}}} = 0 \;{\text{or}}\; L^{{\text{m}}} = L^{{\text{M}}}$$
(10)

where \({w}^{*}\) is:

$$w^{*} = \frac{{U_{{\text{l}}} \left( {pF\left( {A,L,X} \right) + wL^{{\text{m}}} - p_{a} A^{{\text{r}}} - wL^{{\text{h}}} - p_{x} X;Z} \right)}}{{U_{{\text{c}}} \left( {pF\left( {A,L,X} \right) + wL^{{\text{m}}} - p_{a} A^{{\text{r}}} - wL^{{\text{h}}} - p_{x} X;Z} \right)}}$$
(11)

where \({F}_{\text{L}}\) in (8) is the derivative of output with respect to labour and \({w}^{*}\) is the shadow wage or opportunity cost of time. If separation holds, the constraints are not binding as in Eq. (9) and therefore \({w}^{*}=w\). Plugging \(w\) for \({w}^{*}\) in Eq. (8) gives \(w=p{F}_{\text{L}}(A, L, X)\) where \(Z\) does not appear, implying that the choice of labour does not depend on the preference shifters (e.g. household endowments of labour). In this case, the household hires in labour or supplies labour to the market, and exchanges other inputs at exogenous, market-clearing prices, so that it allocates labour to maximize profits first and consumption choices are conditional on the profit from production. In the case of non-separation, where \({w}^{*}\ne w\) because a labour market constraint is binding (i.e. \({L}^{\text{m}}=0\) or \({L}^{\text{m}}={L}^{\text{M}}\)), \(L\) can be derived by substituting Eq. (8) into (11) such that:

$$w^{*} \ne w \Rightarrow pF_{{\text{L}}} \left( {A,L,X} \right) = \frac{{U_{{\text{l}}} \left( {pF\left( {A,L,X} \right) + wL^{{\text{m}}} - p_{a} A^{{\text{r}}} - wL^{{\text{h}}} - p_{x} X;Z} \right)}}{{U_{{\text{c}}} \left( {pF\left( {A,L,X} \right) + wL^{{\text{m}}} - p_{a} A^{{\text{r}}} - wL^{{\text{h}}} - p_{x} X;Z} \right)}}$$
(12)

Preference shifters (\(Z\)) do appear in Eq. (12), meaning that labour allocation in the first stage of production is affected by household endowments; thus, production and consumption are jointly determined.
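The contrast between Eqs. (8)-(9) and Eq. (12) can be illustrated numerically. The following is a minimal sketch, not the model estimated in this paper: it assumes a Cobb-Douglas production function and log utility over own output and leisure, and all function names and parameter values are hypothetical. With a working labour market, labour demand solves \(w=p{F}_{\text{L}}\) and ignores the household endowment; with no labour market at all, farm labour rises with the endowment.

```python
def separable_labour_demand(p, w, land, inputs, a=0.3, b=0.5, c=0.2):
    """Labour L solving w = p * F_L for F = land**a * L**b * inputs**c.

    The Cobb-Douglas form and the exponents are illustrative assumptions.
    """
    return (p * b * land**a * inputs**c / w) ** (1.0 / (1.0 - b))


def autarky_farm_labour(endowment, b=0.5, theta=1.0):
    """Farm labour when the household can neither hire nor sell labour.

    Assumes the household maximises b*ln(L) + theta*ln(endowment - L),
    i.e. log utility over own output and leisure. The first-order
    condition b/L = theta/(endowment - L) gives the closed form below.
    """
    return b * endowment / (b + theta)


# Separation: the labour endowment is not an argument of labour demand.
l_sep = separable_labour_demand(p=1.0, w=1.0, land=1.0, inputs=1.0)

# Non-separation: doubling the endowment doubles farm labour.
l_small = autarky_farm_labour(endowment=6.0)
l_large = autarky_farm_labour(endowment=12.0)
```

In this toy case the endowment plays the role of a preference shifter in \(Z\): it is absent from the separable solution but drives labour use under autarky, which is the pattern the reduced form test looks for.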

There are two sets of tests to assess the above relationship. The first set, implemented by Benjamin (1992); Bowlus and Sicular (2003) and Le (2010); and more recently by LaFave and Thomas (2016) and Dillon and Barrett (2017), involves a reduced form approach that tests whether variables that affect the consumption decision (preference shifters denoted by \(Z\)) also affect labour allocation decisions in production. The second set, implemented by Jacoby (1993); Abdulai and Regmi (2000); and Grimard (2000), involves a structural approach testing the relationship between \(w\) and \({w}^{*}\). In the latter approach, since \({w}^{*}\) cannot be observed, it should be estimated using a production function.

As the second approach involves estimation of a production function, questions arise with regard to choosing the correct functional form and allowing for endogeneity (Le 2010). The reduced form approach is chosen to avoid estimating the production function, but implies that rejection of separation may not be directly interpreted as a test for market failure in a specific input market; rejecting separation can indicate failure of multiple markets as relative prices of inputs or outputs (not absolute prices) may generate distortions resulting in market failure (Dillon and Barrett 2017). Moreover, separation behaviour may not mean that complete markets exist, as it may be the result of household decisions to allocate resources to compensate for missing markets (LaFave and Thomas 2016).

Market participation under transaction costs

Following Key et al. (2000); Winter-Nelson and Temu (2005); Alene et al. (2008); and Ricker-Gilbert et al. (2011), market participation by farm households can be modelled as a two-step decision process: 1) whether to purchase inputs from the market and 2) the extent or level of expenditures on inputs. To accommodate transaction costs explicitly, let \({\text{VTC}}_{\text{o}}\) and \({\text{VTC}}_{\text{i}}\) denote variable transaction costs per unit of output and input, respectively, so that the adjusted output price becomes \({p}_{q}^{{^{\prime}}}=({p}_{q}-{\text{VTC}}_{o})\), a downward adjustment, and the adjusted input price becomes \({w}_{\text{i}}^{{^{\prime}}}=\left({w}_{\text{i}}+{\text{VTC}}_{\text{i}}\right)\), an upward adjustment (increase in price due to VTCs).

Households market surplus produce, assumed to be equal to total output produced less total output consumed (\({Q}_{\text{i}}^{\text{s}}={Q}_{\text{i}}-{Q}_{\text{i}}^{0})\), and purchase required inputs from the market of an amount equal to total inputs minus own inputs (\({X}_{\text{i}}^{b}={X}_{\text{i}}-{X}_{\text{i}}^{0})\). For purchased inputs such as certified seed, chemical fertilizer, and pesticides, the household relies entirely on the market (i.e. \({X}_{\text{i}}^{0}=0\), so \({X}_{\text{i}}^{b}={X}_{\text{i}}\)), whereas in the case of labour total input is the sum of hired and own labour, hence \({X}_{\text{i}}^{b}={X}_{\text{i}}-{X}_{\text{i}}^{0}\) (Goetz 1992). Let \({\text{FTC}}_{\text{o}}\) be the fixed transaction cost for output sold to the market, incurred when the participation indicator \({m}_{q}\) equals one, and \({\text{FTC}}_{\text{i}}\) the fixed transaction cost for inputs purchased from markets, incurred when the indicator \({m}_{\text{i}}\) equals one, so the objective function in (1) can be redefined:

$${\text{MAX}}\left( U \right) = U\left( {p_{q} Q^{{\text{c}}} + \left( {p_{q} - {\text{VTC}}_{{\text{o}}} } \right)Q^{{\text{s}}} - w_{{\text{i}}} X_{{\text{i}}}^{0} - \left( {w_{{\text{i}}} + {\text{VTC}}_{{\text{i}}} } \right)X_{{\text{i}}}^{b} - {\text{FTC}}_{{\text{o}}} \left( {m_{q} } \right) - {\text{FTC}}_{{\text{i}}} \left( {m_{{\text{i}}} } \right)} \right)$$
(13)

where

$$m_{q} \; = \;\left\{ {\begin{array}{*{20}l} 1 \hfill & {{\text{if}}\;Q^{{\text{s}}} > 0} \hfill \\ 0 \hfill & {{\text{otherwise}}} \hfill \\ \end{array} } \right.$$
$$m_{{\text{i}}} \; = \;\left\{ {\begin{array}{*{20}l} 1 \hfill & {{\text{if}}\;X_{{\text{i}}}^{b} > 0} \hfill \\ 0 \hfill & {{\text{otherwise}}} \hfill \\ \end{array} } \right.$$

Taking the first-order condition of the objective function yields reduced forms for input market participation and for input demand conditional on participation. Once the fixed entry costs are paid, FTC do not affect the quantity of inputs purchased or the expenditure on them:

$$m_{{i}} = f\left( {p_{q} ,{\text{VTC}}_{{\text{o}}} ,w_{{\text{i}}} ,{\text{VTC}}_{{\text{i}}} , {\text{FTC}}_{{\text{o}}} ,{\text{FTC}}_{{\text{i}}} ;Z} \right)$$
(14)
$$X_{{\text{i}}}^{b} = f\left( {p_{q} ,{\text{VTC}}_{{\text{o}}} ,w_{{\text{i}}} ,{\text{VTC}}_{{\text{i}}} ;Z} \right) \;\;{\text{if}}\;X_{{\text{i}}}^{b} > 0$$
(15)

Equations (14) and (15) represent household input market participation and input demand. While participation is a function of prices, transaction costs and household characteristics, input demand is a function of prices, VTC only and household characteristics; FTC are therefore not included when estimating the demand equation.
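The logic behind Eqs. (14) and (15) can be sketched with a toy household. This is an illustration under stated assumptions rather than the paper's model: output is taken to be \(X^{\alpha}\), the household is treated as a profit maximiser with a single combined fixed cost, and the function name and default values are hypothetical. Variable transaction costs shift the effective prices and therefore the quantity purchased, while the fixed cost only decides whether the household enters the market.

```python
def input_market_choice(p, w, vtc_o, vtc_i, ftc, alpha=0.5):
    """Return (participates, input_quantity) for a toy household.

    Output is X**alpha with 0 < alpha < 1 (an illustrative assumption).
    """
    p_adj = p - vtc_o  # output price net of variable transaction costs
    w_adj = w + vtc_i  # input price gross of variable transaction costs
    if p_adj <= 0:
        return 0, 0.0
    # Interior optimum of p_adj * X**alpha - w_adj * X; ftc does not enter.
    x_star = (alpha * p_adj / w_adj) ** (1.0 / (1.0 - alpha))
    variable_profit = p_adj * x_star**alpha - w_adj * x_star
    # Participation hurdle: enter only if variable profit covers the fixed cost.
    if variable_profit > ftc:
        return 1, x_star
    return 0, 0.0


low_ftc = input_market_choice(p=10.0, w=1.0, vtc_o=2.0, vtc_i=1.0, ftc=1.0)
mid_ftc = input_market_choice(p=10.0, w=1.0, vtc_o=2.0, vtc_i=1.0, ftc=5.0)
high_ftc = input_market_choice(p=10.0, w=1.0, vtc_o=2.0, vtc_i=1.0, ftc=9.0)
high_vtc = input_market_choice(p=10.0, w=1.0, vtc_o=2.0, vtc_i=3.0, ftc=1.0)
```

Raising the fixed cost from 1 to 5 leaves the purchased quantity unchanged, a fixed cost of 9 exceeds the variable profit and switches participation off, and a higher variable cost on inputs reduces the quantity purchased.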

Estimation strategy and identification

To test the hypothesis of separation, I estimate the labour demand equation as follows:

$$\ln L_{ijt} = \alpha_{i} + \beta \ln A_{ijt} + \delta_{0} \ln N_{ijt} + \mathop \sum \limits_{{{\text{s}} = 1}}^{{\text{S}}} \left( {\delta_{{\text{s}}} \frac{{N_{ijt}^{{\text{s}}} }}{{N_{ijt} }}} \right) + \mathop \sum \limits_{k = 1}^{K} \left( {\theta_{k} \ln X_{ijt} } \right) + \mathop \sum \limits_{j = 1}^{J} \left( {\gamma_{j} D_{j} } \right) + W_{t} + \varepsilon_{ijt}$$
(16)

where \({L}_{ijt}\) represents the total labour employed (own and hired) by the \({i}^{th}\) household measured in person-days in district \(j\) at time \(t\), \({A}_{ijt}\) is the total amount of land cultivated, \({N}_{ijt}\) is the size of the \({i}^{th}\) household, \({N}_{ijt}^{\text{s}}\) are household composition or structure variables such as age-sex demographic groups, \({X}_{ijt}\) collects additional control variables such as land quality in district \(j\) at time \(t\), \({D}_{j}\) represents district dummies to control for regional variation, and \({W}_{t}\) represents the year dummies for the repeated cross section; \({\varepsilon }_{ijt}\) is the standard normally distributed error term. The null hypothesis of separation (\({\delta }_{0}={\delta }_{\text{s}}=0)\) states that estimated coefficients on household structure variables (denoted by \({\delta }_{\text{s}}\)) and household size (\({\delta }_{0}\)) are jointly indistinguishable from zero. Rejection of the null hypothesis implies non-separability.
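The joint test of \({\delta }_{0}={\delta }_{\text{s}}=0\) can be sketched on simulated data. The example below is only an illustration: it uses a textbook F statistic that compares restricted and unrestricted sums of squared residuals under homoskedastic errors, whereas the estimates reported later cluster standard errors at the district level. The data-generating values, variable names, and the single composition share are assumptions, and NumPy is assumed to be available.

```python
import numpy as np


def ssr(y, X):
    """Sum of squared OLS residuals from regressing y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def joint_f_stat(y, X, tested_cols):
    """F statistic for H0: the coefficients on tested_cols are all zero."""
    n, k = X.shape
    q = len(tested_cols)
    ssr_u = ssr(y, X)                                  # unrestricted model
    ssr_r = ssr(y, np.delete(X, tested_cols, axis=1))  # restricted model
    return ((ssr_r - ssr_u) / q) / (ssr_u / (n - k))


rng = np.random.default_rng(0)
n = 2000
ln_land = rng.normal(0.0, 1.0, n)
ln_size = rng.normal(1.8, 0.4, n)        # log household size
share_prime = rng.uniform(0.2, 0.8, n)   # one composition share
X = np.column_stack([np.ones(n), ln_land, ln_size, share_prime])

# Non-separable world: household size and composition shift labour demand.
y_nonsep = 1.0 + 0.4 * ln_land + 0.5 * ln_size + 0.3 * share_prime + rng.normal(0, 1, n)
# Separable world: only land matters.
y_sep = 1.0 + 0.4 * ln_land + rng.normal(0, 1, n)

f_nonsep = joint_f_stat(y_nonsep, X, [2, 3])
f_sep = joint_f_stat(y_sep, X, [2, 3])
```

A large `f_nonsep` relative to `f_sep` mirrors the logic of the test: the demographic variables carry explanatory power for labour demand only when separation fails.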

Following Dillon and Barrett (2017), I define four sex-age demographic groups (\({N}_{ijt}^{\text{s}})\). The prime age-sex group comprises male and female household members aged between 14 and 64 years, and the elderly group comprises members aged above 64. Members aged below 14 are excluded to avoid mixing children and adults (reducing concerns about productivity differences) and to mitigate concerns of potential endogeneity for household size. Children’s contribution to labour demand (total labour days–the dependent variable) is accounted for, assuming that each child day is equivalent to half of an adult work day.

I employ a lognormal double hurdle (LDH) to estimate the extensive margin decision to participate in factor markets and then examine the intensive margin decision conditional on participation. The data show that a significant proportion of households do not participate in the market (non-participation might be due to market barriers), so a corner solution may better fit the data compared to sample selection assuming non-participation is a result of incidental truncation. Assuming market participation is the outcome of a sequential two-step decision process, the lognormal double hurdle (LDH) model is more flexible compared to the restricted Tobit model as it allows different (or the same) factors to affect the first- and second-stage decisions (participation and extent), unlike the Heckman selection model that requires a strict exclusion restriction.
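To make the two-part structure concrete, the per-observation log-likelihood of a lognormal hurdle model with independent hurdles can be written in a few lines. This is a sketch of the general form given in Wooldridge (2010), not the estimation code used here; `xg` and `xb` stand for already-computed linear indices of the participation and expenditure equations, and all function names are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_logpdf(z):
    """Log of the standard normal density."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def ldh_loglik_obs(y, xg, xb, sigma):
    """Log-likelihood contribution of one household.

    xg: participation index (hurdle 1, probit).
    xb: mean of log(y) among participants (hurdle 2).
    sigma: standard deviation of log(y) among participants.
    """
    p_participate = norm_cdf(xg)
    if y <= 0:
        return math.log(1.0 - p_participate)
    z = (math.log(y) - xb) / sigma
    return math.log(p_participate) + norm_logpdf(z) - math.log(sigma) - math.log(y)


def ldh_expected_value(xg, xb, sigma):
    """Unconditional mean E(y | x) = Phi(xg) * exp(xb + sigma^2 / 2)."""
    return norm_cdf(xg) * math.exp(xb + 0.5 * sigma**2)


ll_zero = ldh_loglik_obs(y=0.0, xg=0.0, xb=1.0, sigma=1.0)
ll_positive = ldh_loglik_obs(y=1.0, xg=0.0, xb=0.0, sigma=1.0)
```

Because the two indices enter separately, a covariate may raise the probability of participating without changing expenditure among participants, which is the flexibility relative to the Tobit model noted above.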

Previous studies have shown that ownership of information and communication technologies (ICT) and transport equipment may be endogenous to labour demand and household market participation decisions due to unobserved household traits such as innate abilities and motivations (Leng et al. 2020; Ma et al. 2018). Chowdhury (2006) stated that the use of the telephone is possibly correlated with unobservable characteristics that may also be correlated with market participation. It can be hard to anticipate a priori the direction of the bias. However, it is plausible to assume that households that participate in markets are more likely to own or use ICT and transport equipment, so one would expect the coefficient estimate of the ICT and transport equipment variables to be biased upward. Potential endogeneity in ownership of ICT and transport equipment is addressed using the instrumental variable (IV) control function (CF) approach.

To identify a causal relationship between variables proxying for transaction costs and market participation, external instruments that affect the ownership of ICT and transport equipment but are unlikely to directly affect market participation decisions are used. Following Deng et al. (2019); Nie et al. (2020); and Min et al. (2020), the instruments are the leave-out means (LoM) for both ICT and transport equipment, defined as the proportion of households in the enumeration area (village level communities referred to as Shura) with access to ICT/transport equipment excluding the household under consideration.

The rationale for using these instruments is grounded in the theory that peer behaviour is an important driver of an individual’s behaviour (An 2015; Sampson and Perry 2019a, 2019b); farm households observe and copy examples from selected neighbours, friends or villagers. Hence, the ownership or extent of use of ICT and transport equipment reflects social interactions between households in the local Shura and households with similar characteristics are likely to adopt similar livelihood strategies. A household in a Shura where farmers have greater access to ICT and transportation is more likely to own such assets. However, the fact that other farmers own or have access to ICT and/or transport equipment will not in itself affect the decision of an individual farmer to participate in agricultural markets.
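The leave-out mean described above can be computed in a single pass over the data. The sketch below is illustrative only: the function name, the toy enumeration-area labels, and the choice to return `None` for an area with a single household are assumptions, not details taken from the ALCS processing.

```python
from collections import defaultdict


def leave_out_means(area_ids, owns):
    """Share of other households in the same area that own the asset.

    area_ids: enumeration-area identifier for each household.
    owns: 1 if the household owns the asset, 0 otherwise.
    Returns None where the area contains no other household.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for area, own in zip(area_ids, owns):
        totals[area] += own
        counts[area] += 1

    result = []
    for area, own in zip(area_ids, owns):
        n_others = counts[area] - 1
        if n_others == 0:
            result.append(None)
        else:
            # Subtract the household's own value before averaging.
            result.append((totals[area] - own) / n_others)
    return result


areas = ["A", "A", "A", "B", "B", "C"]
ict = [1, 0, 1, 1, 1, 0]
lom = leave_out_means(areas, ict)
```

Excluding the household's own ownership status is what keeps the instrument from being mechanically correlated with the endogenous variable it is meant to predict.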

The CF approach regresses the endogenous variable on the instruments (IVs) in the reduced form and generalized residuals are used as an independent variable in the structural model in addition to the endogenous variables (Petrin and Train 2010; Wooldridge 2015). Although the CF approach is less robust than IV estimation for linear models, it produces more efficient results for complicated models involving nonlinear equations for the endogenous variable and can be more efficient for weak instruments (Tadesse and Bahiigwa 2015; Wooldridge 2007). This estimation strategy has been applied by Winter-Nelson and Temu (2005); Ricker-Gilbert et al. (2011); Liverpool-Tasie (2014); Tadesse and Bahiigwa (2015); and Ragasa and Mazunda (2018). The first stage of the CF estimates the reduced form Eq. (17) using a Probit model by regressing the binary endogenous variables on controls and instruments:

$$Pr\left( {T_{it} = 1{|}M_{it} } \right) = \alpha + Z_{it} \gamma + M_{it} \varphi + v_{{\text{i}}}$$
(17)

where \({T}_{it}\) represents the endogenous variables (ownership of ICT or transport equipment = 1, 0 otherwise) approximating transaction costs for household \(i\) at time \(t\), \({M}_{it}\) represents the vector of explanatory variables that affect \({T}_{it}\), \({Z}_{it}\) represents IVs that are not included as explanatory variables in the structural model, and \({v}_{\text{i}}\) represents the error term with the standard normal distribution \(N\left(0, 1\right)\) assumed by the Probit model. Following Wooldridge (2015), the generalized residuals from (17) can be obtained as:

$$\widehat{{gr_{i} }} = T_{i} \lambda \left( {Z_{i} \gamma } \right) - \left( {1 - T_{i} } \right)\lambda \left( { - Z_{i} \gamma } \right) \; \; i = 1,2,3, \ldots ,N$$
(18)

where \(\widehat{{gr}_{i}}\) is the generalized residual and \(\lambda =\phi (.)/\Phi (.)\) is the inverse Mills ratio. In the second step of the CF approach, \(\widehat{g{r}_{i}}\) is included as an additional regressor in the structural models estimated by the LDH. The decision to participate is governed by the Probit in hurdle 1, and the extent of expenditure is estimated by the truncated model. The LDH model in the second hurdle assumes that \(\text{log}(y)\) follows a normal distribution for \(y>0\). Following the general form of the LDH model by Wooldridge (2010), the empirical model takes the following form:

$$Pr\left( {y_{1it}^{*} = 1{|}x_{it} } \right) = \alpha_{i} + \delta \widehat{gr}_{i} + \theta T_{it} + \beta X_{1it} + W_{t} + D_{j} + u_{1i} \quad {\text{(Participation)}}$$
(19)
$$y_{2it}^{*} = {\text{exp}}\left( {\alpha_{i} + \theta T_{it} + \beta X_{2it} + W_{t} + D_{j} + u_{2i} } \right) \quad {\text{(Extent of expenditures)}}$$
(20)
$$y_{{it}} = \left\{ {\begin{array}{*{20}l} {{\text{exp}}\left( {\alpha _{i} + \theta T_{{it}} + \beta X_{{2it}} + W_{t} + D_{j} + u_{{2i}} } \right)} \hfill & {{\text{if}}\;y_{{1it}}^{*} > 0\;{\text{and}}\;y_{{2it}}^{*} > 0} \hfill \\ 0 \hfill & {{\text{otherwise}}} \hfill \\ \end{array} } \right.$$

where \({y}_{1it}^{*}\) and \({y}_{2it}^{*}\) are latent variables denoting participation in the input market (participation = 1, and 0 otherwise) and expenditures on inputs purchased (fertilizers, seed, and tractor rental), respectively, \({y}_{it}\) represents the observed dependent variable (expenditure on \({i}^{th}\) input by the household), \({X}_{1it}\) and \({X}_{2it}\) are the control variables in each hurdle, \({D}_{j}\) and \({W}_{t}\) are as defined under Eq. (16), and \({u}_{1i}\) and \({u}_{2i}\) are the error terms. If the coefficient on (\(\widehat{gr}\)) is significantly different from zero in the structural model, then ownership of ICT and transport equipment is endogenous in a farmer’s decision to purchase inputs. Participation and extent of expenditure are assumed to be independent of each other (Hsu and Liu 2008; Wooldridge 2010).
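The generalized residual in Eq. (18) is straightforward to compute once the first-stage Probit index is available. The sketch below takes that fitted index as given (the argument `index` is a stand-in for the full first-stage linear predictor, written \({Z}_{i}\gamma\) in Eq. (18)); the function names are hypothetical and no estimation is performed.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills(z):
    """Inverse Mills ratio, lambda(z) = phi(z) / Phi(z)."""
    return norm_pdf(z) / norm_cdf(z)


def generalized_residual(t, index):
    """Probit generalized residual for outcome t (0 or 1) and fitted index."""
    return t * inverse_mills(index) - (1 - t) * inverse_mills(-index)


gr_owner = generalized_residual(1, 0.0)
gr_non_owner = generalized_residual(0, 0.0)

# Averaged over the Probit outcome probabilities, the residual has mean zero.
idx = 0.7
mean_gr = (norm_cdf(idx) * generalized_residual(1, idx)
           + (1.0 - norm_cdf(idx)) * generalized_residual(0, idx))
```

The mean-zero property is what allows the residual to act as a control for the unobserved component of ownership when it is added to Eq. (19).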

Data and description of variables

This study uses three waves of repeated cross-sectional data from the Afghanistan Living Conditions Survey (ALCS) conducted by the Central Statistics Organization (CSO) in 2011/12, 2013/14, and 2016/17. The surveys include both quantitative and in-depth qualitative information on several key indicators including farming and livestock production. Each survey covers the country with 35 strata for the 34 provinces and the nomadic (Kuchi) population. Each survey covered roughly 157,262 individuals in 20,786 households; the pooled sample for the three waves covers over 61,000 households, but only about half reported any engagement in farming, reducing the analytical sample to about 30,000 households. Accounting for missing values on key variables, the usable sample is 21,160 households. Summary statistics are presented in Table 1 (the full list of variables is in Table A1 in Appendix). Total labour is measured in person-days over the last month, as collected directly in the survey. While some households may have been interviewed in more than one survey, they cannot be identified.

Table 1 Summary statistics of the key variables used in the analysis

Roughly two-thirds of the farmers in the pooled sample purchase fertilizers and chemicals, whereas about 56% hire a tractor for ploughing or other farming activities (Table 1). A lower percentage (about 25%) hire labour, perhaps because demand is seasonal (hiring increases during peak seasons such as planting or harvest seasons). These figures are similar to other countries: Liverpool-Tasie (2014) reported about 70% participation in fertilizer markets in the Kano state of Nigeria; Dillon and Barrett (2017) reported that on average 30% of households hire labour in Ethiopia, 40% in Malawi, 49% in Niger, 30% in Tanzania, and 45% in Uganda. Overall, average expenditures on inputs are higher for the most recent survey year, which may indicate higher application of inputs or may simply reflect inflation. The econometric estimation controls for time fixed effects by including dummies for the survey years to allow for unobserved time-variant factors.

Results and discussion

The econometric results for the hypothesis of separation are presented in Sect. "Testing the hypothesis of separation", and results for market participation in Sect. "Transaction costs and market participation".

Testing the hypothesis of separation

Before formally testing separability, kernel-weighted regressions are used to illustrate patterns and the linear relationship between household labour demand and labour endowments. Figure 1 depicts local polynomial regressions of household labour demand categorized by labour type (household own labour, hired labour, and total labour) in relation to household labour endowments (household size). Using the default kernel function (Epanechnikov) and 95% confidence interval bands, the smooth polynomial trend in Fig. 1a indicates a significant increase in household own labour employed on the farm as household size increases, suggesting that larger households contribute more labour to farm work.

Fig. 1 Local polynomial regression of (a) HH own, (b) hired, and (c) total labour on HH size. Source: Author’s construction from the ALCS data

Labour demand showed greater variability when household size exceeded 20 persons; because there were too few such households, the top of the sample is trimmed for clarity. Labour hiring also increases with household size, as shown in Fig. 1b, albeit at a lower level and pace than household own labour. Figure 1c illustrates total labour demand (combining own and hired labour) and confirms that the overall demand for labour increases in household size.

If the hypothesis of separation holds, we may not observe a distinct relationship between total labour demand and household size (Dillon and Barrett 2017). Although this relationship does not constitute a formal rejection of the separation hypothesis, because underlying results are not conditioned on other covariates, it does reveal a marked pattern and a strong linear connection between household endowments and labour utilization on the farm.
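The kernel-weighted smoothing behind Fig. 1 can be sketched as follows. The polynomial degree and bandwidth used for the figure are not stated in the text, so the example assumes a local linear fit (degree one) with an Epanechnikov kernel and a user-supplied bandwidth; the function names and the toy data are hypothetical.

```python
def epanechnikov(u):
    """Epanechnikov kernel, K(u) = 0.75 * (1 - u^2) for |u| <= 1."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def local_linear(x0, xs, ys, bandwidth):
    """Kernel-weighted local linear estimate of E[y | x = x0].

    Fits a weighted least-squares line in (x - x0) and returns its
    intercept. Returns None if the fit is not identified at x0.
    """
    w = [epanechnikov((x - x0) / bandwidth) for x in xs]
    s0 = sum(w)
    s1 = sum(wi * (x - x0) for wi, x in zip(w, xs))
    s2 = sum(wi * (x - x0) ** 2 for wi, x in zip(w, xs))
    t0 = sum(wi * y for wi, y in zip(w, ys))
    t1 = sum(wi * (x - x0) * y for wi, x, y in zip(w, xs, ys))
    denom = s0 * s2 - s1 * s1
    if denom == 0:
        return None
    return (s2 * t0 - s1 * t1) / denom


# Toy data: labour days rise linearly with household size.
sizes = [float(n) for n in range(1, 21)]
labour_days = [2.0 + 3.0 * n for n in sizes]
fit_at_10 = local_linear(10.0, sizes, labour_days, bandwidth=4.0)
```

On exactly linear data the local linear fit reproduces the line, which makes the toy case easy to check; with survey data the estimate would trace out whatever shape the conditional mean takes.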

Table 2 presents the results of the classic Benjamin (1992) test for core variables using OLS estimation of Eq. (16); detailed results including all variables are available in Table A2 in Appendix. The first column presents the simplest specification, including household preference shifters like household size, composition, and landholding. Columns 2 and 3 include additional covariates like transaction costs, socio-demographic factors, and interaction terms, which are crucial for understanding labour demand within the separation hypothesis framework. All regressions control for district fixed effects (FE), and standard errors are clustered at the district level.

Table 2 Regression results for testing the hypothesis of separability

The estimated elasticity of household size (adults aged 14 and older) in relation to total labour days is statistically significant at the 1% level, suggesting a rejection of the separation hypothesis, which assumes that household production and consumption decisions are independent. This elasticity also hints at potential market failures that create dependency on household labour endowments, although it is not a specific test for labour market failures (Dillon and Barrett 2017). The combination of age-sex group shares and household size accounts for composition and scale effects. The test statistic, F (4, 385) = 132.83, with a p-value reported as zero (shown at the bottom of Table 2), rejects the null hypothesis that the coefficients on household size and composition variables are jointly zero, confirming rejection of the separation hypothesis.
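The logic of the joint test can be sketched as a comparison of restricted and unrestricted OLS fits. This is a simplified illustration only: it uses the textbook homoskedastic F statistic, whereas the paper clusters standard errors at the district level and includes district fixed effects, and the function names and column layout are hypothetical.

```python
def ols_ssr(X, y):
    """Sum of squared residuals from an OLS fit of y on X (list of rows)."""
    k = len(X[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan
    # elimination with partial pivoting.
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(X, y))


def joint_f_stat(X_full, y, tested_cols):
    """F statistic for H0: coefficients on tested_cols are all zero.

    X_full: rows of regressors including a constant; tested_cols: set of
    column indices (e.g. household size and composition shares).
    Homoskedastic form only; not the clustered version used in the paper.
    """
    n, k = len(X_full), len(X_full[0])
    X_restricted = [[v for j, v in enumerate(r) if j not in tested_cols]
                    for r in X_full]
    ssr_u = ols_ssr(X_full, y)        # unrestricted model
    ssr_r = ols_ssr(X_restricted, y)  # model imposing separability
    q = len(tested_cols)
    return ((ssr_r - ssr_u) / q) / (ssr_u / (n - k))
```

Under separation, dropping the household size and composition variables should not worsen the fit materially, so a large F statistic is evidence against the null.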

Columns 2 and 3 include additional controls, independently and interacted with household size, and the null hypothesis of separation is rejected in all three cases. Results for controls are in Appendix Table A2. Ownership of ICT and transport equipment and distance to the road have significant effects on total labour days employed on the farm. The insignificance of the interaction terms implies that rejection of the separation hypothesis is not driven by these variables. The results align with previous studies using the same approach (Dillon and Barrett 2017; LaFave and Thomas 2016; Grimard 2000) for settings including Ethiopia, Malawi, Niger, Tanzania, Uganda, Côte d’Ivoire and Java (Indonesia). A variant estimated using the LoM instruments and 2SLS to correct for potential endogeneity in ICT and transport equipment (column 4) also rejects the separation hypothesis.

Transaction costs and market participation

To save space for the main results from the structural models in Eqs. (19) and (20) estimated using LDH, results of the reduced form regression based on Eq. (17) are presented in Appendix Table A3. The IVs exhibit statistically significant correlations with the endogenous variables. As the reduced form Probit is nonlinear, traditional methods for testing the strength of IVs may not apply, so studies rely on the partial correlation between the IVs and the endogenous variables in the reduced form estimation (Ricker-Gilbert et al. 2011; Liverpool-Tasie 2014; Amankwah et al. 2016). The instrumental variables are significant at the 1% level in the first stage (Table A3), indicating a strong partial correlation with the endogenous variables. Following Di Falco et al. (2011), the admissibility of the exclusion restriction is indirectly tested by performing a simple falsification test, noting that a variable can be used as a valid instrument if it affects the treatment variables in the first hurdle of the LDH model (19), but does not affect the outcome variable in the second stage (20). The falsification test is conducted by regressing the treatment and outcome variables on the instruments. Results reported in Table A4 support the validity of the instruments in all models, with the instruments significantly affecting the treatment variables in the first stage but not the outcome variables.

Results of the structural equations estimated by the LDH model, with a Probit of the decision to participate in input markets and truncated regression for the extent of expenditures, are reported in Table 3 (for the core variables only, detailed results are reported in Table A5). Endogeneity is detected if the generalized residuals are statistically significant in the structural regressions. The generalized residuals for ICT are significant in the first-stage regression for fertilizer and tractor rental markets, and the generalized residual for the ownership of transport equipment is significant in the first stage for fertilizers and seeds, confirming the presence of endogeneity.
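For readers unfamiliar with the control function step, the generalized residual from a reduced form Probit can be computed as in the sketch below. It uses the standard Probit generalized residual formula; the function names and the idea of passing in a fitted linear index are illustrative assumptions, not the author's code.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def generalized_residual(d, index):
    """Probit generalized residual.

    d: observed binary endogenous variable (e.g. 1 if the household owns
       ICT equipment); index: fitted linear index from the reduced form
       Probit (hypothetical input, estimated elsewhere).
    """
    pdf, cdf = norm_pdf(index), norm_cdf(index)
    if d == 1:
        return pdf / cdf           # inverse Mills ratio for d = 1
    return -pdf / (1.0 - cdf)      # negative counterpart for d = 0
```

The residual is then added as an extra regressor in each structural equation; a statistically significant coefficient on it indicates endogeneity, as described above.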

Table 3 Marginal effects from the LDH of household’s participation and intensive margin of expenditures conditional on participation

Results show that ownership of ICT and transport equipment significantly increases the likelihood of household participation in input markets. Ownership of ICT increases the likelihood of participation in fertilizer and tractor rental markets by 27 and 15 percentage points, respectively, suggesting that access to ICT equipment helps reduce fixed transaction (search and information) costs and facilitates entry into markets. Similar findings are reported by Randela et al. (2008), who concluded that the more marketing information is available to households, the lower the transaction costs associated with marketing decisions, resulting in a higher rate of market participation. Chowdhury (2006) finds a strong connection between household use of mobile phones and marketing decisions and suggests that a reduction in information cost through access to ICT equipment may improve market functionality. Alene et al. (2008) and Ouma et al. (2010) find that access to communication assets has positive but insignificant effects on market participation in Kenya and Central Africa (Rwanda and Burundi), arguing that communication assets are perhaps less useful in facilitating transactions if there is no viable market information service. Tadesse and Bahiigwa (2015) find mixed results: while ownership of mobile phones may be useful for some farmers, in other areas mobile phones do not appear to be an important channel for accessing price information.

Although ownership of transport equipment is not a significant determinant in household decisions to participate in fertilizer markets, the likelihood of hiring a tractor and participating in seed markets significantly increases with ownership of transport equipment by about 3.1 and 13.6 percentage points, respectively (Table 3). Access to or ownership of transportation equipment may allow rural households to transport inputs at lower cost. The findings agree with Alene et al. (2008) for Kenya, who concluded that ownership of transportation equipment increases market participation.

While the distance to roads does not have a significant impact on participation decisions for seed and fertilizers, it significantly reduces the likelihood of households hiring a tractor (tractors are typically hired locally from within the community). The majority of households may reuse their own seeds from the previous season or obtain seeds locally from neighbouring farmers. Distance to roads is not significant in the second stage for extent of expenditures, except for fertilizer (households spend about 3.6% less when the distance to the road increases by one per cent). The variable measuring the time to reach the nearest market is insignificant in both stages for all inputs. Controlling for district fixed effects may render district-level variables, such as time taken to reach markets, insignificant, especially since most local markets are located in the district centres. Previous studies assessing farm household marketing decisions established a negative relationship between poor road or market accessibility and market participation (Ricker-Gilbert et al. 2011; Winter-Nelson and Temu 2005).

Standard variables, such as household characteristics, have a significant influence on market participation; farm size and crop diversification (the inverse Herfindahl–Hirschman Index, ranging from 0 to 1 with zero indicating no and one indicating full diversification) show a positive and significant association. Diversified farms often cultivate high-value cash crops, produced for market (instead of subsistence) to overcome liquidity constraints.
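A minimal sketch of the diversification measure, assuming that the "inverse" index is computed as one minus the Herfindahl–Hirschman Index of crop area shares (the use of area shares and the function name are assumptions; the text does not state the exact construction):

```python
def diversification_index(crop_areas):
    """One minus the Herfindahl-Hirschman Index of crop area shares.

    crop_areas: area cultivated under each crop (hypothetical input).
    Returns 0 for a single-crop farm and approaches 1 as land is spread
    evenly over many crops.
    """
    total = sum(crop_areas)
    if total <= 0:
        raise ValueError("total cultivated area must be positive")
    hhi = sum((a / total) ** 2 for a in crop_areas)
    return 1.0 - hhi
```

For example, a farm growing only wheat scores 0, while a farm splitting its land equally between two crops scores 0.5 and one splitting it equally among four crops scores 0.75.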

Household size, literacy, and ownership of farming assets (such as tractors, oxen, and the number of livestock) are significant determinants of market participation and extent of expenditures. Ownership of tractors and oxen increases participation in fertilizer and seed markets but has a negative effect on hiring tractors (owners can rely on their own assets). Characteristics related to the type of land and landscape are significant (farms with better land participate more), but price shocks have a negative impact on participation for some inputs.

Robustness of results and model specification

Additional specification tests were conducted to ensure the statistical model best fits the data. The most widely used statistical models for censored data are double hurdle (DH) and standard Tobit models. Since Tobit is nested in DH, I first tested the Cragg-type double hurdle truncated normal model against the standard Tobit using a likelihood ratio (LR) test. The computed statistics of the LR test for each of the three models analysing fertilizer and chemicals, tractor rental, and labour hire reject the null hypothesis at 1% significance, indicating that DH is strictly preferred to the restricted Tobit model (Table B1 in Appendix B).
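For reference, the LR statistic for this comparison is commonly written as follows. This is the textbook formulation, assuming the same \(k\) regressors enter both hurdles; it is not reproduced from Appendix B:

$$\text{LR} = 2\left[ \ln L_{\text{Probit}} + \ln L_{\text{Truncated}} - \ln L_{\text{Tobit}} \right] \sim \chi^{2}_{k}$$

where \(\ln L_{\text{Probit}}\) and \(\ln L_{\text{Truncated}}\) are the maximized log-likelihoods of the two DH tiers, \(\ln L_{\text{Tobit}}\) is that of the restricted Tobit, and \(k\) is the number of restrictions the Tobit imposes on the DH model.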

Because the Cragg-type truncated normal and lognormal double hurdle (LDH) models are non-nested (Hsu and Liu 2008), I use the Vuong (1989) test for non-nested models to test whether the Cragg-type double hurdle model, the LDH or a Heckman sample selection model provides a closer representation of the data. Vuong’s test is a likelihood-ratio-based test that compares non-nested models in terms of the difference in their respective Kullback–Leibler (KL) distance from the (unknown) “true” model. Let the log-likelihood ratio of two competing models (A and B), which estimates this difference in KL distance, be \(\text{LR}\left(A, B\right)=\text{Log }L\left(A\right)-\text{Log }L(B)\); the test statistic can then be calculated as:

$${\text{LZ}} = \frac{{{\text{LR}}\left( {A,B} \right)}}{\sqrt n \omega }$$

where ω denotes the standard deviation of the pointwise log-likelihood ratios and \(n\) is the sample size. Large positive (negative) values of the computed test statistic are taken as evidence in favour of model A (model B); note that model A is the LDH. The null hypothesis states that there is no difference between the two models. The results of the Vuong closeness tests for non-nested models (Table B1 in Appendix B) reveal that LDH is preferred to the Cragg-type double hurdle and Heckman sample selection models in all cases.
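The statistic can be computed from observation-level log-likelihood contributions, as in the minimal sketch below. It takes ω as the standard deviation of the pointwise log-likelihood ratios, as in Vuong's standard formulation; the function name and inputs are hypothetical, and the contributions would come from the fitted models.

```python
import math


def vuong_statistic(loglik_a, loglik_b):
    """Vuong closeness statistic for two non-nested models.

    loglik_a, loglik_b: per-observation log-likelihood contributions of
    model A (e.g. the LDH) and model B, in the same observation order.
    """
    n = len(loglik_a)
    if n != len(loglik_b) or n < 2:
        raise ValueError("need two equal-length series with n >= 2")
    m = [a - b for a, b in zip(loglik_a, loglik_b)]  # pointwise ratios
    lr = sum(m)                                      # LR(A, B)
    mean = lr / n
    omega = math.sqrt(sum((mi - mean) ** 2 for mi in m) / n)
    if omega == 0.0:
        raise ValueError("pointwise log-likelihood ratios do not vary")
    return lr / (math.sqrt(n) * omega)
```

The result is compared with standard normal critical values: values above 1.96 favour model A at the 5% level, values below -1.96 favour model B, and values in between do not discriminate between the two.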

I conducted further supplementary tests to assess the robustness of the main results presented in Table 3 and re-estimated the base models analysing market participation specified in Eqs. (19) and (20) using the Heckman sample selection model (Heckman 1979). The Heckman selection model is particularly useful for correcting potential selectivity bias in the sample, especially in the first stage of analysing market participation. The results from this model, reported in Table B2 in Appendix B, are qualitatively and quantitatively similar to the main results shown in Table 3.

Conclusion

Afghanistan’s predominantly rural population relies on agriculture for livelihoods, but productivity is low and farmers face significant constraints limiting access to markets and services. Understanding the factors that increase market participation can help improve agricultural productivity, increase incomes, and reduce rural poverty. By analysing market participation, policymakers can identify infrastructure gaps and prioritize investments to enhance access and connectivity, facilitating greater market engagement for rural households. This paper presents empirical evidence using nationally representative household data from Afghanistan on two important questions: are rural markets incomplete or failing, and do transaction costs affect household decisions to participate in factor markets? Pooled cross-sectional data on over 20,000 households are used to test the separation hypothesis, and results imply the existence of incomplete or missing markets. A control function approach with instrumental variables is used to estimate the extensive margin decision for input market participation and to examine the intensive margin decision on the extent of participation (expenditure on inputs conditional on participation). The regressions pass several tests, including for the validity of the instruments to address endogeneity bias in the treatment variables (ownership of ICT and transport equipment).

The results reject the null hypothesis of separation; the assumption that production and consumption decisions are independent is not consistent with the data. Specifically, the analysis shows that the demand for farm labour is affected by household composition, suggesting that at least two markets are missing or incomplete. In the context of Afghanistan, factors including inadequate infrastructure and institutional support, restricted access to market information, and high transaction costs could generate market distortions.

Household input market participation decisions are affected by transaction costs associated with search, obtaining market information and transporting inputs, approximated by household ownership of or access to communication and transportation equipment. Households with access to ICT and transport equipment are more likely to participate in input markets; owning such assets reduces transaction costs associated with obtaining market information and provides cheaper means of transportation. Access or proximity to roads also affects decisions, with higher participation for households with better access to roads and local markets. The results show that reduced transaction costs increase the likelihood that farm households participate in input markets.

Policies to increase input market participation could focus on access to information on inputs and market prices, ensuring that farmers have access to required communication technology and services. The design of interventions to tackle failing markets will depend on the type and existence of legal barriers, non-competitive prices, and high transaction costs. Instruments to target missing markets may involve removal of legal restrictions, whereas investment in public infrastructure and market integration will help address incomplete markets. Strategies that encourage collective action through production and marketing cooperatives and farmer organizations may enable farmers, particularly resource-poor farmers with limited access to markets, to share resources and information to reduce transaction costs.

Limitations and further research

A pertinent area for future research is to collect market data to permit investigating which market failures cause the separation hypothesis to break down. Recent work by Dillon et al. (2019) addresses this and provides one perspective on why the separation hypothesis may be failing. Although results are not conclusive on the exact causes of market failures, they suggest that structural barriers such as financial intermediation, uncertain and costly contract enforcement, and inadequate physical infrastructure leading to high transaction costs are important factors generating distortions responsible for market failures.