Hitting the target and missing the point? On the risks of measuring women’s empowerment in agricultural development

There is a strong impetus in international agricultural development to close ‘gender gaps’ in agricultural productivity. The goal of empowering women is often framed as the solution to closing these gaps, stimulating the proliferation of new indicators and instruments for the targeting, measurement, and tracking of programmatic goals in research for agricultural development. Despite these advances, current measurements and indices remain too simplified in terms of unit and scope of analysis, as well as being fundamentally flawed in how they aim to capture the relevance of ‘gender’ in diverse local contexts. We propose that the impulse to apply exogenously defined and weakly validated ‘women’s empowerment’ measures to diverse local contexts risks prioritizing practical expedience over scientific accuracy and societal relevance. Furthermore, the application of such measures risks creating the impression that programmatic “gender targets” are being achieved, while simultaneously undermining substantive gender transformative goals. The authors conclude that a different methodological approach grounded in participatory and qualitative methods is needed to create more meaningful metrics for assessing progress towards women’s empowerment.


Introduction
"When assessment practices measure what is methodologically convenient and politically instrumental rather than what is scientifically appropriate, they inevitably generate incomplete or skewed findings. The exclusion of complex or difficult data, in turn, tends to support policy pathways that are ignorant of relevant information" (Crane et al. 2016, p. 674).
There is a strong impetus in international agricultural development to close 'gender gaps' in productivity, presumed to be caused by differences between cisgender men and women in asset ownership, decision-making power, and access to resources (Lee et al. 2015). The solution to closing these gaps is often framed as empowering women. The gender gaps discourse is underpinned by the notion that if women's level of empowerment were improved, on-farm productivity would increase. It is worth noting that within this framing, improving women's positions is primarily valued for its presumed effect on agricultural productivity, not as its own worthwhile objective. Within this productivist framing, the search for the right indicators to measure women's empowerment has become a powerful driver in applied gender research (Rao 2016). The assumptions that project monitoring must be quantitative, goal-oriented, and time-bound, together with the complexity of gender relations themselves, make it difficult to select appropriate methods, approaches, and indicators for gender monitoring (Mkenda-Mugittu 2003).

Furthermore, given that funders of research for development are preoccupied with estimating cost-benefit ratios and quantitatively measuring impacts of their investments (Merry 2011), these circumstances combine to drive the increasing use of gender survey indices and indicators in many contemporary agricultural development circles (see Alkire et al. 2013; Yount et al. 2019; Cole et al. 2020). However, the choice of indicators relies heavily on what is easily observable and measurable (e.g., number of acres cultivated, number of women/men/youths in trainings). Such measures are designed to create programmatic targets, instrumentally defined by preselected outcome indicators, but they risk missing the point on fundamental dimensions of what empowerment and gender relations mean in diverse cultural contexts, and thus on what gender-oriented development interventions intend to achieve.
This mismatch between the phenomena of interest and the measurement thereof can have substantial implications. No matter how well intended, project assessments, and thus evaluations of "success", are driven by achievement of preselected and easily measurable targets. We propose that much gender research for agricultural development is measuring what is convenient and instrumental rather than what is appropriate. Not only does this reflect disconnections between actual outcomes of interest and their measurement, but it also reflects substantial disconnections between contemporary theoretical conceptualizations of gender (Parpart et al. 2003; Kiguwa 2019; Uchendu et al. 2019; Hoffelmeyer 2021) and how they are brought into the practice of measuring women's empowerment.
Although gender has become increasingly mainstreamed within agricultural research, many challenges remain in linking it with development outcomes (Kantor et al. 2015). In trying to achieve gender development impact, programs have relied on 'hitting gender targets' to assess progress (Collins 2018). This process has pushed gender research in directions that largely side-step issues of gendered power relations and intersectional aspects of cultural norms and institutions. Instead of developing nuanced new methodologies to capture these dynamics, much of the research for agricultural development relies on gender surveys that ignore the centrality of gendered power relations in agriculture (Leder and Sachs 2019).
This simplistic view of gender as a binary and homogenous variable impacts how empowerment is defined. An outcome of this is that intervention actions towards increasing women's empowerment are also overly simplistic, frequently ignoring the aspirations and framings of empowerment among different women within the local socio-agricultural context. By ignoring these dynamics, barriers to women's empowerment are instrumentalized as a technical problem that can be overcome to meet technical agricultural targets. This glosses over the variability and complexity of empowerment as a locally constructed phenomenon. While most gender researchers are aware of and sensitive to these shortcomings, the substantial gaps between theoretical understandings and programmatic practices within gender research for development raise the question, "Are we hitting the target and missing the point?"
We believe that measuring empowerment among women and gender-diverse groups is important. However, we also believe that many current practices of measuring empowerment are based on preconceived exogenous notions of what constitutes 'empowerment', when it is a locally constructed, endogenous concept. By endogenous, we refer to notions of women's empowerment that are 'derived internally': locally defined and relevant to women's lived realities. This is in contrast to what we are critiquing, namely exogenous notions of women's empowerment that are externally defined by outside development actors and assume universal relevance to women's lives. As described by Garba (1999, p. 31), endogenous empowerment is a bottom-up dynamic process, in contrast to exogenous empowerment, which is top-down; while external actors can facilitate empowerment, the notion of empowerment itself must be locally defined. Thus, independent of the metric used, if the meanings of 'empowerment' are not interpreted at the local level (through qualitative analysis of how the meaning is locally constructed and derived from embedded gender norms, power relations, and how other intersectional social axes interact with gender), research for development practitioners are at risk of missing the point on what constitutes 'empowerment'. This is our central line of argumentation: because the concept of empowerment is locally constructed, metrics can only be devised at the local level to inform interventions and create monitoring/tracking indicators that are locally relevant.
To address these issues, our perspective piece begins by laying out key critiques in the conceptualization and measurement of gender and women's empowerment in agricultural development. Following on this, we explore the existing 'solutions' that have been used to address these critiques and end with a call to action for the field of gender research in agriculture for development.

New and not-so-new critiques
The use of standardized and normative survey instruments to understand complex social dynamics has long been heavily critiqued (Gill 1993; Mayoux and Chambers 2005; Merry 2011). Drilling down specifically into critiques of the conceptualization and measurement of 'gender' and 'women's empowerment' in agricultural development, we outline several key issues.

What is being measured? The illusion of universal notions of empowerment
Many gender development targets, and thus measurements, are founded on exogenous and universalistic notions of gender equity and women's empowerment. This movement towards a universal ideal ignores cultural variability and local people's agency in defining empowerment, which ironically goes against much feminist theory (Mohanty 2003; Basu 2018; Sachs 2018). While donors and development practitioners often have specific notions of what constitutes equity, smallholders do too (Tavenner and Crane 2019b). For example, women may not perceive livestock as an "asset" when it comes with untenable labor and financial obligations, contradicting the index assumption that livestock are a categorically desirable asset that leads to or signifies empowerment (Leder and Sachs 2019). Relatedly, not considering women's preferences for certain livestock species (e.g., chickens and goats) over others undercuts attempts at measuring women's empowerment based on aggregated total livestock units (TLU) owned (Chanamuto and Hall 2015). Another common indicator of gender equity, women's access to joint or individual bank accounts, ignores gendered norms around masculinized household headship responsibilities, intrahousehold finances and resources associated with masculinized commodities.
The choices and operationalizations of specific indicators are also a problematic area in efforts to measure women's empowerment. Individual decision-making power and asset ownership are often used as universal indicators of empowerment, yet generally do not consider empowerment as being subject to cultural norms relating to families as co-operative units or recognize the areas of jointness, negotiation, and complementary responsibilities at the intra-household level. While several recent publications (Doss and Meinzen-Dick 2015; Johnson et al. 2016; Stoian et al. 2018) have recognized the importance of highlighting these dynamics, their analyses fall short of accounting for the localized meanings and cultural valuation of empowerment ascribed to these metrics. Thus, an exclusive focus on economic empowerment inevitably leads to over-simplification of localized gender relations (Eyben and Napier-Moore 2009; Pereznieto and Taylor 2014; Cornwall and Rivas 2015; Bayissa et al. 2017). Similarly, the instrumentalization of 'gender' in development interventions that equate technology adoption with women's empowerment is also deeply flawed. Within programmatic targeting and tracking, this is likely to result in indicators that may not be particularly meaningful to rural women for assessing progress towards their ostensible 'empowerment.' There have been promising new approaches (Johnson et al. 2018; Quisumbing et al. 2019) towards clarifying agricultural intervention objectives and goals towards women's empowerment, notably the Reach-Benefit-Empower-Transform (RBET) framework. However, because the meanings associated with women's empowerment are predominantly defined through an exogenous lens (e.g., "increased decision-making power/assets" equates with "increased levels of women's empowerment"), the universalist assumption that individualized decision-making is desirable above other modes of collective and community decision-making remains problematic.
In sum, the local and culturally contextual meaning of what it means to be empowered cannot be assumed. We assert that measuring gender gaps in agricultural productivity, land ownership, or assets is appropriate and needed to inform agricultural interventions, but the assumption that greater achievement on these measures translates into enhanced empowerment within a universalist frame is not.

Problematizing binary 'sex' in women's empowerment indices
The reduction of women and men to dualistic 'sex' categories in sex-disaggregated empowerment metrics does not capture the full spectrum of biological sex, nor does it engage with how 'women' are not a homogenous group within agricultural systems. Using binary sex categories thus risks misrepresenting women (and men) as seemingly monolithic categories, rendering important diversity within and between these groups invisible, and excluding nonbinary, intersex, and gender nonconforming people from being measured at all. Using a binary framing misses out on how gender and notions of empowerment are defined in relation to historical and culturally contingent intersections with other axes of social differentiation, such as age, ethnicity, marital status, wealth, caste/class, etc. (Anthias 2012). As a conceptual lens, intersectionality supplants gender as a monolithic category with an understanding of how numerous social variables interact to affect individuals' diverse subjective and social experiences of opportunities and constraints (McCall 2005). As such, intersectional analysis shows how different axes of experience and identity interact to produce different effects that cannot be captured by analyzing single categories or their linear interaction (Clement et al. 2019; Tavenner and Crane 2019a). In other words, although there may be some broad generalizations that can be made about gendered norms and roles, treating gender as a coherent, standalone social category glosses over important empirical nuances that are relevant to programmatic objectives in agricultural development.

Ignoring how localized gender power relations in agricultural systems shape the meaning of empowerment
While many women's empowerment metrics perpetuate the notion that gender is merely a property of atomized individuals, who are independent of other actors, contemporary gender theory sees gender as relational (Hackfort and Burchardt 2018; Jerneck 2018; Risman et al. 2018; Leslie et al. 2019). Gender gains meaning through social practices that structure hierarchies of gendered power in given contexts (Tavenner et al. 2020). This hierarchy is shaped by localized notions of socially desirable/acceptable masculinities and femininities (Schippers 2007), and sex-disaggregated survey data do not capture how gender is mediated by societal relations and structures. Furthermore, such data often do not account for how men and women report on key variables. Several recent studies have found conflicting accounts between spouses and within households on reporting gender gaps in asset ownership, decision-making power, and access to resources, highlighting the complexity of intrahousehold gender relations (Deere and Catanzarite 2014; Anderson et al. 2017). These inconsistencies in reporting suggest that women and men either have different cognitive perceptions or strategic representations of what constitutes these domains. Inasmuch as measures of empowerment rely on reported data from surveys, even sex-disaggregated surveys, they will always be subject to gender respondent bias. Institutional aspects of gendered ideologies play significant roles in shaping women's engagement in agriculture, such as in agricultural market participation and valuation. Local gendered ideologies that ascribe 'masculine' or 'feminine' meaning to certain crop/livestock species pattern men's and women's preferences and practices (Hovorka 2012), and consequently, their notions of gender equity and empowerment in agricultural development.
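As a rough illustration of how such spousal discordance might be quantified for a single survey item, the following Python sketch computes the share of households in which both spouses give the same answer and tallies the patterns of disagreement. The function name, the answer categories, and the four example households are hypothetical and are not drawn from any of the studies cited above.

```python
from collections import Counter


def spousal_concordance(pairs):
    """Summarize agreement between spouses on one survey item.

    pairs: iterable of (wife_report, husband_report) tuples, one per household.
    Returns (agreement_rate, Counter of discordant (wife, husband) patterns).
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no paired responses supplied")
    # Keep only the households where the two reports differ.
    discordant = Counter((w, h) for w, h in pairs if w != h)
    agreement_rate = 1 - sum(discordant.values()) / len(pairs)
    return agreement_rate, discordant


# Hypothetical answers to "Who owns the household's cattle?"
reports = [
    ("joint", "joint"),
    ("joint", "husband"),
    ("wife", "husband"),
    ("husband", "husband"),
]
rate, patterns = spousal_concordance(reports)
print(f"agreement rate: {rate:.2f}")
for (wife, husband), n in patterns.items():
    print(f"wife said {wife!r}, husband said {husband!r}: {n} household(s)")
```

A summary of this kind only shows where reports diverge. It cannot say whether the divergence reflects different perceptions or strategic representation, which is the interpretive question that requires qualitative work.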
Because markets and commodities exist within socially embedded gendered institutions, eschewing analysis of gender power in agricultural markets, institutions, and commodities effectively instrumentalizes gender as a technical and individual problem. Consequently, much targeting measures what is convenient rather than what is important, e.g., the proxy of 'women's participation in meetings' to signify their empowerment or 'gender success'. This link between tokenistic participation and artificial empowerment is well-established in the gender and resources literature (Agarwal 2001; Cornwall 2003).
A sole focus on "getting the indicators right" for women's empowerment obscures the complexity of how gender power operates dynamically and relationally within and across households, institutions, and value chains. The authors have had conversations with many gender and development practitioners who agree, in private, that current indices and surveys are inadequate for capturing the essential gender dynamics embedded in local agricultural systems. The predominant recourse has been to collect qualitative data to triangulate or interpret survey findings. Is this sufficient in measuring progress towards the 'gender goals' of enhanced equity and empowerment?

How is empowerment measured? The dead-end of qualitative validation
Many authors have recently critiqued sex-disaggregated surveys' ability to capture relevant gender dynamics (O'Hara and Clement 2018; Galiè and Farnworth 2019; Leder and Sachs 2019; Acosta et al. 2020). However, they fall short of advising how to deconstruct such indicators and indices to generate meaningful data beyond 'contextualization' through supplemental qualitative research. This is a vicious cycle, which we refer to as the "dead-end of qualitative validation." For example, a recent study using small-sample qualitative data challenges the results of the A-WEAI (the Abbreviated Women's Empowerment in Agriculture Index) (Leder and Sachs 2019). Drawing on a sub-sample of the original survey participants, the study shows how relational dynamics of women's empowerment in intra- and inter-household relations were not captured, resulting in the survey misrepresenting marginalized women as 'empowered'. The original findings had direct implications for the women that were being targeted for the next phase of activities by essentially recommending the 'wrong' target population for intervention.
Using qualitative data to triangulate or validate aggregated quantitative indices is thus clearly problematic: a growing number of cases show that qualitative data undermine aggregated indicators and expose their meaninglessness (Lambrecht and Mahrt 2019; Leder and Sachs 2019; Acosta et al. 2020), likely because the quantitative indicators have been insufficiently validated prior to their inclusion in surveys. Because 'gender' and 'women's empowerment' are variably constructed in different cultural settings, creating standardized indicators that cut across geographies is a scientific (and implicitly political) decision (Crane et al. 2016) to gloss over cultural self-determination, which is, ironically, a fundamentally disempowering act on the part of gender researchers and development organizations (Tavenner and Crane 2019b).
To 'fix' these impasses between qualitative and quantitative 'empowerment' data, a recent suggestion is to use 'cognitive interviewing' to validate existing survey questions by systematically identifying and analyzing sources of response error in surveys. Such information can improve the internal validity and accuracy of survey instruments (Willis and Miller 2011; Yount et al. 2019). Cognitive interviewing can effectively address the question of whether survey creators and respondents understand questions in the same way. However, while it may help improve internal validity, it does not address the ways that value judgements embedded within questions, indicators, and indices affect external validity. Regardless of whether the framing of a question is 'correct,' the deeper issue of what mediates power remains unanswered, and indeed unasked. In other words, the question 'Are they understanding my question in the same way?' falls far short of addressing 'What does power in decision-making mean for women and men in a specific context?'
While cognitive interviewing addresses one source of bias, it is still 'missing the point' when it comes to measuring women's empowerment. The purpose of qualitative data should not be to interpret prescriptive, exogenously defined quantitative indicators. In order to improve external validity, qualitative and participatory data should instead be the starting point for developing locally relevant, endogenously defined quantitative indicators used for tracking progress towards 'gender goals.' Following this approach would enable development practitioners to start from conceptualizations of equity and empowerment that are relevant to rural women themselves. In so doing, it will also force researchers to transparently address the times and places where they choose to impose exogenous conceptualizations of equity and empowerment (Tavenner and Crane 2019b).

A call to action
"What is measured-and not measured-influences discourse and confers legitimacy to certain categories of intervention or institutional change by stressing certain forms of power while rendering others invisible" (O'Hara and Clement 2018, p. 112).
Even acknowledging the successes of decades of progressive 'gender and development' work, it remains troubling that this community appears to be returning to a 'women-in-development' approach, whereby exogenous "women's economic empowerment" frameworks are increasingly used as baselines to generate indicators (Jaquette 2017). The gender and development community generally agrees that 'gender' and 'empowerment' objectives are context-specific and embedded in gendered power relations at the local levels (Galiè and Farnworth 2019). However, we argue that this fact has serious methodological implications for how gender equity and women's empowerment are measured, targeted, and tracked, implications which are not accounted for in the current state of the field.
Many studies use qualitative research to help with the interpretation of quantitative survey data. However, reversing the relationship between qualitative and quantitative research will help improve both internal and external validity of empowerment measurement (Mayoux and Chambers 2005). Qualitative research should come first, in order to capture localized understandings of empowerment and gendered relationships in agriculture. This will enable the establishment of indicators that are contextually legitimate, the broader patterns of which can then be established through large-N surveys that use the validated measurements (Khadka and Vacik 2012). There are a few, rare applications of this approach to indicator development (Nazarea et al. 1998; Bollig et al. 2020), though they remain exceptions. For example, borrowing from the field of clinical psychology, Nazarea et al. (1998) used a modified Thematic Apperception Test (TAT) to generate culturally relevant indicators of sustainability and quality of life amongst indigenous communities in the Philippines. Using photographs from the Manupali watershed in Bukidnon, the authors first gathered oral history narratives from participants of different genders, ethnicities, and ages to identify whether different groups of people perceived the environmental-agricultural landscape differently. These narratives were then coded by a multi-cultural and multi-disciplinary team of researchers based on dominant themes pertaining to indigenous systems of sustainability and contextually sensitive definitions of quality of life. A scoring system was developed for each theme based on its frequency relative to the total number of participants.
The researchers found that these locally defined indicators differed significantly from the externally defined indicators for sustainability and quality of life and provided a critical window onto intra-cultural variation (e.g., by gender, ethnicity, and age) in the valuation of natural resources and agricultural practices among indigenous people. In applying this methodology to evaluating women's empowerment, modified TATs could shed light on intra-gender and intersectional differences in conceptualizations of equity and empowerment.
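To make the scoring step concrete, the sketch below shows one plausible reading of a frequency-based theme score: the share of participants whose coded narrative mentions a theme, computed overall and by social group. It is a minimal Python illustration under stated assumptions, not the procedure of Nazarea et al. (1998); the record layout, the function name `theme_scores`, and the example themes are all hypothetical.

```python
from collections import Counter, defaultdict


def theme_scores(coded_narratives, group_key=None):
    """Score each theme as the share of participants who mention it.

    coded_narratives: list of dicts, one per participant, each with a
        "themes" list produced by qualitative coding (hypothetical layout).
    group_key: optional field (e.g. "gender") used to disaggregate scores.
    Returns {group: {theme: share of that group's participants}}.
    """
    participants = defaultdict(int)
    mentions = defaultdict(Counter)
    for record in coded_narratives:
        group = record[group_key] if group_key else "all"
        participants[group] += 1
        # set() ensures a theme counts once per participant.
        mentions[group].update(set(record["themes"]))
    return {
        group: {theme: n / participants[group]
                for theme, n in mentions[group].items()}
        for group in participants
    }


# Hypothetical coded narratives from four participants.
narratives = [
    {"gender": "woman", "themes": ["water access", "family labor"]},
    {"gender": "woman", "themes": ["water access"]},
    {"gender": "man", "themes": ["cash crops", "water access"]},
    {"gender": "man", "themes": ["cash crops"]},
]
overall = theme_scores(narratives)
by_gender = theme_scores(narratives, "gender")
print(overall["all"])
print(by_gender)
```

Disaggregating by a single field is only a first step. Passing a combined field (for instance, a hypothetical "gender_age" label assigned during coding) would allow the same score to be compared across intersecting social categories.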
More recently, Bollig et al. (2020) introduced the concept of "ethnographic upscaling" which involves first carrying out in-depth ethnographic work (e.g., open and semi-structured interviews, case studies, participant observation, and network analysis) in a few cases to generate hypotheses, and then conducting standardized surveys in a larger number of "spatially continuous cases" to generate comparative material on the basis of which the hypotheses can be tested. Using a case study in communal water management in Namibia, the authors were able to use the insights gained from ethnographic methods (i.e., in-depth understanding of the historical patterns of water management and its embeddedness in local socioeconomic and demographic context and social structures, cultural beliefs, and conflicts) to identify a main set of indicators shaping water management that could be operationalized in a structured questionnaire used for "upscaling." The authors concluded that the methodology was a means to provide "highly contextualized and valid case studies that provide processual understanding of complex cases, the entanglement of different spheres of social and cultural life, and cross-scale implications" (p. 13). In applying the methodology to assessing women's empowerment, ethnographic upscaling would facilitate exploring whether the locally validated results from cases studied in-depth could be extrapolated to other cases within a region and whether broader patterns can be established.
Taken together, these examples show that the development of contextually valid indicators will certainly be slower and less easily comparable than the broad application of generic survey instruments. However, it will also render higher quality data that might even enable rural women of the world to define their own ideas of empowerment rather than having it imposed upon them by global gender research institutions.
There have been some recent developments challenging exogenous framings of empowerment in survey design by paying greater attention to the importance of local women's and men's own framings of empowerment. For example, the Women's Agency Scale 61 measures intrinsic, instrumental, and collective agency (Yount et al. 2020). A survey tool developed by Maiorano et al. (2021) accounts for choices, values, and norms in scoring empowerment in a comparative index. Finally, the Relative Autonomy Index (Seymour and Peterman 2018) accounts for whether certain choices (or the decision not to make certain choices) are intrinsically or extrinsically motivated. While these innovations begin to address some of the methodological implications of our present critique, each iteration would need to be grounded in participatory and qualitative methods to create more meaningful and appropriate metrics for assessing progress towards women's empowerment.
There are some promising ways forward to this end, in both application and conceptually. Jayachandran et al. (2021) propose a methodology that combines machine learning with qualitative interviews to design a five-question women's agency index. Glennerster et al. (2018) outline a four-step methodology for developing locally tailored empowerment indicators using findings from formative qualitative research. Recently, Muchtar et al. (2019) drew on women's stories of empowerment in an Indonesian coastal management project to develop a new conceptual framework that reflected the personal, relational, and multidimensional aspects of empowerment. These types of approaches based on participatory and qualitative methods to localize understanding of gender-related norms, perceptions, relations, and structures have the potential to be used in the creation of locally meaningful measurements of empowerment.
Promising conceptual frameworks to assist in generating appropriate tools for measuring empowerment based on contemporary theoretical understandings are also emergent. Gender transformative approaches that address gendered practices and structures allow for interrogation of norms and common concepts, including women's empowerment (Burg 2019). However, "these approaches risk encountering much resistance and consequently to be frustrated, marginalized or even stopped" (Burg 2019, p. 51). These risks derive from the practical challenges of grappling with powerful structural and institutional barriers that reinforce gendered inequalities in agricultural systems. While gender researchers in agricultural development commonly encounter these barriers in their analyses, they often have little power to address them directly. Thus, even in interpreting empowerment in ways that address "the subtleties of the relations between women and men, the underlying values that define these relations, and the meanings attached to these roles and benefits" (Njuki et al. 2016, p. 286), moving these findings beyond the analysis stage to get institutional buy-in is often difficult. However, new tools that identify the 'leverage points' (Manlosa et al. 2019) in-between institutional changes in targeting gender gaps and deep social norms/attitudes could provide an important framework towards finding new opportunities for gender equality in inequitable agricultural systems. Similarly, the RBET framework (Johnson et al. 2018; Quisumbing et al. 2019), which aims to help interventions clarify their objectives (i.e., whether a project seeks to reach, benefit, or empower women, and/or transform gender relations), can ensure more accurate metrics towards empowerment and gender targeting. RBET is a promising conceptual pathway forward, so long as the endogenous framings of empowerment are captured amongst the groups targeted.
In better understanding and addressing how gender inequalities intersect with other axes of social differentiation in agricultural interventions (Malapit et al. 2020), practitioners will need to broaden their conventional view of gender away from a sole focus on homogenous categories of cisgender women and men. Gender research that reinforces this binary view will continue to exclude transgender, genderqueer, and non-binary people (Leslie et al. 2019). We hope future researchers in agricultural development will continue to move the field towards more socially inclusive methodologies and assessments. For example, the application of queer theory offers a unique pathway forward in understanding how heteronormativity impacts gender dynamics and sexual minorities in agriculture (Wypler 2019).
In a nutshell, we believe that women's empowerment is a valid concept to measure, if it is assessed (1) first using qualitative methods and analysis to interpret local meanings of empowerment, (2) by creating metrics and tracking changes at the local level with an understanding that such metrics cannot be aggregated at scale, and (3) by acknowledging that a focus on 'binary sex-disaggregation' does not capture the full spectrum of biological sex, nor does it engage with how 'women' are not a homogenous group, and thus risks misrepresenting marginalized groups of women (based on intersectional and relational differences), and excluding nonbinary, intersex, and gender nonconforming people from being measured at all. Thus, collecting and analyzing data on additional axes of social differentiation among women is needed, as well as developing more socially inclusive assessments that account for otherwise 'invisible' groups.
This perspective piece has outlined the glaring disconnects between contemporary theoretical understandings of gender and empowerment, and how they are commonly conceptualized and operationalized in programmatic agricultural development agendas. With this article, we hope to push the field of gender research in agriculture for development to reconsider how targeting, measuring, and tracking of 'women's empowerment' can be done better, because if we continue with business as usual, we risk hitting the target, but missing the point.

Acknowledgements
This work was supported through the Consultative Group on International Agricultural Research (CGIAR) Research Program on Climate Change, Agriculture and Food Security, which is carried out with support from CGIAR Fund Donors and through bilateral funding agreements. For details, please visit https://ccafs.cgiar.org/donors. The views expressed in this document cannot be taken to reflect the official opinions of these organizations. Special thanks to Carlos Quiros for his intellectual contributions in discussing the ideas herein.
Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.