Avenues for improving farming sustainability assessment with upgraded tools, sustainability framing and indicators. A review

The sustainability of agriculture is questioned due to major negative environmental and socioeconomic impacts. To improve the state of agriculture, various management changes have been proposed. Different tools, with varying characteristics, sustainability framing and indicators, have been used to evaluate the impact of these changes on sustainability. Here, we review 119 tools for farm sustainability assessment and compare their use, sustainability dimensions, themes and types of indicators used for biodiversity conservation, farm viability and gender equity. Our main findings are that (1) tools could be classified into five groups that differ in stakeholder participation and model used for calculating sustainability, (2) seven different sustainability framings are used and (3) only two out of 29 indicators screened address impacts of farming system while the others address drivers, pressures or states of the system. The tools were grouped in “Long-term monitoring of farm activities” (11%), “Ex-ante assessment of sustainability with bioeconomic models” (9%), “Survey- and indicator-based assessment of tools” (41%), “Consultation-based assessment” (25%) and “Active engagement of stakeholder-based assessment” (14%). The “classical view of sustainability”, with the economic, social and environmental pillars, was used in 61% of the papers. Based on these findings, we suggest (1) development of temporal dynamic assessment of farm sustainability with active involvement of stakeholders in the framing of sustainability and design of indicators to achieve reliable and relevant assessment outcomes. We recommend (2) adoption of more complex sustainability framings dealing with emerging system properties, namely resilience, viability and stability. In these, (3) governance/institutional dimensions should be emphasised, and social themes targeting farmers’ characteristics should be included. 
Finally, (4) use of impact indicators in farm sustainability assessments is critical, and they should be designed to contribute to scientifically rigorous and relevant assessments of farming system sustainability.


Introduction
Worldwide, the sustainability of farming is questioned in terms of its social, economic and environmental aspects (Pretty 2018). In some developing countries, farmers are considerably affected by food insecurity and rural poverty. Almost 690 million people continue to live in extreme poverty, and inequalities of all kinds are pervasive (FAO, IFAD, UNICEF, WFP and WHO 2020). Most of the poor live in rural areas and depend on agriculture for their livelihoods and food security (IFAD, UNED 2013). In parallel, in some developed countries, farming has managed to fulfil the needs of the population in terms of food crops but has had a range of negative external environmental impacts (Foley et al. 2011), along with low economic return for farmers, which has forced some to abandon farming (Benayas et al. 2007). Many existing farming systems are vulnerable to climate change and price variability, particularly in contexts of specialised production systems and low levels of resources to cope with such disturbances, such as water for irrigation or economic capital (Skoufias et al. 2011). Global food production needs to be transformed to respond to the food security and climate challenges, while preventing any transgressions of multiple planetary boundaries globally and regionally (Rockström et al. 2013; Gerten et al. 2020). Major improvements to agricultural systems are possible with more sustainable farming practices and innovations, especially at farm level. The effect of these modifications on the sustainability of farming systems can be assessed by monitoring how they affect the state of the farm and household before and after implementation of change (Gasparatos and Scolobig 2012) (Fig. 1).
Many tools have been proposed by the research community to assess farm sustainability, either to support farmers in agricultural management or to inform policy makers about the anticipated effects of a policy before its implementation ("ex-ante") or about its consequences after implementation ("ex-post"). Hansen (1996) defines sustainability as the ability to satisfy goals, a definition that is used by several indicator frameworks for sustainability assessment (e.g. Alkan Olsson et al. 2009; Chopin et al. 2017). These tools adopt a given hierarchical structure where sustainability is broken down into dimensions, which are further broken down into themes (also called sub-dimensions or components) that represent the main objectives to be achieved by farming systems. The tools then use a broad range of indicators to capture the impact of the farming system on the multiple dimensions of sustainability.
Tools vary greatly in terms of use and sustainability framing, i.e. in how sustainability is broken down into themes and indicators. Many previous studies have examined specific topics of interest regarding specification of tools used for farm sustainability assessment. The importance of stakeholder involvement has lately been much debated. According to Arulnathan et al. (2020), stakeholder expectations are rarely considered. Binder et al. (2010) found that farm-level sustainability assessment tools are often characterised by rigid sustainability measurements and reporting requirements but with very little stakeholder involvement in the development process. Stakeholder involvement can occur at various stages of the development and application of the tool. Stakeholders' views on the system assessed can influence the definition of sustainability by targeting specific issues, the type of indicators used and the way in which the sustainability assessment is conducted (Arulnathan et al. 2020). A variety of assessment types (i.e. formats) co-exist, including calculators, protocols and guidelines with differences in scope of the sustainability assessment, indicators, models and input requirements (Denef et al. 2012). They are tied to different functions and objectives. These assessment types require different types of tool, ranging from comprehensive to rapid tools with different visions of sustainability (Arulnathan et al. 2020). The sustainability framing also differs across tools and regions. While the fundamental processes governing the functioning of agricultural systems result in a set of common themes for sustainability assessment, contextual conditions further orientate these themes. Several reviews have addressed this variation in themes for specific farming systems, such as livestock production systems (van der Linden et al. 2020).
Previous reviews also describe the variability of existing sustainability assessment tools. Their descriptions of the diversity of tools differ in terms of (i) the characteristics of frameworks for assessment and their categorisation, (ii) the choice of sustainability themes to target and (iii) the diversity of indicators used. Generally, previous reviews first describe a number of tools (35 in Schader et al. (2014), 10 in Schindler et al. (2015), 4 in Marchand et al. (2014), 19 in Arulnathan et al. (2020)), using characteristics such as scope of the assessment, functions of targeted farming systems or end users of the tool (De Ridder et al. 2007; Ness et al. 2007; Binder et al. 2010; de Mey et al. 2011). From these descriptions, most reviews derive typologies of tools, or make comparisons, focusing on one or two tool characteristics of interest. Such characteristics include scales and participation (Binder et al. 2010), complexity and level of comprehensiveness, indicator aggregation and selection (Wu and Wu 2012; Engert et al. 2016; de Olde et al. 2017; Bonisoli et al. 2018), and aggregation of indicators and valuation systems (Gasparatos and Scolobig 2012; Slätmo et al. 2017). The framework used by TempAg, an international research consortium for sustainable agriculture in temperate regions (Gregory and Kougioumoutzi 2016), describes the relationships between tool characteristics with correlation analysis (Wustenberghs et al. 2016). Assessing the diversity of tool characteristics could help establish the connections between the variables that define the sustainability assessment and reveal ways to improve their use by considering these links.
A number of indicators to measure sustainability have been proposed at farm scale. Examples are metrics for sustainable intensification (Smith et al. 2017) and indicators of economic sustainability (Latruffe et al. 2016). The indicators suggested for a particular type of sustainability theme vary in terms of data requirement, relevance of results for end users and scientific validity (Herrera et al. 2016; Bonisoli et al. 2018; Kelly et al. 2018), but also in their ability to explain causal chains for describing the interactions between farming, the environment and society (EEA 2000). This causal chain is often captured using the DPSIR (drivers, pressures, state, impact, response model of intervention) causal framework, which describes the interactions between society and the environment. Socio-economic and socio-cultural forces drive human activities, which increase or mitigate pressures on the environment. Pressure is the stress that human activities place on the environment. State, or state of the environment, is the condition of the environment. Impacts are the effects of practices on the state of the environment. Responses refer to the responses by society to the environmental situation (EEA 2000).
Despite the extensive literature on sustainability assessment at farm level, no systematic review of sustainability assessment tools that includes tool descriptions with a range of variables and statistical classification has been performed. Existing reviews focus on a limited number of tools that have gained popularity among researchers and that have been used in multiple contexts, e.g. the IDEA tool in France presented in Zahm et al. (2008) and used in various contexts such as in Morocco (Baccar et al. 2018). However, the focus on these popular tools may have caused more local initiatives in tool development that reflect certain perspectives on sustainability assessments to be overlooked. Although they are not widespread in the research community, such initiatives can offer insights on promising avenues for tool development. Describing sustainability assessment in a large body of literature can be expected to reveal links between characteristics and groups of approaches with their strengths and current limitations. Another shortcoming is that no publication has investigated the diversity in translation of the definition of sustainability into themes. Tools for assessment may focus on one dimension and underestimate the importance of other sustainability themes that are linked to other dimensions. One study on different translations of sustainability has been made by de Olde et al. (2017), who compared the overlap among four sustainability assessment tools (SAFA, PG, IDEA and RISE).
Broadening this analysis could help identify a larger variety of themes to be addressed in sustainability assessment at farm scale.
We question the precision of commonly used indicators in assessing the impact of farming, as some sustainability themes may address pressures, rather than impacts or the state of systems. This may hamper the ability of the tool to assess the impact on farming systems as the relationship between pressure and impact is not always known (Schader et al. 2014).
The purpose of this study was to identify improvements in the tools used for farm sustainability assessment in terms of characteristics, choice of sustainability framing as a function of sustainability dimensions and themes, and type of indicators. The following specific research questions were considered: 1. How is sustainability assessment conducted at farm level? 2. Which sustainability dimensions and themes are rarely or only superficially addressed despite being key aspects of sustainability of farms? 3. What types of indicators are currently being used to assess the sustainability of farms?
To address these three questions, we (i) compiled a large set of studies for a systematic review of the literature, described the core characteristics of tools and categorised them into groups. This categorisation helped to highlight groups of methods with low intra-group variability in terms of characteristics and to identify connections between tool characteristics. Comparing the characteristics of these groups revealed possible changes to address their weaknesses. We then (ii) identified the sustainability dimensions and themes targeted and (iii) the type of indicators associated with each core sustainability theme, using the DPSIR framework to classify these. We focused on viability for economic sustainability, equity and gender balance for social sustainability, and biodiversity conservation for environmental sustainability. In this paper, we use the word "tool" in a broad sense to include the diversity of approaches used in the literature (referred to as "approach", "method", "tool" or "framework") to assess farm sustainability.

Selection of tools
We selected peer-reviewed papers that assess the sustainability of farming based on empirical observations. To do so, we conducted a literature search by formulating a search equation in the ISI Web of Knowledge. To analyse the references, we used R package wosr (Baker 2018) to extract title, journal name, abstract and various meta-data in an R table, facilitating subsequent data treatment. The equation was divided into four topics, and for each topic, we identified potential synonyms to build our equation (Eq. 1). For the farm topic, the word "farm*" (with an asterisk) including potentially "farming", "farmers" and "farms" was used. We covered "assessment" by using "*assess*" and "eval*" standing for "assessment", "assessing", "evaluate", "evaluation" or "evaluating". To prevent inclusion of papers from outside agriculture, e.g. wind farming for electricity production, we added the keyword "agr*" standing for "agricultural", "agronomy" or "agriculture" among others. Finally, as the focus of the analysis was assessment of sustainability, the word "sustainability" was used. The search was done using the TS field, which allows searches for different topics in title, abstract and keywords (Eq. 1). No time limitation was applied in the search.
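The search equation itself (Eq. 1) is not reproduced in this copy. As a minimal sketch, assuming that synonyms within a topic were joined with OR and the four topics with AND inside the TS field, the query described above could be assembled as follows. The function name `build_query` and the exact operator layout are assumptions for illustration, not the authors' published equation.

```python
def build_query(topics):
    """Join synonyms with OR inside a topic and topics with AND, in a TS= field."""
    groups = []
    for synonyms in topics:
        joined = " OR ".join(synonyms)
        # Parenthesise only topics that have more than one synonym.
        groups.append(f"({joined})" if len(synonyms) > 1 else joined)
    return "TS=(" + " AND ".join(groups) + ")"


# The four topics named in the text: farm, assessment, agriculture, sustainability.
TOPICS = [
    ["farm*"],
    ["*assess*", "eval*"],
    ["agr*"],
    ["sustainability"],
]

query = build_query(TOPICS)
print(query)
```

Keeping the topics as a list of synonym lists makes it easy to see which keyword belongs to which topic and to test the effect of adding or removing a synonym.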
The literature search was closed on 2 May 2019, and the results encompassed 2567 papers with a time span from 1990 (Fig. 2). We scrutinised this list of papers manually to select those that addressed sustainability assessment at farm scale. We applied filters to discard papers that did not use tools (e.g. review or position papers, n = 287). We only targeted studies that performed a sustainability assessment at farm level. We defined a farm as "a decision-making unit, comprising the farm household, cropping and livestock systems, that transforms land, capital (external inputs) and labour, including genetic resources and knowledge, into useful products that can be consumed or sold", following Fresco and Westphal (1988). Despite the search terms farm* and agr*, the results included a range of papers where the sustainability analysis was conducted at another spatial scale, namely field level (n = 703), landscape (n = 294), production system (e.g. only milk production on mixed farms, n = 83) or supply-chain level (n = 74); these papers were excluded. We discarded papers that treated only one sustainability dimension, such as agricultural practice studies, e.g. tillage impact on various environmental issues (n = 200), or when only one state variable was evaluated, such as soil organic matter change (n = 249). A broad range of socioeconomic studies were also discarded, such as econometric analysis of the impact of agricultural policies on farm revenue (n = 352). When updated versions of tools were used in a more recent study, we used them instead of the first version (e.g. IDEA developed by Zahm et al. (2008) was described using information from an updated version from Baccar et al. (2018)). When information was lacking about some variables, we used parallel reports, websites or notices. Hence, for the SMART tool operationalising the FAO's Sustainability Assessment of Food and Agriculture (SAFA), we used the corresponding version of the publication (FAO 2014).
Of the 119 remaining papers, only 106 were used for describing tools because some case studies did not describe the tool used. All tools described in the 119 papers also described indicators, but not all included themes and dimensions of sustainability. Hence, for each phase of the work, we relied on different numbers of papers: 106 for tool, 104 for indicator description, 96 for dimension listing and 63 for theme listing (Fig. 2).

Description of studies
We described each of the 106 studies used for tool description with categorical variables using the framework proposed by Binder et al. (2010) for agricultural sustainability assessment tools (Table 1). Those authors propose three dimensions: a normative dimension describing the sustainability concept, goal setting and assessment type; a procedural dimension describing the set-up phase, indicator selection, measurement and assessment; and a systemic dimension describing the system representation and indicator interaction. We simplified the framework by removing some variables that could not be determined for a majority of papers. The description of each of the 106 studies was based on 14 categorical variables (2 ordinal, 12 nominal) grouped into the three above-mentioned dimensions.
Table 1 (only the rows below are recoverable in this copy; the number of studies per level is given in parentheses)
Assessment type (normative dimension, nominal): efficiency (10); AHP (5); BEFM (6); emergy and LCA (6); expertise (6); indicator-based (68); simulation (5)
System existence (normative dimension, nominal): ex-ante (5); ex-post (85); both (16)
Tool function (procedural dimension, nominal): "Strengths and weaknesses (ST)" (46); "Research (R)" (6); "Policy advice (PA)" (11); "ST and R" (7); "ST and PA" (19); "PA and R" (7); "Multi-purpose" (10)
Normative dimension
The "view on sustainability" of the tool refers to whether the assessment is "goal-oriented" or "means-oriented". Goal-oriented denotes that the system performance must achieve some objectives or results in terms of impacts in order to be considered sustainable. Means-oriented denotes that the assessment focuses on the nature of the system, hypothesising that positive characteristics, such as adoption of conservation techniques, will lead to the desired impacts, e.g. adoption of grass strips leads to a decrease in erosion. "Orientation of the tool" refers to whether sustainability is defined "top-down", i.e. by researchers, "bottom-up", i.e. by target users, or both, when sustainability emerges from a collective definition.
"Assessment type" is the kind of tool that can be used for rating the level of sustainability, which we grouped into seven types.
"Efficiency" analysis focuses on use of inputs for production using efficiency frontier methods such as data envelopment analysis or stochastic frontier analysis (Mutyasira et al. 2018). "Indicator-based" analysis refers to calculation of a broad range of indicators (Dizdaroglu 2017). "Analytic Hierarchy Process (AHP)" is the use of pairwise comparison of different farms or indicators to evaluate outcomes (Saaty 2008). "Bio-economic farm model (BEFM)" refers to optimisation models that simulate changes in farm sustainability in response to political, technical and management changes (Janssen and van Ittersum 2007). "Emergy and LCA" were categorised together here and are tools that can be used to evaluate the sustainability of systems based on the amount of energy consumed in direct and indirect transformations to make a product or service. "Simulation" models represent farm functioning by a set of equations linking inputs of the farm (e.g. labour, fertilisers) with its outputs (e.g. yields, income, losses of nutrients) (e.g. Prado et al. 2011). "Expertise" is direct assessment of farm sustainability by a stakeholder, e.g. an advisor. "System existence" refers to whether the tool is applied in an "ex-post" manner on an existing system or "ex-ante", i.e. before the implementation or real-life existence of a system.
Procedural dimension
"Tool function" refers to the aim of applying the tool in the assessment. Sustainability assessment tools can be used for several functions, e.g. comparing farms to provide advice to improve the sustainability as "strengths and weaknesses", orientating policy for improving the current state of farms and targeting innovations to improve farms (i.e. "policy advice"), advancing operational development of sustainability assessment tools (i.e. "research") or combinations of these three different functions (Table 1). The tool function is closely related to the target group or end users, i.e. the group of people directly interested in the results of the evaluation, namely farmers/advisors for strength and weakness analysis, policy-makers for policy advice, and researchers for advanced research on sustainability. "Stakeholder involvement" refers to the level of participation by various stakeholders in terms of defining sustainability, choice of indicators and involvement in the farm sustainability assessment. Different typologies of participation exist (Reed 2008), but we chose to define three levels: "no participation", "consultation" of stakeholders on the choice of indicators and "active involvement" where stakeholders participate in the assessment and reflection on the results. "Procedure for indicator selection" describes how the list of indicators for valuation of farm system sustainability was drawn up. It comprises selection by the research team alone via "literature" review, "expert consultation" on choice of indicators or a collective decision via a "participatory process" such as focus groups or workshops. "Type of data" refers to whether "quantitative" or "qualitative" data, or "both", were collected to perform the assessment. "Data collection" refers to how data were collected for the assessment at farm level, e.g. in household or "farm surveys", "on-farm measurements" of parameters such as soil carbon and "expert opinions" on the value of systems. "Complexity of indicator" appraises the complexity of indicator calculation from "low" to "medium" and "high" complexity. Low refers to mainly simple variables, medium refers to a combination of variables and high refers to output of more advanced mathematical calculations, following the typology provided by Bockstaller et al. (1997). "Time for data collection" was split into three categories: "low" (<2 h), "medium" (2-7 h or one working day) and "high" (>1 day of data collection per farm).
Systemic dimension
"Indicator interaction" refers to whether the tool accounted for potential relationships between indicators, calculating trade-offs and synergies. "Aggregation" refers to whether indicators were aggregated or not. "Weighting" refers to whether the indicators were weighted or not. These three variables are binary.
To analyse the dataset, we used three types of statistical approaches. First, we analysed the relationships among the 14 initial categorical variables that described the case studies, using Cramér's V test of association. Second, we conducted multiple correspondence analysis (MCA) to decrease the complexity of the dataset. Third, we selected the main axes from the MCA to group studies based on their similarity using ascending hierarchical clustering (AHC) and Ward's criterion (Ward 1963). This statistical method has previously been applied in a systematic review to identify groups of landscape modelling approaches of biodiversity (Chopin et al. 2019a).
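The first of these steps can be illustrated with a short sketch. It assumes that the association measure meant here is Cramér's V, computed from the chi-squared statistic of the contingency table of two categorical variables; the function name `cramers_v` and the example values are hypothetical and do not come from the review dataset.

```python
from collections import Counter
from math import sqrt


def cramers_v(x, y):
    """Cramér's V between two equally long sequences of category labels."""
    n = len(x)
    pair_counts = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            observed = pair_counts.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    # V rescales chi-squared by n and by the smaller table dimension minus one.
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Hypothetical codings of two binary variables for six studies.
aggregation = ["yes", "yes", "no", "no", "yes", "no"]
weighting = ["yes", "yes", "no", "no", "yes", "no"]
print(cramers_v(aggregation, weighting))
```

A value near 1 means that one variable almost determines the other, which is the situation in which one of the two can be dropped before the MCA; a value near 0 means the two variables carry largely independent information.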

Analysis of sustainability dimensions and themes targeted
In farm assessment, several dimensions are often used to provide information on the level of sustainability of the system rated (Binder et al. 2010). The description of these dimensions frames the sustainability problem to address. The systemic dimension of sustainability assessment refers to the representation of the main structures, processes and functions of the economic, ecological and social fields of the system studied (Binder et al. 2010). Economic sustainability is estimated from the production of goods and services, often represented by farm income. Social sustainability is approached as the rural function of farming. Environmental sustainability is estimated as the decrease in negative externalities and potential degradation of the environment.
For the papers selected, we described the tree structure in terms of sustainability dimensions, and, for each dimension, we listed their themes to capture different ways of framing the sustainability problem. In most papers, the sustainability problem is broken down into several sustainability dimensions that frame the issue of sustainability, and each dimension is broken down into themes representing the objectives which the farming system needs to achieve to be sustainable. In some methods, such as SAFA, sustainability is split into four levels (e.g. de Olde et al. 2016), i.e. dimensions, themes, sub-themes and indicators. In these cases, we chose to list the sub-themes, as they were more numerous and more precise in the type of impact addressed. For each paper, we listed the dimensions targeted as economic, social, environmental, agricultural productivity and institutional. We manually grouped the approaches based on the number of dimensions they targeted and the characteristics of these dimensions. Then, for each dimension, we listed all the themes associated with it. Once all the dimensions and themes were listed for each paper, we performed frequency analysis to assess how many times each sustainability theme appeared under a given dimension.
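The frequency analysis described above amounts to counting, for each dimension, the number of papers in which a theme appears. A minimal sketch follows; the paper identifiers and the records are hypothetical placeholders, and the function name `theme_frequencies` is introduced only for illustration.

```python
from collections import Counter

# Hypothetical records: (paper id, dimension, theme).
records = [
    ("paper_A", "economic", "viability"),
    ("paper_A", "environmental", "biodiversity conservation"),
    ("paper_B", "economic", "viability"),
    ("paper_B", "social", "gender balance"),
    ("paper_C", "environmental", "biodiversity conservation"),
    ("paper_C", "economic", "viability"),
]


def theme_frequencies(records):
    """Count in how many papers each (dimension, theme) pair appears."""
    # A set removes accidental duplicates so each paper counts once per theme.
    unique = {(paper, dim, theme) for paper, dim, theme in records}
    return Counter((dim, theme) for _, dim, theme in unique)


freq = theme_frequencies(records)
for (dim, theme), count in freq.most_common():
    print(dim, "|", theme, "|", count)
```

Keying the count on the (dimension, theme) pair rather than on the theme alone preserves cases where the same theme is filed under different dimensions by different tools.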

Indicator description
To determine how the impacts of farming were approached in the different publications, we focused on the type of indicator used in the same way as the dimensions and themes. We listed all indicators used for describing the impact of farming activities. We defined indicators as the most disaggregated variable used in a tool for assessing sustainability in the hierarchy of variables (de Olde et al. 2016). Indicators are quantities of very diverse nature (data, calculations, observations, measurements), which provide information about variables that are more difficult to gain access to, or about more complex systems (Diazabakana et al. 2014). They indicate performance or reflect changes related to activities, projects or programmes (UNAIDS 2010). Indicator types were defined using the Driver-Pressure-State-Impact parts of the DPSIR framework (EEA 2000). Responses (R of the DPSIR) were not accounted for here, as they are actions undertaken after the assessment. Drivers were defined as actions aiming at modifying practices of farmers. Practices of farmers were considered to place pressure on the farming system to modify its outcomes. Conditions of the environment were considered to be state variables. Impacts were the consequences of the change of state. As the variety of impact is important, we decided to focus on a subset of three themes commonly addressed in the literature and belonging to three different dimensions of sustainability, to see whether the nature of the dimensions conditioned the type of indicator used. We focused particularly on the themes "farm economic viability", "social equity and gender balance" and "biodiversity conservation" to observe the variety of indicator types used in the economic, social and environmental dimensions, respectively.
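The classification step can be sketched as a tally of indicator types per theme. The indicators below are hypothetical and were assigned to driver, pressure, state or impact by hand, following the definitions given in this section; they are not taken from the reviewed tools, and the names `IndicatorType` and `tally_by_theme` are assumptions made for the example.

```python
from collections import Counter
from enum import Enum


class IndicatorType(Enum):
    """The four DPSIR categories retained in the review (responses are excluded)."""
    DRIVER = "driver"
    PRESSURE = "pressure"
    STATE = "state"
    IMPACT = "impact"


# Hypothetical indicators, classified by hand for illustration only.
INDICATORS = [
    ("agri-environment payment received", "biodiversity conservation", IndicatorType.DRIVER),
    ("pesticide application rate", "biodiversity conservation", IndicatorType.PRESSURE),
    ("number of crops in the rotation", "biodiversity conservation", IndicatorType.PRESSURE),
    ("share of semi-natural habitat", "biodiversity conservation", IndicatorType.STATE),
    ("change in farmland bird abundance", "biodiversity conservation", IndicatorType.IMPACT),
]


def tally_by_theme(indicators):
    """Return {theme: Counter of indicator types} for (name, theme, type) tuples."""
    counts = {}
    for _name, theme, kind in indicators:
        counts.setdefault(theme, Counter())[kind] += 1
    return counts


counts = tally_by_theme(INDICATORS)
for theme, by_type in counts.items():
    summary = ", ".join(f"{kind.value}: {n}" for kind, n in by_type.items())
    print(f"{theme} -> {summary}")
```

Such a tally makes it visible at a glance whether a theme is assessed mainly through practices (pressures) or through measured consequences (impacts).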

Diversity of tools and ways of improvements
We obtained five statistically different groups of approaches to assess sustainability at farm scale. The variables "Weighting" and "Orientation" were highly correlated with "Aggregation" and "Stakeholder involvement", respectively, and were therefore removed before the analysis. Thus, we ran the multiple correspondence analysis on 12 variables and retained a set of five axes explaining 70% of the variance in the sample. We performed AHC and applied Ward's criterion to select the typology maximising the inter-group variability and minimising the intra-group variability. We describe the different groups below based on Table 2.
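The clustering step can be sketched in a few lines. This is a naive illustration of ascending hierarchical clustering with Ward's criterion, not the authors' implementation: at each step it merges the two clusters whose union gives the smallest increase in within-cluster sum of squares. The points stand in for study coordinates on retained MCA axes and are hypothetical.

```python
def centroid(points):
    """Mean position of a list of equal-length coordinate tuples."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist2


def ward_clustering(points, n_clusters):
    """Merge the cheapest pair of clusters until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(pairs, key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Hypothetical coordinates of six studies on two axes.
points = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0), (9.9, 0.0), (10.0, 0.2)]
groups = ward_clustering(points, 3)
for group in groups:
    print(sorted(group))
```

The pairwise search is quadratic at every merge, which is acceptable for about a hundred studies; for larger datasets an established library implementation would be the more sensible choice.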

Group 1: long-term monitoring of farm activities (n = 12)
This group represents tools that use long-term monitoring of farm flows to assess the sustainability of the system, often using complex procedures. The studies in this group typically employ a sustainability framework from the literature and implement it without any contribution from stakeholders in terms of definition of indicators or sustainability themes targeted. Multiple measurement tools such as field monitoring, field observation or self-recording by farmers are implemented for long periods ranging from several weeks to months (Dalsgaard and Oficial 1997) or multiple years (Pretty et al. 2008) to follow trends in indicators. Soil sampling can be done to measure, e.g. the status of nitrogen in soils, or several surveys can be performed to record information on economic return. Data collected are quantitative. Once the data are collected, modelling frameworks are used to process the data, such as life cycle assessment (LCA) or emergy-based analysis. Both these frameworks give the highest weights to the environmental dimension in the sustainability assessment (Rodrigues et al. 2010; Jan et al. 2012; Repar et al. 2018; Acosta-Alba et al. 2019). In some cases, simulation models are used, e.g. to follow the flow of nitrogen in the system (Prado et al. 2011). Another example is the ECOPATH farm model, which uses networks of stocks (i.e. soil, plants and animals) and flows (i.e. inputs, outputs and recycled biomaterials) that connect stocks (Dalsgaard and Oficial 1997; Muthuprakash and Damani 2019).

Table 2
Description of the five groups of approaches and the most significantly frequent level(s) of each variable. The current literature on sustainability assessment of farming systems is mostly oriented towards survey assessment with simple indicators that can only provide a limited snapshot of the sustainability of a farm. Overrepresentation, which is the significance of the relationship between one of the variables and the group type, was tested with the V-test. A slight overrepresentation of the factor value in the group is represented with the symbol "*", a moderate overrepresentation with "**" and an important overrepresentation with "***" (Husson et al. 2015). No overrepresentation is indicated with

Group 2: ex-ante assessment of sustainability with bioeconomic models (n = 10)
This group represents tools that assess ex-ante the impact of various changes in agricultural policies on farm sustainability using bioeconomic models. There is no consistency within the group on the level of participation by stakeholders. The tools use an average level of complexity of indicators, with some combinations of variables and mostly indicators from the literature. The models are built using data from farm surveys on farm structure, activities, cropping system characteristics and prices, and test the impact of innovations on farm sustainability to serve as policy advice (Florin et al. 2012). Efficiency analysis is often employed to evaluate whether specific farms perform better than others (Mu et al. 2018). The tools are strictly quantitative and can be used for policy advice in a goal-oriented framework, for instance to reduce nitrogen leaching from the system (Belhouchette et al. 2012).

Group 3: survey- and indicator-based assessment tools (n = 43)
This group gathers the most frequently used tools and represents survey-based assessment of farm sustainability. The problem of sustainability is defined without any stakeholder input, with only simple indicators that can be calculated with a simple formula such as a sum or a ratio. Examples include agroecological indicators, such as efficiency measures (Felice et al. 2012), or practice-based indicators, such as use of mulching (Moore et al. 2014). Short to medium-length surveys are conducted in face-to-face interviews or using directly available databases such as the EU Farm Accountancy Data Network (FADN), which provides a platform for collection of farm statistics across Europe. Information collected is mostly quantitative and used for both farm analysis of strengths and weaknesses and policy advice (Tellarini and Caporali 2000; Dantsis et al. 2010).

Group 4: consultation-based assessment (n = 26)
This group gathers studies where stakeholders have been consulted on selection of indicators and on the definition of the problem of sustainability. This group encompasses frequently used tools such as the RISE tool (Response-Inducing Sustainability Evaluation), which is an indicator-based tool for holistic assessment of sustainability of agricultural production at farm level (Berbeć et al. 2018), or the SMART tool (Sustainability Monitoring and Assessment Routine - Farm Tool) (Schader et al. 2016), which is similar to the RISE tool. These tools assess the sustainability of agricultural production at farm level using comprehensive information on ecological, economic and social aspects collected through a questionnaire-based interview with the farmer (Grenz et al. 2012). Most of the data used are simple indicators, derived from data collected through on-farm visits or direct opinions of stakeholders gauged using the Delphi methodology (Escribano et al. 2018). Such assessment uses both quantitative and qualitative information with practice-based indicators.

3.5 Group 5: active engagement of stakeholder-based assessment (n = 15)
This group gathers studies where stakeholders are actively engaged in selection of indicators. Several participatory processes are proposed, such as participatory vulnerability assessment (PVA) (Chiwaka and Yates 2005) as employed by Veisi et al. (2015) or joint analysis (Coteur et al. 2018), with discussion on choice and structure of indicators (Troiano et al. 2019). In most cases, indicators or direct opinions of stakeholders, e.g. farmers (Goldberger 2011), are used together with simple indicators collected via farm surveys and personal views on the level of performance of their farm. This type of assessment uses mostly qualitative information with expert rating or practice-based indicators, which are calculated based on information on farmers' adoption of given agricultural practices. It differs from group 3 "Survey- and indicator-based assessment tools" by being more participatory in the choice and assessment of externalities of farming systems, but also in that the type of tool used with the analytic hierarchy process (AHP) allows the ranking of practices beneficial for the sustainability of farms.

We found that participation varies greatly among approaches with our categorical ordinal variable that describes the level of stakeholder engagement from "no participation" to "consultation" and "active involvement". Stakeholder participation is often presented as a binary process (with/without) (Binder et al. 2010; Schader et al. 2014), but stakeholder engagement in the assessment process is more variable. When using tools in group 4 "Consultation-based assessment", researchers only consulted stakeholders at a single phase of the sustainability assessment, such as rating farm performance, while in group 5 "Active engagement of stakeholder-based assessment", stakeholders collaborated on data collection and contributed expertise and opinions in selection and rating of indicators.
Involving stakeholders from the beginning of an assessment helps integrate various views on system functioning and local-level knowledge on aspirations and challenges to development (Reed 2008). Ultimately, stakeholder involvement encourages ownership of assessment outcomes and plans, which often seem incompatible but can actually lead to win-win or compromise situations in local agricultural development (e.g. Chopin et al. 2019b). The difficulty of describing stakeholder involvement across all phases of tool development and application prevented us from capturing involvement in each phase. In the most intensive involvement, stakeholders participate in defining indicators, aggregation methodology, timing and purpose of the evaluation, as in the case of the MESMIS tool (Lopez-Ridaura et al. 2005).

Need for more participation in farm assessment
Top-down tools generally lack emphasis on the local context. This type of tool encompasses the largest group of approaches (group 3 "Survey- and indicator-based assessment") and lack of stakeholder involvement can lower the relevance of results and uptake by end users (Binder et al. 2010). Participation (group 5) opens up issues, offers descriptive insights into local complexity (Veisi et al. 2014) and has been recognised as key to enhancing farmer engagement in sustainability assessment (Dong et al. 2016). In integrated participatory or transdisciplinary approaches, stakeholders are included throughout the whole process, which supports the reliability and suitability of tool implementation, acceptance of the results and the likelihood that practices are implemented (Binder et al. 2010). One of the faults of the bottom-up approach is the inconsistency of indicators from one case study to another, which prevents comparison of the sustainability level among different regions and different farming systems. It also prevents benchmarking approaches. In the present analysis, the lack of variables to assess stakeholder involvement in various stages of tool development and use prevented us from providing specific guidelines on this. However, a recent analysis of the practical use of sustainability assessment tools at farm level revealed that they only partially support farmers in their strategic decision-making, which reduces the likelihood of farmers adopting practices that could increase sustainability (Coteur et al. 2020). The same authors nevertheless acknowledged that tools have been evolving towards improved context-specificity and flexibility for use by farmers. Comparisons of tools by other authors, such as de Olde et al. (2017), have provided more accurate information on some variables, such as time of data collection, than was possible in the present case.

Combining tools and indicators for full assessment of farms
There are clear differences between the groups in terms of tool function and whether they target improvement of farming practices by identification of "strengths and weaknesses" or "policy advice". Group 3 "Survey- and indicator-based assessment of tools" can uncover strengths and weaknesses in farming systems, and these tools are comparable to "Rapid assessment tools" that focus on communicating and learning about sustainability and how it can improve within farming systems. Group 5 "Active engagement of stakeholder-based assessment" falls into the same category of "Rapid assessment", although it can also be used for other functions and with a broad range of stakeholders (e.g. farmers, experts, advisers) that provide their view on the sustainability of practices and issues at stake. On the other hand, full sustainability assessments that make use of detailed farm data and/or expert information help in comparing farms in a benchmarking approach and can stimulate social learning. Tools from group 1 "Long-term monitoring of farm activities" correspond to some extent to a full sustainability assessment and have a better system representation that can help compare farms with different types of production systems. The increasing complexity and insights into farming systems in groups 1, 3 and 5 can be located in the "Complexity" framework proposed by Coteur et al. (2016), which encompasses three levels of function. Level 1 uses tools from our group 3, with a rapid and basic assessment of sustainability to trigger interest in sustainability. Level 2 uses tools from our group 5, where multiple stakeholders are involved and farmers have more insights into their farming practices. Level 3 tackles specific problems with more extensive data collection and monitoring of farms with tools, as in our group 1.
This could be complemented with tools from group 2, where farm models could be used with farmers to see the effects of implementation of alternative farm practices and create the conditions to help farmers change their practices.
We found that aggregation does not discriminate the different tools and is actually applied statistically in some tools in each of the groups we identified. Some tools in all groups (1-5) aggregate indicators using tools such as additive value function or factor analysis (see e.g. Muthuprakash and Damani (2019), Jamalimoghaddam et al. (2019), Talukder et al. (2016), d'Errico et al. (2018) and Jacobi et al. (2018) for groups 1 to 5, respectively). Other typologies of sustainability assessment tools emphasise the importance of aggregation as a discriminating criterion (Gasparatos and Scolobig 2012). While bias of aggregation is regularly pointed out (Morse and Fraser 2005), it is useful for clarifying and simplifying data. Aggregation should only be seen as one way among others to facilitate exploration of data, while comparison of individuals using simple indicators should remain the most important part of the analysis (Morse et al. 2001). We also recommend testing several aggregation procedures in a sensitivity analysis to observe the extent to which farms perform differently under different aggregation. Although aggregation of indicators is used extensively, there is a spirited debate over the conceptual and methodological parameters employed (Cherchye et al. 2007). Before aggregation, normalisation takes place, using one of several different techniques with their advantages and drawbacks. Care is needed in choice of normalisation procedure (e.g. rank, distance to target, Z-score, max-min, proportionate methods), weighting and aggregation (e.g. equal weight and budget allocation), based on the question and purpose of the study. It is also important to be transparent about these choices, as they can have a great influence on the final values of composite indicators (Petkovová et al. 2020).
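The sensitivity analysis recommended above can be sketched as follows. This is a minimal illustration, not a procedure taken from any of the reviewed tools: the farms, indicator names, values and weight sets are all hypothetical, every indicator is assumed to be oriented so that higher is better, and only two normalisation procedures (max-min and Z-score) with a weighted sum are compared.

```python
from statistics import mean, pstdev

# Hypothetical farms and indicators; all values are invented for illustration.
FARMS = {
    "farm_a": {"income": 42000.0, "soil_cover": 0.35, "labour_score": 3.0},
    "farm_b": {"income": 30000.0, "soil_cover": 0.80, "labour_score": 4.0},
    "farm_c": {"income": 55000.0, "soil_cover": 0.20, "labour_score": 2.0},
    "farm_d": {"income": 36000.0, "soil_cover": 0.60, "labour_score": 5.0},
}

EQUAL = {"income": 1 / 3, "soil_cover": 1 / 3, "labour_score": 1 / 3}
INCOME_HEAVY = {"income": 0.6, "soil_cover": 0.2, "labour_score": 0.2}


def min_max(values):
    """Rescale values to the range 0-1 (max-min normalisation)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Centre on the mean and scale by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def composite(farms, normalise, weights):
    """Weighted sum of normalised indicators for each farm."""
    names = list(farms)
    totals = {name: 0.0 for name in names}
    for indicator, weight in weights.items():
        column = normalise([farms[name][indicator] for name in names])
        for name, value in zip(names, column):
            totals[name] += weight * value
    return totals


def ranking(totals):
    """Farm names ordered from highest to lowest composite score."""
    return sorted(totals, key=totals.get, reverse=True)


def sensitivity(farms):
    """Rank farms under every combination of normalisation and weighting."""
    results = {}
    for norm_name, norm in (("min_max", min_max), ("z_score", z_score)):
        for w_name, w in (("equal", EQUAL), ("income_heavy", INCOME_HEAVY)):
            results[(norm_name, w_name)] = ranking(composite(farms, norm, w))
    return results


if __name__ == "__main__":
    for choice, order in sensitivity(FARMS).items():
        print(choice, order)
```

Reporting the rankings obtained under each combination side by side is one simple way to be transparent about how much the composite indicator depends on these choices.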
This typology of sustainability assessment tools revealed potential for development of tools showing the current state of sustainability of farms and valuing their impact in a dynamic manner, but with the need for more resources in terms of data collection. Group 1 "Long-term monitoring of farm activities" relies on a very intense data collection process for modelling and calculating system functioning. This limits the number of farms that can be monitored in terms of sustainability and requires some models to represent the system functioning. On the other hand, it provides more than a snapshot of sustainability level of the system. For group 2 "Ex-ante assessment of sustainability with bio-economic models", there is the same need for model development, validation and data collection. In both approaches, the number of indicators investigated is rather limited, but the range of values that can be explored ex-ante by means of simulation is important. We recommend that model-based assessment uses long-term monitoring of farm outcomes to create indicators for use in future simulations on the impact of innovation introduction or new policies.
The tools in group 3 "Survey- and indicator-based assessment of tools" are quick, and thus particularly suitable for rapid diagnosis of sustainability, and they can integrate different types of indicators, namely pressure or impacts. In most tools of this group, data treatment is still needed, so feedback cannot be provided directly to farmers for a specific case. Such tools linked to a decision-support system could provide direct results of the assessment to be discussed with a trained advisor, as in the RISE tool (Grenz et al. 2012). They can be considered rapid sustainability assessments that raise awareness among farmers of key farming practices and highlight the strengths or weaknesses of farming systems in terms of sustainability with tools such as Pg Tool or COSA. They could also form the basis for self-assessment of sustainability by farmers that lead to suggestions for changes in practices (e.g. Sustainable Agriculture Initiative (SAI) platform https://saiplatform.org/fsa/).
In group 5 "Active engagement of stakeholders based assessment", tools rely on expertise, and particular attention should be given to the choice of experts and level of expertise required. In these tools, indicators are of low complexity, and some models could eventually increase this complexity to provide more insights into the state and potential impacts of the farming systems. These models could be participatory simulation models, which would ensure continuing interest and learning for stakeholders (Smetschka and Gaube 2020).
The tools reviewed clearly have an orientation towards a given dimension of sustainability. Hence, group 1 "Long-term monitoring of farm activities" focuses on environmental sustainability, where biophysical processes are described in a precise manner, but socioeconomic aspects may be lacking or simplified. Despite increasing general interest in social aspects of sustainability, Social-LCA (Benoît-Norris et al. 2011), Life Cycle Costing (Dwaikat and Ali 2018) or other similar methods were not used in any of the papers reviewed in the present analysis. There are still several limitations of social indicators in LCA, such as the absence of clear definitions and a comprehensive evaluation of sustainability within Life Cycle Sustainability Assessment frameworks (Subramanian et al. 2018). Another major limitation when it comes to tools oriented towards the environment is the poor emphasis on labour, both considering the costs along the supply chain and in terms of workload (Pena and Rosa Rovira-Val 2020). Labour represents a good proxy of economic and social issues. Agricultural economists focus on labour efficiency and rural sociologists on division of labour by gender, but such issues are generally not addressed by environmental scientists performing sustainability analyses (Malanski et al. 2019). Environment-oriented tools currently lack indicators that reveal the social perspective on sustainability, as pointed out for LCA by van der Werf et al. (2020) regarding land degradation, pesticide issues and biodiversity losses. According to Arulnathan et al. (2020), a better justification of choice of themes in relation to the local context would increase the relevance and confidence of the assessment results for end users. The present analysis of dimensions and themes in each method provides more insights into the array of frameworks that can be adopted to assess sustainability and the diversity of themes per dimension.

6 Adopting more complex sustainability framings to address the entire complexity of farming systems
Seven ways of framing the sustainability problem were identified, depending on the number of dimensions addressed and the characteristics of the different dimensions targeted (Fig. 3).
The "classical view" incorporating the economic, social and environmental dimensions of sustainability was used by nearly two-thirds of the studies (Fig. 3A). The names of the dimensions may slightly differ between studies. For example, the environmental dimension can be called ecological, agroenvironmental or agro-ecological, but generally refers to the same set of themes, and the social dimension may be called community or territory. Another type of study is similar but incorporates a dimension dedicated to production, called agronomy or technical system, which represented 6% of the sample (Fig. 3D). In this group, one dimension can be missing, which makes it closest to the group called partial view of sustainability (Fig. 3B). This gathers sustainability assessments framed on two dimensions only, always with the focus on the environment and balanced with one of the other dimensions. The "governance oriented view" represented 13% of the sample (Fig. 3C). It adopts the same framing as the classical view of sustainability, but introduces an institutional dimension that assesses how the farm fits in a larger institutional and political context. The "livelihood view" comprised 5% of the studies reviewed. Some recently established system-oriented approaches, here called "systemic view", use system themes of autonomy, adaptability, productivity, equity or stability that cannot be considered social, environmental or economic (Fig. 3E, F). Another framework used in a few studies employs a more embedded perspective of sustainability (Fig. 3G), where impacts of the farming system on workers, the local community and society are considered (De Luca et al. 2015).
Fig. 3 Proportion of studies reviewed (n = 96) that used each of the seven framings (A. to G.) of sustainability at farm scale identified here based on the type (e.g. economic, institutional, financial, self-reliance) and number of dimensions targeted (from 3 to 6 depending on the type). We show that most studies (62%) use a "classical view of sustainability" which is limited in terms of dimensions targeted

In this analysis, 61% of the tools reviewed rely on a characterisation of sustainability as a co-evolution of economic, social and environmental systems (here called "classical view of sustainability"). This represents a "triple bottom line" (René 1979; Kastenhofer and Rammel 2005; Adam 2006) historically used in farm sustainability assessments (Hansen 1996). Other perspectives on sustainability have been adopted to a minor extent. For instance, a governance or institutional sphere is sometimes introduced following the "Four Spheres" perspective from the formalised "Tetrahedral Model" (O'Connor 2006). It was introduced in 1996 when the United Nations provided a set of Indicators of Sustainable Development covering the social, environmental, economic and institutional dimensions (United Nations 2007). In this model, good governance and links to institutions are required to achieve a sustainable state of the agricultural system. This governance-oriented sustainability clearly states values and responsibilities of the farm and ensures its transparency and accountability. Governance plays a significant role in ensuring productivity, efficiency and equity in agricultural systems (Dasgupta and Roy 2011). Effective governance is considered the basis for agricultural sustainability (Vanloon et al. 2005; Talukder et al. 2020). The broader perspective, including institutional context in the assessment of farms, is based on the notion that farms are not closed systems but operate in interrelations with a larger socio-ecological system.
This means that the externalities of farms need to be captured in the assessment. The use of the themes "Contribution to society" and "Higher contribution to society" shows that this interrelation has been formalised by splitting the assessment of farms into an "autocentred" vision of the sustainability of the system, where "farm" is a closed system, and a vision that is referred to as the "contribution to sustainable development", where "farm" interacts with a larger system (Alkan Olsson et al. 2009). We recommend that practitioners account for institutional/governance sustainability when approaching farm sustainability and seek to capture both the sustainability of the farm itself and its contribution to sustainable development of the socioecological system and the larger society for various functions. Such indicators can be used for communicating with farmers, suggesting farm practice changes and formulating policy advice embracing the different levels of complexity of farm assessment tools described by Coteur et al. (2016). As complexity increases in farm assessment, indicators providing insights into farming system functioning need to be adapted to the level of complexity of the tool and function. One example at the first level of complexity, which corresponds to communication of information to farmers, is the tool used by Dodd et al. (2008). It questions governance by simple indicators of, e.g., compliance of the farm management plans with regulations or landowner independence. At the third level of complexity, suggesting farm functioning change, SAFA has built on existing corporate social responsibility to establish the governance dimension, which captures various properties of farming systems, namely corporate ethics, accountability, participation, rule of law or holistic management.
The agronomy-technical dimension, here referred to as the "classical productive view of sustainability", shows the necessity for agricultural systems to be economically efficient but also to fulfil their primary function of producing food. Under this framework, farms with low production levels, despite good economic, social and environmental outcomes, would not be considered sustainable. Having agronomic productivity as a dimension moves the focus from an environmental perspective to a production perspective, to show the importance of farms being sustainable per se, but also contributing to a higher socio-ecological system. Concretely, this can be done with an indicator, such as yield of the main product.
The "livelihood-oriented view" provides more insights into the household characteristics that can support the sustainability of the farming system. This framework is mostly applied in developing countries, but it could also be used in developed countries, where household characteristics such as other sources of income represent a coping strategy to face extreme events in terms of financial capital, and thus affect the sustainability of the farming system. The "systemic view" approaches frame farm sustainability with dimensions radically different from those of other frameworks, replacing the economic, environmental and social dimensions with system dynamic functions that ensure viability over time, including "viability", "risk", "stability" and "vulnerability". Such themes are of particular relevance when assessing the ability of a farming system to cope with climate and market disturbances (Tittonell 2014). De Luca et al. (2015) adopted an embedded sustainability view, with human embedded into community and community into society. This operational model with dimensions grounded into each other, called strong sustainability (Ott 2003), considers an economy relying on society, which in turn relies on the environment. This type of grounded perspective is not applied in any of the tools found in the literature. Most sustainability assessment approaches actually consider the different dimensions independently from each other, rather than in combination. Hence, low environmental sustainability is not pertinent in the economic sustainability rating. While this type of embeddedness of the sustainability problem has been presented theoretically, studies in practice at farm scale seem reluctant to introduce this dependency among the different dimensions.

7 Integrating indicators for institutional sustainability and system properties to better represent the externalities of farming systems
In total, 64 different themes of sustainability were addressed in all tools, with on average 10.4 ± 4.2 per tool. The number of themes under each dimension was 8.1 ± 3.7 for environmental, 3.6 ± 1.8 for social, 2.7 ± 1.5 for economic, 3.7 ± 2.0 for governance and 4.5 ± 3.2 for agronomic dimensions (Fig. 4). The diversity of themes was higher for the economic and social dimensions (20 themes each) than for the environmental dimension (11 themes). The dimension of governance was employed 11 times with eight different themes. The agronomy dimension was employed 12 times, with five themes addressed (Fig. 4).
The environmental dimension was most frequently addressed, but only 11 themes were used, mostly related to "Agricultural management" (46), "Soil" (31) and "Water quality and quantity" (30) (Fig. 4). The social themes were employed 120 times for 20 different themes. Four themes in particular stood out, "Working conditions" cited 20 times, "Social involvement", "Higher contribution" and "Social capital". The remaining 16 themes were not often employed in the papers reviewed and included family characteristics (e.g. "Equity", "Culture", and "Food self-sufficiency") or farm-based properties (e.g. "Autonomy", "Human capital"). For the economic themes, the pattern was quite similar to the social dimension, with three themes frequently addressed, "Viability", "Profitability" and "Productivity", and another group with 8-10 citations, namely "Autonomy", "Efficiency", "Investment" and overall "Economic performance". A range of little-addressed themes referred to the dynamics of the farm system: "Risk", "Stability", "Vulnerability", "Farm structure", "Resource self-sufficiency", "Market accessibility" and "Subsidies".
There was some overlap among the different dimensions in terms of themes addressed. In particular, the social and institutional dimensions shared some common themes, e.g. "Contribution to society", "Ethics and human development", "Culture" and "Social involvement". Similar overlapping appeared between the social and economic dimensions, with "Labour" (Working conditions), "Autonomy" and "Revenue and expenses" (Profitability), which clearly shows the interlinkages between the social and economic conditions in farming.
Themes were linked to dimensions based on the dimension under which the respective study categorised them, which explains why some themes appear in two dimensions. Social and economic display more themes on average than the environmental themes, although less cited, which shows the lack of a complete picture on social and economic aspects of farming in most assessments

The scope of assessment could be broadened by integrating indicators for institutional sustainability and system properties to better represent the externalities of farming systems. Institutional indicators cannot be assessed in bio-economic models used in group 2 but could be side indicators assessed via expertise. System properties would require dynamic models able to assess the stability of systems facing disturbances. When it comes to resilience, as external drivers such as climate change or market volatility become stronger, adding related themes will become more important to capture the resilience of a system, its viability and its robustness. Many indicators capture resilience for cropping systems (Urruty et al. 2016), but not for farming systems, despite a growing need (Tittonell 2014).
We found that the diversity of themes was higher for economic and social dimensions, despite fewer themes being addressed on average in each tool, which shows less consensus on the themes required to address these two dimensions compared with the environmental dimension. This can be explained by the fact that environmental indicators have undergone a boom (Riley 2001), which has certainly led to more clarity on the various impacts of agriculture on the environment and has eased their assessment with methodological developments. Social indicators have undergone less development (Boström 2012;Latruffe et al. 2016). Our results on the diversity of themes can potentially be used as a guide so that practitioners (those performing the assessment) can see the variety of sustainability themes that can be assessed under the social and economic dimensions. Social sustainability can be approached by widening the perspective of tools to capture personal characteristics of farmers, including "experience", "identity", "family" and "human capital" or "human conditions" that are drivers of farm viability, as used, e.g., in the Sustainable Intensification Assessment Framework (SIAF) (Musumba et al. 2017). Economic sustainability can be broadened by including more aspects referring to the interaction between the farm and the larger socioecological system, including farmer population, society, markets and higher governance, such as "self-sufficiency", "subsidies", "multi-functionality", "market accessibility" and "economic contribution".

Moving from driver, pressure and state-oriented indicators to impact indicators
Overall, a large diversity of indicators was used for the three themes "Biodiversity", "Equity" and "Viability", but covering mainly "pressure" and "state" of the system in the DPSIR framework (Fig. 5). Only two out of 29 indicators were able to value the impact of farming systems on the sustainability of farming or the larger socioecological system. In proportion to the number of citations, the "Viability" theme had the fewest indicators, with only four found. This shows that metrics are largely shared among scientists to assess the viability of farming. One of the most used is farm income, a state variable of the potential viability of the farm, which was assessed directly by some papers in a very qualitative manner by asking experts whether the farm had a good level of income or not. Some indicators of stability of income were also approached by qualitative appraisal. Only one potential driver of viability was identified in the literature studied, namely insured area, which could help a farm remain viable despite unexpected events (Gómez-Limón and Sanchez-Fernandez 2010). For equity, a larger range of indicators embracing both the gender balance in activities and revenue and the equity of rights among people involved in farming was used. As state indicators, direct participation of women in activities, gender-based income differential and overall level of equity were approached qualitatively. There was no indicator of a practice that could actually enhance equity and gender balance. Only one driver to increase potential implementation of such practices at farm level was mentioned, which was promotion of social justice/equity (Goldberger 2011). Some indicators assessing "gender equality" could potentially be sought at farm level to capture the effective participation of women, not only in farming activities, but also in operational, strategic decisions related to these activities, asset management or cash access. Most indicators addressing the impact of farm practices on biodiversity were pressure and state indicators of various types. Crop, animal and genetic diversity was one category of state, assuming that the greater the state of diversity in crops and animals the more positive the impacts in favour of wild biodiversity. At farm level, eco-toxicity in relation to pesticide use and eco-friendly practices were also mentioned. Besides species-rich field borders, provision of beneficial habitats (e.g. natural patches), connectivity with natural land uses and higher-scale landscape diversity around the farm were mentioned. Potential drivers of biodiversity were also highlighted, such as involvement of farmers in a biodiversity conservation plan and farmers' knowledge of biodiversity conservation, which was not actually measured in the paper that cited it.

Fig. 5 Listing of all the 29 indicators extracted from the 124 papers for the assessment of biodiversity conservation, social justice and equity that shows the lack of indicators for approaching impacts of farming systems. Most indicators are pressure- or practice-oriented (31%), supposedly correlated to the final impact or state of the system (50%), and may not provide sufficient information on whether the assessed farming system has a positive impact. Photo credit: Jeff Black
The tools reviewed in this paper combine many different types of indicators, e.g. drivers, pressure, state and impacts in the DPSIR framework. For example, SAFA combines different indicators that are weighted more strongly if the information they provide is more precise. Thus, a full sustainable target-based indicator has a quantified score of 1, a practice-based indicator a score of 2 and a performance-based indicator a score of 3 points. The rationale for sustainability assessment is that end users want to know the final impacts of farming systems on society and the planet. Some indicators contribute to assessing directly the impact of farming practices and are very precise, while some rely mostly on "pressure", "state" or "driver" indicators from the DPSIR framework. The relationship between pressures and drivers on one hand and state and impacts on the other hand is not linear and even the direction of change of impact with implementation of practices can sometimes be unclear. For instance, diversity of landscape and cropping systems is generally known to increase biodiversity richness, but counterexamples exist since this ecosystem service is context-dependent. Hence, for biodiversity conservation in contexts where little is known about the impact of crop diversity on wild biodiversity, impact indicators should be prioritised, such as the number of grassland species. For "social equity and gender balance", very few farm level indicators were used in the reviewed literature. Gender equity at farm level is currently assessed by looking at women's workload and women's governance aspects in terms of decisions, which are clear impact indicators. Solely promoting better gender equity with a "driver" indicator cannot ensure that gender equity is respected in farming contexts. More indicators could be produced and tested in different farming contexts to effectively capture empowerment of women in farming.
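The idea of weighting indicators by the precision of the information they provide can be illustrated with a short sketch. The points per indicator type (1, 2 and 3) are those quoted above; combining them through a weighted mean of ratings on a 0-1 scale is an assumption made here for illustration only and is not claimed to be how SAFA computes its scores. The function name and the example ratings are hypothetical.

```python
# Points per indicator type as quoted in the text; their use as weights in a
# weighted mean is an illustrative assumption, not the documented SAFA method.
TYPE_POINTS = {"target": 1, "practice": 2, "performance": 3}


def weighted_theme_score(indicators):
    """Weighted mean of indicator ratings for one theme.

    `indicators` is a list of (indicator_type, rating) pairs, with each
    rating on a 0-1 scale (hypothetical convention).
    """
    total_points = sum(TYPE_POINTS[kind] for kind, _ in indicators)
    if total_points == 0:
        raise ValueError("at least one indicator is required")
    weighted = sum(TYPE_POINTS[kind] * rating for kind, rating in indicators)
    return weighted / total_points


if __name__ == "__main__":
    # Hypothetical theme with one indicator of each type.
    theme = [("target", 1.0), ("practice", 0.5), ("performance", 0.2)]
    print(round(weighted_theme_score(theme), 3))
```

Under this scheme a poor rating on a performance-based indicator pulls the theme score down more than the same rating on a target-based indicator, which mirrors the rationale that more precise information should count more.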
Indicators assessing "gender equality" at farm level could capture the effective participation of women not only in farming activities, but also in operational and strategic decisions related to these activities, asset management or cash access. For farm viability, farm income as a state indicator is the most commonly used indicator. It provides a good picture of the profitability of the system, but its use without thresholds cannot really provide information on whether a farm is viable or not. Discussing thresholds with stakeholders depending on the farming system and context would help develop viability indicators with values that could provide information to end users on whether farming systems reach targets or not.
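As a minimal illustration of the point on thresholds, the sketch below compares farm income with a context-specific viability threshold and reports whether the target is reached. It is not taken from any of the reviewed tools: the farm identifiers, farming systems, income values and thresholds are all hypothetical, and in practice the thresholds would come from discussion with stakeholders.

```python
from dataclasses import dataclass


@dataclass
class ViabilityResult:
    farm_id: str
    income: float
    threshold: float
    ratio: float   # income divided by the agreed threshold
    viable: bool   # True when income meets or exceeds the threshold


def assess_viability(farm_id: str, income: float, threshold: float) -> ViabilityResult:
    """Compare farm income with a context-specific viability threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return ViabilityResult(
        farm_id=farm_id,
        income=income,
        threshold=threshold,
        ratio=income / threshold,
        viable=income >= threshold,
    )


# Hypothetical thresholds per farming system (assumed to be agreed with
# stakeholders; values and units are purely illustrative).
THRESHOLDS = {"mixed crop-livestock": 1200.0, "cereal-legume": 900.0}

# Hypothetical farms: (identifier, farming system, annual income).
FARMS = [
    ("farm_A", "mixed crop-livestock", 1500.0),
    ("farm_B", "cereal-legume", 600.0),
]

results = [
    assess_viability(farm_id, income, THRESHOLDS[system])
    for farm_id, system, income in FARMS
]

for r in results:
    status = "reaches" if r.viable else "does not reach"
    print(f"{r.farm_id}: income {r.income:.0f} {status} "
          f"the threshold {r.threshold:.0f} (ratio {r.ratio:.2f})")
```

Reporting the ratio alongside the yes/no outcome is one possible design choice: it keeps the state information (income relative to the target) visible while still telling end users whether the farming system reaches the agreed target.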

Conclusions and future outlooks
This review revealed that sustainability assessments at farm level are conducted with five different tool groups differing primarily in terms of model used for assessment and stakeholder involvement. These tool groups represent farm sustainability in very different ways, although the classical view of sustainability with economic, social and environmental sustainability dominates. Environmental themes are addressed consistently in the different groups, but for social, economic and governance/institutional sustainability there is no consensus on the themes to assess. In terms of indicators, we identified a lack of impact indicators for two themes, namely "social equity and gender balance" and "biodiversity conservation".
We recommend that practitioners performing farm assessments adopt a broader view of sustainability, including a governance/institutional dimension, and employ a larger list of components regarding social aspects related to farmer characteristics and system properties linked to the economic dimension, such as viability, risk, robustness and stability. All stakeholders should be actively involved in the framing of sustainability and in the choice of indicators, in order to achieve better transfer of assessment outcomes. Finally, impact indicators should be identified and tested in various contexts and applied in farm sustainability assessments to provide a better understanding of the impact of farming systems on farm sustainability and on larger socio-ecological systems, including society, markets and ecosystems. These recommendations should account for the purpose of the tool, which can vary from communication of results to redesigning farming systems, the issues at stake in the context of the assessment and the resources available for conducting the assessment.
Funding Open Access funding provided by Swedish University of Agricultural Sciences. This project was funded by the Swedish International Development Cooperation Agency (Sida) through the AgriFoSe2030 initiative. This work is part of the CGIAR research programme on Grain Legumes and Dryland Cereals (GLDC).
Data availability The datasets generated during the current study are available in the Zenodo repository, 10.5281/zenodo.4429255.

Declarations
Conflict of interest The authors declare no competing interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.