How to quantify the impacts of diversification on sustainability? A review of indicators in coffee systems

Despite the potential of diversification strategies to achieve sustainability, diversified systems such as agroforestry are still not widely implemented by farmers, which indicates the need to further understand and adequately assess the impacts of diversification to inform the design of complex systems. In this paper, we conduct a systematic literature review focused on agroforestry coffee systems, to assess (i) how current methods and indicators are used to quantify the impact of diversification on multiple dimensions of system sustainability, and (ii) to assess the impact of diversification through coffee agroforestry on multiple dimensions of sustainability. Our analysis was based on 215 selected papers and all the indicators identified could be classified in one of the sustainability dimensions proposed in our framework: ecosystem services (57.2%), biodiversity (35.6%), input use (4%), socio-economic sustainability (2.7%) and resilience capacity (0.5%). Despite the broad scope of the indicators, individual studies were found to often lack interdisciplinarity and a systemic view on agroecosystems. Besides, not only were there few studies that included the impacts of diversification on input use, socio-economic sustainability and resilience capacity, but specific biodiversity attributes (e.g. functional diversity, landscape diversity) and ecosystem services (e.g. soil biological quality, water regulation, pollination) were generally underreported. The impact of diversification was more positive than negative in all dimensions of sustainability, with the exception of crop productivity. Yet, diversified systems are associated with reduced costs and high yields can still be achieved in diversified systems with appropriate agricultural management (e.g. adequate number and type of shade trees). 
Key to reaping the benefits of diversified systems is that the diversity of elements is carefully integrated considering the impact on multiple dimensions of system sustainability. A better understanding of synergies and trade-offs remains crucial for the customized design of diverse and sustainable systems for a variety of geo-climatic conditions.


Introduction
Diversification strategies, such as agroforestry and intercropping, have been increasingly considered as viable options to achieve more sustainable farming systems (Kremen and Miles 2012;Hufnagel et al. 2020). In the case of agroforestry systems, for instance, the diversity of shade trees is expected to provide a variety of ecosystem services (i.e. benefits derived from nature to people), contributing to socio-economic sustainability (de Souza et al. 2012), resource-use efficiency (Nair 2017) and resilience against environmental changes (Gomes et al. 2020). Studies claim that plant diversity is related to a higher associated diversity of organisms, including microorganisms as well as vertebrates and invertebrate animals (Scherber et al. 2010;Leakey 2014;Duru et al. 2015). The high biodiversity in diversified agroforestry systems has intrinsic value, but is also suggested to enhance ecological processes and, in turn, to help maintain and regulate a variety of ecosystem services, such as pollination, pest control, crop production and water regulation (Swift et al. 2004;Isbell et al. 2017b;Santos et al. 2019). The provision of ecosystem services is expected to influence socioeconomic factors, including monetary assets, and other issues related to well-being, such as food security and spirituality (Reyers et al. 2013;Heckwolf et al. 2021). In addition, agroecosystem functioning is associated with input use and requirements, in terms of pesticides, fertilizers and labour. Changes in input use associated with agroforestry can affect not only ecosystem properties and processes, but also farm economic and social sustainability (Jezeer et al. 2018;Rahn et al. 2018a). As agricultural systems are not static, the adoption and development of agroforestry systems is constantly facing socio-environmental changes, including climate change, and changes in consumer behaviour and market demands. 
The way systems react to disturbances and changes over time and their capacity to resist and recover from external shocks (i.e. resilience) is thus a key element of system sustainability (Meuwissen et al. 2019;Valencia et al. 2019;Córdoba et al. 2020). Despite the potential of agroforestry and other diversification strategies to achieve sustainability, diversified systems are still not widely implemented by farmers across the globe. For instance, Pretty et al. (2018) estimated that less than one-third of all worldwide farms adopt sustainable practices (including practices related to diversification) in approximately only 9% of the total agricultural land. Therefore, it is key to further understand and adequately assess the impacts of agroforestry (and other diversification strategies) on sustainability to inform the design of complex systems.
Sustainability can be defined as the capacity of agricultural systems to satisfactorily perform multiple functions continually over time across the environmental, social and economic domains (Trigo et al. 2021). Considering that sustainability entails social, economic and environmental aspects of farming, addressing broader impacts of diversification strategies requires an integrative and systemic perspective on ecosystem functioning and its multiple benefits for people. Despite the increasing body of theoretical and experimental studies on diversification (Beillouin et al. 2019), addressing its impacts using integrated assessments remains a challenge due to broad concepts and inconsistent definitions as well as a great variety of indicators that are used on different spatial levels and that are associated with different dimensions of system sustainability (Boerema et al. 2017;Bünemann et al. 2018;Hufnagel et al. 2020). For instance, recent reviews on diversification are useful to reveal general trends based on a large database of studies (Beillouin et al. 2019;Hufnagel et al. 2020;Tamburini et al. 2020), but do not combine indicators of ecosystem services with indicators of socio-economic sustainability and resilience capacity of agroecosystems, while the latter are relevant issues for policies and farmers. The use of indicators that compose a comprehensive and systemic framework addressing the multiple components of system sustainability can help not only to align research findings with policy and societal needs, but also to better understand how and to what extent specific diversification strategies such as agroforestry can contribute to multiple aspects of agroecosystems sustainability. Yet, it remains unclear how indicators found in the literature can be classified and applied to provide an integrated assessment of diversification.
In this paper, we propose a general framework to quantify the impacts of diversification on sustainability and use coffee agroforestry as a case study (Figure 1) to explore these issues in depth and then discuss how ideas thus conceived can be generalized. We chose coffee agroforestry because this system has received much attention in relation to diversification. Coffee is one of the most important commodity crops in the world, being a perennial and very climate-sensitive crop. Coffee is grown mostly by small-holder farmers, which makes it particularly vulnerable to climate, economic and other perturbations.
The central research questions of this paper are (i) how are current methods and indicators used to quantify the impact of diversification through coffee agroforestry on multiple dimensions of system sustainability? and (ii) what is the impact of diversification through coffee agroforestry on multiple dimensions of sustainability? We address these questions with a systematic literature review focused on coffee systems. The broad scope of studies and indicators found in literature were systematized and classified according to an integrated framework developed and applied in this study. The framework highlights the impacts of diversification on five dimensions of system sustainability: biodiversity, ecosystem services, socio-economic factors, input use and resilience. The framework served as a basis to identify general characteristics of the studies, which indicators are used considering multiple spatial levels of analysis, how these indicators relate to each other and how they relate to diversification strategies.

A framework to quantify the impacts of diversification
Our review was conducted based on an integrated conceptual framework to assess the impacts of diversification and to inform the development of more sustainable food systems ( Figure 2). The framework was initially developed based on discussions with specialists from different scientific disciplines and literature review of the main challenges associated with sustainable agriculture. Five main challenges were identified: (i) biodiversity loss (Scherber et al. 2010;Leakey 2014;Duru et al. 2015); (ii) low provision of multiple ecosystem services (Swift et al. 2004;Dainese et al. 2019;Wan et al. 2020); (iii) social and economic vulnerability (Reyers et al. 2013;Heckwolf et al. 2021); (iv) strong dependence on inputs (Jezeer et al. 2018;Rahn et al. 2018a); and (v) low resilience to socio-economic and ecological shocks or stresses, which are increasing in frequency and intensity (Meuwissen et al. 2019;Valencia et al. 2019;Córdoba et al. 2020). These sustainability challenges are associated with various goals of sustainable agriculture and development (United Nations 2015). For instance, biodiversity loss is linked to life on land; ecosystem services provision to zero hunger and clean water; reducing input use to responsible production; socioeconomic vulnerability to no poverty; and occurrence of shocks (such as climate change) is linked to climate action. Based on these five challenges and discussion among the interdisciplinary group of co-authors, a framework was developed to quantify the direct and indirect impacts of diversification on five main dimensions of sustainability: (i) biodiversity, (ii) ecosystem services, (iii) input use, (iv) socio-economic factors and (v) resilience capacity. As some dimensions are quite broad and encompass a large variety of indicators, we have also defined sub-categories within each dimension. The final set of sub-categories was determined based on an iterative process throughout the systematic review. 
In other words, although we previously defined sub-categories, those were fine-tuned according to the type of indicator found in the literature. During the iterative process, specific definitions were established for each of the sub-categories (see Section 3.1.2). In the framework, plant diversification is defined as management changes to increase plant variety in terms of structure and composition, considering multiple temporal and spatial levels (Beillouin et al. 2019). The impact of diversification strategies is generally reported in literature through the comparison of different types of systems (e.g. diversified vs non-diversified or gradient of diversity) instead of analysing changes in a given system over time (De Beenhouwer et al. 2013). Therefore, in our review, we consider diversification as a gradient, and the impact of diversification is based on the assessment of systems with different degrees of diversity. Levels of diversity can be defined based on contrasting treatments (e.g. full-sun coffee system vs single tree species agroforestry vs multispecies agroforestry) or measures related to species diversity (e.g. species richness). Although the different dimensions can be quantified separately, they are not understood as independent system components, but rather as complementary and interconnected dimensions of an integrated network. Socio-ecological drivers that may impact diversification and other components of the system, such as local biophysical conditions and access to markets and policies, are also acknowledged in the framework. However, these co-drivers of diversification and sustainability are not the focus of this review, as our main objective is to assess the impacts of, and not on, diversification. The various components of the system can be identified and/or quantified on different hierarchical levels, since the impacts of diversification can be measured at organism, field, farm and/or landscape levels. 
As the sustainability of agroecosystems is dynamic and constantly facing social and environmental changes, a temporal component (t) is highlighted in the framework to address the resilience of farming systems against disturbances.

Search strategy
The systematic review was based on a literature search carried out in June 2020 on the Web of Science platform. The search was restricted to articles published from 1945 to 2020 in peer-reviewed scientific journals in English. The search string was ((coffee*) AND (agroforest*)), applied to the 'article title', 'abstract' and 'keywords' fields.
Figure 2. Framework to assess the impacts of diversification on system sustainability. Five main dimensions of sustainability are indicated in the coloured boxes: biodiversity, ecosystem services, input use, socio-economic factors and resilience capacity. The impact of diversification on the five different dimensions is represented with the black arrows. Socio-ecological drivers that may impact diversification and other components of the system are indicated in the grey boxes. Dashed arrows represent feedback loops among system components. A temporal component (t) is added to address how system sustainability responds to disturbances over time.
The search yielded 651 papers. The title, abstract and keywords of each paper were saved in pdf format. Each paper was numbered and a first screening was conducted. This screening followed two criteria: (i) papers were removed if they were unrelated to the impacts of diversification strategies in coffee systems, and (ii) review studies were removed to avoid duplicates and because they had different specific criteria to select papers. This first screening excluded 327 papers. The 324 remaining papers were downloaded, and a full-text screening was conducted to further select appropriate papers according to additional specific eligibility criteria: (i) papers should compare diversified systems with non- or less-diversified systems; (ii) the comparison could be based on two or more contrasting treatments (i.e. diversified vs non-diversified) or on a bi- or multivariate relationship between one or more explanatory and response variables (i.e. response to gradual change in system diversity). After the full-text screening, the number of selected papers was narrowed down to 215.
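The screening counts reported above can be tallied in a few lines. This is only a minimal bookkeeping sketch: the variable names are illustrative assumptions, and the number of papers excluded at the full-text stage is derived here by subtraction rather than stated in the text.

```python
# Counts taken from the text; variable names are illustrative assumptions.
records_identified = 651          # papers returned by the Web of Science search
excluded_first_screening = 327    # removed after title/abstract/keyword screening
selected_after_full_text = 215    # papers retained for the review

# Papers that went on to full-text screening.
remaining_after_first_screening = records_identified - excluded_first_screening

# Derived value (not stated in the text): papers excluded at the full-text stage.
excluded_full_text = remaining_after_first_screening - selected_after_full_text

print(remaining_after_first_screening)  # 324
print(excluded_full_text)               # 109
```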
Although this systematic review was not registered in an official database, a protocol was developed before the research began. The systematic review flow chart is available in Appendix B. Choosing just two terms ((coffee*) AND (agroforest*)) kept the search broad enough and avoided capturing papers that have nothing to do with our research question. This trade-off was defined early on by comparing the search results of a series of different search terms. As an example, a search on the papers published in 2020 using 'coffee* AND multifunction* OR coffee* AND trees OR coffee* AND biodiversity' returns no additional paper of interest as compared to the 'coffee* AND agroforest*' search.

Characterization of the selected papers
We documented general information for each study, such as main author, year of publication, journal, country (or countries) where the study was conducted, objectives, main conclusions and treatments. We then classified each study into one or more of the sustainability dimensions defined in our framework according to the type of indicators used. The same study can contain multiple indicators and can therefore be classified in more than one sustainability dimension. In addition, we assessed whether the study was based on predictive models, empirical quantitative and/or qualitative data; whether interactions between response variables were considered; and whether the study focused on contrasting treatments or on the bi- (or multi-)variate relationship between explanatory and response variables. In a second step, we recorded specific indicators used as response variables in each study, including the name of the measure, the spatial level of assessment and the unit of analysis. Each indicator was assigned to one of the sustainability dimensions established in our framework (Figure 2) and, when needed, to a specific sub-category. Each indicator was assigned to only one of the thematic clusters and, in case of doubt, the choice was based on the specific criteria and definitions established in our framework. The response of each indicator to diversification was classified from a sustainability perspective as positive, negative, neutral or variable, according to the statistical results presented in each study. Responses were considered variable if the outcomes varied due to drivers other than diversification (e.g. location, environmental condition, coffee genotype). Responses that could not be directly linked to system sustainability were not reported. Responses with unavailable, inaccessible or unclear statistical results were not included. 
Individual studies often contained more than one indicator, and therefore, we recorded 1679 entries from the 215 studies (data added as Supplementary Material). The effect of diversification on multiple indicators was analysed at global and continental level.
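The recording scheme described above (one entry per indicator, each assigned to a single sustainability dimension and given one response class) could be represented roughly as follows. This is a hypothetical sketch, not the authors' actual data structure: the class name, field names and the three example entries are invented for illustration and do not reproduce any result from the review.

```python
from collections import Counter
from dataclasses import dataclass

# The five dimensions and four response classes named in the text.
DIMENSIONS = {
    "biodiversity", "ecosystem services", "input use",
    "socio-economic factors", "resilience capacity",
}
RESPONSES = {"positive", "negative", "neutral", "variable"}


@dataclass(frozen=True)
class IndicatorEntry:
    """One recorded indicator (hypothetical schema)."""
    study_id: int
    indicator: str
    dimension: str
    spatial_level: str
    response: str

    def __post_init__(self):
        # Each entry belongs to exactly one known dimension and response class.
        if self.dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {self.dimension}")
        if self.response not in RESPONSES:
            raise ValueError(f"unknown response: {self.response}")


def tally_responses(entries):
    """Count response classes per sustainability dimension."""
    counts = {}
    for entry in entries:
        counts.setdefault(entry.dimension, Counter())[entry.response] += 1
    return counts


# Made-up placeholder entries, for illustration only.
entries = [
    IndicatorEntry(1, "actual yield", "ecosystem services", "field", "variable"),
    IndicatorEntry(1, "soil cover", "ecosystem services", "field", "positive"),
    IndicatorEntry(2, "species richness", "biodiversity", "field", "positive"),
]
tally = tally_responses(entries)
```

Because one study can contribute several entries (study 1 above has two), counting entries rather than studies is what allows the number of entries to exceed the number of papers.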

Results and discussion
The structure of our results and discussion is presented in three different sections. In the first section, we present and discuss the geographical location of the 215 selected studies in this review. In addition, we characterize and discuss the methods and indicators used to quantify the impact of diversification on multiple dimensions of system sustainability. In the second, we assess the impact of diversification through coffee agroforestry on multiple dimensions of sustainability. And finally, in the third section, we propose general recommendations for future studies based on the insights gained in this review.
3.1 How are current indicators and methods used to quantify the impact of diversification through coffee agroforestry on multiple dimensions of system sustainability?

Geographical location of the studies
The number of studies that focus on coffee diversification through agroforestry has been constantly increasing across the years since 1993 (Appendix A). Of our 215 selected studies, most were conducted in Latin America (n=158), mainly in Costa Rica (n=48), Mexico (n=34), Brazil (n=32), and Colombia (n=15; Figure 3). Brazil and Colombia are, respectively, the first and third largest coffee-producing countries in the world, whereas Costa Rica and Mexico occupy the 15th and 13th positions in the world ranking, respectively (FAO 2020; Appendix C). Although countries such as Vietnam, Indonesia, India, Ethiopia and Uganda are top coffee-producing countries (FAO 2020), a limited number of studies were conducted in Africa (n=35) and Asia (n=21). Yet, except for Vietnam, most studies within these continents were conducted in their top producing countries: India (n=12) and Indonesia (n=8) in Asia; Ethiopia (n=15) and Uganda (n=10) in Africa (Figure 3). Therefore, although there is some overlap between national coffee production and number of studies, coffee diversification through agroforestry remains relatively little studied in some of the top coffee-producing countries in Asia and Africa. The lack of studies in these countries is probably due to a combination of factors, such as differences in research priorities, differences in regional coffee-specific cultivation practices, research funding/infrastructure and availability of published results in English. Yet, the lack of studies in African and Asian countries does not necessarily mean that diverse coffee agroforests are not present in these regions. Therefore, the results of this review should be interpreted carefully due to its potential geographical bias, and future empirical studies are needed to quantify the impacts of diversification in less studied countries and further explore differences in global trends related to coffee agroforestry.

Indicators of sustainability
We found a wide range of indicators that were measured on different hierarchical integration levels to assess the impacts of diversification on system sustainability (Table 1). All the indicators could be classified in one of the sustainability dimensions proposed in our framework. Most indicators were assigned under the ecosystem services dimension (57.2%), followed by biodiversity (35.6%), input use (4%), socio-economic factors (2.7%) and resilience capacity (0.5%). Below, we conceptualize each sustainability dimension and discuss the type of indicators used.
Ecosystem services The ecosystem services approach was consolidated by the Millennium Ecosystem Assessment, in which a framework was established to analyse the impact of land-use change and biodiversity on human well-being. The framework considers humans as an integral part of the ecosystem, and that dynamic interactions occur between humans and other components of the ecosystem. Ecosystem services can be classified into four main categories: provisioning, regulation, cultural and supporting services (MEA 2005). The ecosystem services approach is at the core of our study because it allows an integrative and systemic view on the functions and benefits derived from agroecosystems to people (Mastrangelo and Laterra 2015; Mulder et al. 2015; Costanza et al. 2017). First, the approach is useful to assess the performance of ecosystems on different hierarchical levels and land uses (Braat and De Groot 2012a; Schulte et al. 2015). Second, it is intelligible and inclusive for farmers and other stakeholders, making research outcomes more useful and relevant for implementation. And third, it has been widely used by scientists from different fields to assess the performance of agricultural systems, allowing exchange and integration of knowledge (Steger et al. 2018).
Quantifying ecosystem services is a challenge, and frameworks are often contradictory and inconsistent. Regarding the terminology, concepts such as ecosystem functions, ecosystem services, ecosystem disservices, ecosystem processes, ecosystem goods, and ecosystem benefits are used in contrasting ways. For instance, while in some studies the concept of ecosystem services has great overlap with the concepts of functions and/or benefits (Schulte et al. 2015), other authors make a clear distinction between functions, services and benefits as well as their quantification (De Groot et al. 2002; Haines-Young and Potschin 2009), which may lead to confusion. Here we propose that ecosystem services should be broadly defined as direct or indirect benefits/goods to human well-being derived from the interaction between human and nature components. Examples of ecosystem services include provision of food, pollination, nutrient cycling, climate regulation and soil erosion control. In other words, we consider ecosystem services as goods and benefits provided by (agro-)ecosystems to people that can be quantified based on ecosystem functions, processes and properties (Costanza et al. 2017). We also acknowledge the use of similar and complementary concepts to ecosystem services, such as nature benefits, nature contributions to people (Díaz et al. 2018), or other terminology that is most suitable according to the local context. The concept is kept broad to make it intelligible for a variety of stakeholders. Moreover, in order to streamline our framework, factors considered as ecosystem disservices are simply inverted to be understood as ecosystem services (e.g. soil erosion → soil erosion control).
The majority of reported indicators found in our review were classified as ecosystem properties, processes or functions that can be related to specific ecosystem services, and therefore, were assigned under this sustainability dimension. Within the ecosystem services dimension, the most reported indicators were related to soil chemical quality (24.7%), crop production (14.6%), nutrient cycling (11.7%), soil physical quality (10%), pest control (9.4%) and climate regulation (8.6%), while there is less information about the ecosystem services of carbon sequestration (5.5%), water regulation (5.5%), soil erosion control (3.8%), soil biological quality (3.8%) and pollination (2.2%; Table 1). No study focused on the provision of cultural ecosystem services, which highlights how little is known about the quantitative impacts of diversification on more social and abstract benefits, such as landscape aesthetics, local people's lifestyle, spirituality and leisure (Howe et al. 2014; Boerema et al. 2017). It is also possible that studies focusing on the impacts of diversification on cultural services used different keywords/phrases that eluded our search strings.
Agricultural production All humans depend on food and other products derived from agriculture to live. Food production is one of the main research foci in agricultural systems and a relatively large proportion of indicators found in our review were associated with this ecosystem service (8.3%, Table 1). Indicators of agricultural production were based on plant growth, plant vegetative and reproductive traits, product quality, light use efficiency and actual yields. Actual yield was measured both at plant and field level while the other indicators were mostly measured at plant level.
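As a rough consistency check (not a calculation stated by the authors), the share of crop production indicators within the ecosystem services dimension (14.6%) can be combined with the share of ecosystem services indicators among all indicators given in the abstract (57.2%). This assumes that the 8.3% figure refers to all indicators in the review and that both percentages rest on the same underlying counts:

```latex
\[
0.146 \times 0.572 \approx 0.0835
\]
```

That is, roughly 8.3 to 8.4% of all indicators, which is in line with the reported value; any small gap would presumably reflect rounding of the published percentages.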
Pest control Pests and diseases occur in agriculture as a response to interactions among biotic and abiotic factors in the ecosystem (Bianchi et al. 2006). Levels of pest control were estimated with indicators that quantify the crop damage caused by a certain pest, the incidence of living organisms that are considered pests or diseases, and the incidence of living organisms that are considered natural enemies of pests and diseases. Indicators related to pest control were commonly measured at plant or field level, and in some cases scaled to larger spatial levels with the use of models.
Pollination The reproduction of most higher plants, including commercial crops, is highly dependent on pollination provided by wild pollinator species such as insects, birds and bats. Without natural pollination, several plant species would go extinct and the losses in terms of agricultural production could be catastrophic (Potts et al. 2016). Pollination was quantified based on the visiting patterns of pollinator species and open pollination experiments. Indicators were measured at either plant or field level.
Soil quality Soil quality can be defined as 'the continued capacity of soil to function as a vital living ecosystem that sustains plants, animals, and humans' (Bünemann et al. 2018), and is determined by agricultural management as well as pedoclimatic conditions (e.g. parent material, soil type, slope; Gomes et al. 2019). Therefore, management practices that favour high soil quality are crucial to maintain a stable and satisfactory crop production and to avoid environmental problems such as erosion and water run-off (Bünemann et al. 2018). Thus, soil quality is closely linked to the provision of other ecosystem services, like crop production, water regulation, nutrient cycling and carbon sequestration (Bünemann et al. 2018). Although soil quality can be considered as a key component that regulates ecosystem functions, rather than an ecosystem service per se, we decided to keep soil quality under the ecosystem services category. This is because many soil quality indicators are directly valued by society (for instance, soil chemical fertility is directly valued by farmers) and therefore, soil properties and processes linked to soil quality are commonly used in studies as indicators of ecosystem services (MEA 2005; Díaz et al. 2015; Costanza et al. 2017). Soil quality indicators were divided into four main sub-categories: soil chemical quality, soil biological quality, soil physical quality and soil erosion control. We decided to keep soil physical quality and erosion control as two separate categories, as the latter refers to a specific process that results in soil loss. The division of soil quality into various categories is commonly used in literature and makes it possible to better disentangle the multiple functions provided by the soil (Bünemann et al. 2018). The division also helps to identify which indicators are used most and where knowledge gaps may lie.
Soil chemical quality Indicators of soil chemical quality referred to soil nutrient content, soil chemical properties and soil organic matter. Indicators of soil chemical quality were reported more often than indicators of physical and biological quality or soil erosion control. Indicators were always measured at field scale.
Soil physical quality Soil physical quality referred to soil texture, soil structure and soil water content. Although we included all reported indicators of soil physical quality in our analysis, soil texture is expected to hardly change in response to management while soil structure and water content are expected to be more sensitive to management interventions. Indicators were always measured at field scale.
Soil biological quality Indicators of soil biological quality are suggested to be more sensitive to changes in management than chemical or physical quality (Bending et al. 2000). Despite the importance of soil living organisms for ecosystem functioning, relatively few studies reported indicators on biological quality. Reported indicators refer either to the presence or colonization of a certain group of microorganisms considered beneficial for ecosystem functioning, or to soil biological properties and processes. Indicators were always measured at field level.
Soil erosion control Soil erosion is a major problem in agricultural systems, especially in hilly landscapes (Seutloali et al. 2017). Measures of soil erosion control were based on soil cover and soil erosion intensity. Indicators of soil cover are reported more often than soil erosion intensity, as measuring erosion intensity can be laborious and imprecise (Seutloali et al. 2017). Indicators were generally measured at field level, but in some cases scaled to landscape level with the use of models. For instance, Verbist et al. (2010) combined the use of models and field measurements to quantify factors affecting soil erosion at catchment scale.
Nutrient cycling The re-cycling of chemical elements that occurs in nature is crucial to maintain the functioning of natural and managed ecosystems. The importance of nutrient cycling for agriculture and human well-being is often related to the maintenance of healthy and productive soils as well as regulation of gas emissions and nutrient losses (Steffen et al. 2015; Tully and Ryals 2017). Therefore, there is a strong relationship between nutrient cycling and other ecosystem services such as soil quality and carbon sequestration. In this study, we consider that indicators of nutrient cycling are expressed as a process (rate) that involves nutrient dynamics, while soil nutrient content and carbon-related indicators fall under the scope of other ecosystem services. Indicators related to nutrient cycling were measured at field scale and refer to nutrient input through plant material or natural processes (nutrient transformations and nutrient losses).
Water regulation Water regulation refers to the regulation of hydrological flows, aiming to avoid water losses and guarantee the supply of water in terms of quantity and quality, both for agriculture and human consumption (Fisher et al. 2009). Indicators of water regulation were associated with plant physiology, water dynamics at field level and water quality. Indicators were mostly measured at field level, but also at plant level in the case of plant hydraulic dynamics, and at landscape level in the case of water quality.
Climate regulation The climate is changing and affecting farmers and their agroecosystem as a whole (Kurukulasuriya and Rosenthal 2003;Van Noordwijk 2018). Therefore, indicators related to climate regulation are of crucial importance to understand and design more resilient systems. Indicators related to the capacity of the systems to alter the local climate were measured at different spatial scales, from field to landscapes. Generally, indicators reflect quantitative changes in air, soil and plant temperature, air humidity and wind intensity. Results at landscape level were obtained based on estimations from models.
Carbon sequestration Although carbon sequestration is closely related to climate regulation, we decided to keep it as a separate category due to the high relevance of the topic in the media and in policy, as well as in carbon-based programs of payment for ecosystem services (Tscharntke et al. 2011; Minasny et al. 2017). Carbon sequestration is a process and should therefore ideally be measured as a rate (Boerema et al. 2017). Although some studies reported indicators that estimate carbon sequestration through time, most indicators were static and focused on one-time measurements of above- and belowground carbon stocks (e.g. soil and vegetation carbon stock, ton ha⁻¹). Since carbon sequestration is more relevant at field or landscape level, no indicators were measured at plant level.
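The distinction between a static carbon stock and a sequestration rate can be illustrated with a minimal sketch. It assumes two stock measurements of the same field taken some years apart; the function name and the numerical values are hypothetical and are not taken from any reviewed study.

```python
def sequestration_rate(stock_start, stock_end, years):
    """Mean annual change in carbon stock (ton C per ha per year).

    stock_start, stock_end: carbon stocks (ton C per ha) at the first and
    second measurement; years: time elapsed between the two measurements.
    """
    if years <= 0:
        raise ValueError("years must be positive")
    return (stock_end - stock_start) / years


# Hypothetical example: the stock rises from 80 to 95 ton C per ha in 10 years.
rate = sequestration_rate(80.0, 95.0, 10)
print(rate)  # 1.5
```

A one-time stock measurement corresponds to a single value of `stock_end`, which is why it cannot, on its own, express sequestration as a process.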
Biodiversity Biodiversity can be defined as 'the variability among living organisms (...) and the ecological complexes of which they are part: this includes diversity within species, between species and of ecosystems' (United Nations 1992). Biodiversity has its own intrinsic value, and some studies consider biodiversity-related properties (e.g. habitat for wildlife) as ecosystem services. However, biodiversity is a key component of agroecosystems not only due to its intrinsic conservation value, but also because it plays a central role in regulating the provision of multiple ecosystem services (Isbell et al. 2017a, b). Therefore, in accordance with recent frameworks (MEA 2005; Balvanera et al. 2014; Díaz et al. 2015; Costanza et al. 2017), biodiversity is not conceptualized in our study as an ecosystem service, but rather as an important component that regulates the provision of ecosystem services. The scientific debate on the role of biodiversity in regulating ecosystem functioning in agricultural systems became stronger in the 1990s, with novel theoretical insights (Giller et al. 1997; Altieri 1999) as well as field experiments (Tilman et al. 1997). Since then, the number of papers on the topic has only increased and the debate gained momentum with the Millennium Ecosystem Assessment (MEA 2005), which introduced a framework that makes explicit the links between biodiversity and human well-being. This framework has evolved into more recent attempts by the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES; Díaz et al. 2015) as well as other groups of researchers (Costanza et al. 2017) that aim to create an inclusive, interdisciplinary and action-oriented approach to inform the development of more diversified agroecosystems capable of providing multiple ecosystem services.
Plant diversification is defined here as management changes to increase plant variety in terms of structure and composition, considering multiple temporal and spatial scales. These changes may increase biodiversity at different levels, including genetic, species and ecosystem diversity. Therefore, the process of diversification is intrinsically linked to planned changes in the diversity of agroecosystems. In the case of agroforestry systems, for instance, planned changes at field level can be characterized by the number and type of trees that are intentionally incorporated by farmers in the system, while planned changes at farm level may involve the proportion of farm area occupied by agroforestry. Changes in planned diversity may also involve non-crop plant species, as these may be selected by farmers to perform a specific function in the system (e.g. leucaena to fix nitrogen and produce biomass). Moreover, planned changes in diversity and management are expected to influence the associated diversity in the system, which refers to all organisms that inhabit or colonize cultivated areas from surrounding landscapes, such as the spontaneous vegetation as well as birds, bats, insects and other groups of animals (Altieri 1999). The associated diversity is also influenced by landscape diversity, which includes the composition and configuration of different elements on the farm and the landscape surrounding the farm (Duru et al. 2015).
Understanding the impacts of diversification strategies on biodiversity, but also the effects of biodiversity on ecosystem functioning, requires the use of complementary indicators. Such indicators should be able to quantify the multiple attributes of system diversity, including taxonomical, structural, functional and landscape diversity (Díaz and Cabido 2001;Balvanera et al. 2014;Duru et al. 2015). Within the biodiversity dimension, indicators of structure and taxonomy were the most used (45.5 and 35.4% respectively; Table 1), while functional diversity and landscape diversity were underrepresented (17.6 and 1.5%, respectively; Table 1).
Taxonomical diversity Taxonomical diversity refers to species composition and is useful to assess the conservation value of land uses as well as the role of diversified systems to increase the complementary and efficient use of resources. Most biodiversity indicators found in our study were related to taxonomical diversity and include indices of species diversity and composition. Some studies also use the combination of various indicators of taxonomical diversity and structure to calculate indices of overall diversity. Reported indicators were measured at field level (with one exception, beta-diversity, measured at landscape level) and addressed a variety of organisms, including plants, microorganisms, bees, insects, birds, amphibians and mammals.
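As an illustration of the kind of indices grouped under taxonomical diversity, the sketch below computes species richness and the Shannon index for one field, and Whittaker's multiplicative beta-diversity across fields. The choice of these three indices is an assumption made for illustration; the reviewed studies used a range of indices, and the input data here are hypothetical.

```python
import math


def richness(abundances):
    """Number of species with at least one individual."""
    return sum(1 for n in abundances if n > 0)


def shannon_index(abundances):
    """Shannon diversity H' = -sum(p_i * ln p_i) over species present."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("at least one individual is required")
    props = [n / total for n in abundances if n > 0]
    return -sum(p * math.log(p) for p in props)


def whittaker_beta(plots):
    """Whittaker's beta: gamma richness divided by mean alpha richness.

    plots: list of sets, each holding the species found in one field.
    """
    if not plots:
        raise ValueError("at least one plot is required")
    gamma = len(set().union(*plots))
    mean_alpha = sum(len(p) for p in plots) / len(plots)
    return gamma / mean_alpha


# Hypothetical data: four equally abundant tree species in one field,
# and two fields that share one of three species in total.
print(richness([10, 10, 10, 10]))                # 4
print(whittaker_beta([{"a", "b"}, {"b", "c"}]))  # 1.5
h = shannon_index([10, 10, 10, 10])              # equals ln(4) for an even community
```

Richness and the Shannon index describe a single field, whereas beta-diversity compares fields, which is why it is the one indicator in this category that operates at landscape level.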
Structural properties and diversity The structural diversity of agroecosystems can be understood as variations in terms of size and number of individuals. Structural diversity is suggested to alter the capacity of agroecosystems to capture resources, such as water, carbon and light, and, in turn, ecosystem functioning (Ali et al. 2016). Moreover, the abundance of certain animal species is directly linked to the provision of ecosystem services. For instance, the species composition of pollinator species and natural enemies can, respectively, impact the levels of pollination and biological pest control. Indicators found in our review were used to quantify the number and abundance of organisms found in the systems as well as their size. Indicators related to size were mainly calculated for the vegetation at plant and field level, while indicators related to the abundance of organisms were mainly measured at field level and included not only plants, but also microorganisms, insects and macrofauna. Studies rarely calculated actual variation in structure (e.g. variation of tree height in a community).
Functional composition and diversity Functional diversity can be understood as 'the value and range of functional traits of the organisms in a given ecosystem' (Tilman 2001). There are two main ecological mechanisms suggested to explain the links between functional diversity and ecosystem functioning: the biomass ratio hypothesis and the niche complementarity hypothesis (Díaz et al. 2007). The biomass ratio hypothesis states that functional traits of the dominant species, measured as the community weighted mean (CWM) of individual traits, are of overriding importance for determining ecosystem functioning (Finegan et al. 2015). On the other hand, the niche complementarity hypothesis postulates that the variation and distribution of species trait values can influence ecosystem functioning by enabling better niche occupation and complementary use of resources (Faucon et al. 2017). Therefore, functional diversity is useful to assess both the functional response and effect of diversity on ecosystem functioning based on trait values and dominance (e.g. community weighted means, CWM) as well as trait variance (e.g. functional richness; Faucon et al. 2017; Lavorel 2013; Wood et al. 2015). As an example, CWM values of leaf nitrogen content can help to understand both the effects of nitrogen fertilization on crop nutrition (Buchanan et al. 2019) and the consequences of leaf nitrogen concentration on the efficiency of nutrient cycling (Bakker et al. 2011). Our review did not capture any study that used functional diversity indicators for non-plant organisms. Most indicators were based on plant leaf traits and to a lesser extent flower, root and wood traits. In addition, most indicators were traits measured at plant level only, and were not scaled to field/community level.
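Scaling a plant-level trait to community level can be sketched as follows. The example computes a community weighted mean (CWM) as the abundance-weighted average of a single trait, and uses the trait range among the species present as a simple single-trait stand-in for functional richness (multi-trait studies commonly use other formulations). Species names, abundances and leaf nitrogen values are hypothetical.

```python
def community_weighted_mean(abundances, trait_values):
    """CWM = sum over species of (relative abundance * trait value)."""
    total = sum(abundances.values())
    if total <= 0:
        raise ValueError("total abundance must be positive")
    return sum(
        (abundances[sp] / total) * trait_values[sp] for sp in abundances
    )


def trait_range(abundances, trait_values):
    """Range of trait values among species present (single-trait richness)."""
    present = [trait_values[sp] for sp, n in abundances.items() if n > 0]
    if not present:
        raise ValueError("no species present")
    return max(present) - min(present)


# Hypothetical field: number of individuals and leaf nitrogen content (%).
plot_abundance = {"coffee": 60, "inga": 30, "banana": 10}
leaf_n = {"coffee": 2.8, "inga": 3.5, "banana": 2.0}

cwm = community_weighted_mean(plot_abundance, leaf_n)
fric = trait_range(plot_abundance, leaf_n)
print(round(cwm, 2), fric)  # 2.93 1.5
```

The CWM is dominated by the most abundant species, in line with the biomass ratio hypothesis, while the trait range reflects the spread of trait values that the niche complementarity hypothesis emphasizes.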
Landscape diversity and integrity Indicators of taxonomical, structural and functional diversity can be scaled to landscape level (Lohbeck et al. 2017), although this was rarely observed in our set of studies. A limited number of studies assessed the impacts of coffee agroforestry on landscape diversity and integrity. Indicators related to landscape diversity and integrity found in our review addressed the optimal proportion of land uses on the farm or in the landscape, wildlife activity and landscape biological integrity.

Other categories
Socio-economic factors Ecosystem services, biodiversity and input use are closely related to socio-economic factors that are relevant for farmers and society, although this is not widely demonstrated in scientific studies (Heckwolf et al. 2021). Farm economic performance can be estimated based on benchmarks of production components, such as cost per unit produced, in addition to market value, such as price premiums and conservation payments. The provision of ecosystem services can also be expressed in monetary values to highlight its economic relevance and integrate the calculation of economic indices (Bravo-Monroy et al. 2015). Although the economic condition of farmers is linked to their social well-being, other social aspects are also relevant, such as gender relations, food security and sovereignty, and access to knowledge. Despite their importance to system sustainability, socio-economic indicators (4.3%) were less frequently reported than indicators of ecosystem services (57.2%) and biodiversity (35.6%). Moreover, most indicators of socio-economic sustainability were associated with the economic performance of farmers. Indicators of economic performance were particularly related to costs and revenues and market values. Despite the great debate around monetarisation of ecosystem services (Costanza et al. 2017), we found very few studies that calculated the economic value of ecosystem services (e.g. pollination value, US$ ha⁻¹). The number of studies that addressed food security was also surprisingly low, considering the relevance of the topic in global scientific and societal debates (FAO 2019). Nevertheless, food security was assessed in a few studies based on the variety and value of products consumed by householders. The majority of indicators were calculated at field level. Therefore, most studies compare coffee fields to quantify the socio-economic impacts of coffee agroforestry and do not focus on comparing farms taking into account the multiple fields they may contain. In contrast, farm-level studies assess how changes in the proportion of agroforestry on the farm in comparison to monocultures affect socio-economic factors.
Input use Changes in ecosystem structure and diversity are associated with changes in input requirements (e.g. in terms of labour, weeding and use of fertilizers). At the same time, input use may have an impact on biodiversity and the provision of ecosystem services, and therefore, may be quantified to better inform integrated assessments (Teixeira et al. 2021). Despite the relevance of input use for farmers and system sustainability, and although input use indicators can be relatively easy to quantify, for instance, through farmers' interviews, a limited number of studies reported indicators in this category (2.3%). Indicators used to quantify the impacts of diversification on input use were linked to management requirements such as vegetation management, use intensity of fertilizers and pesticides, and labour input requirements. Input use was generally measured at field level.
Resilience In the context of agriculture, resilience can be understood as the capacity of farming systems to resist, recover and adapt to disturbances over time. Three main components of resilience can be highlighted (Meuwissen et al. 2019): robustness, the capacity of systems to resist stresses and remain with similar performance and characteristics; adaptability, the capacity of systems to rearrange their composition of inputs, production, marketing and risk management in response to stresses; and transformability, which entails significant changes in system structure and feedback mechanisms in response to severe disturbances that make the current system no longer feasible (Meuwissen et al. 2019). It is worth noting that resilience is closely related to the other sustainability dimensions, as it refers to system stability over time, in terms of ecosystem services provision, biodiversity conservation, input requirements and socio-economic factors (Reidsma et al. 2020). Although the concept of resilience is intrinsically linked to a temporal dimension, some authors propose the use of static measures, such as diversity, as a proxy for resilience (Mijatović et al. 2013; Andrade and Zapata 2019). The selection of static indicators, such as diversity, to quantify resilience implies assumptions, for instance, that diversity increases system resilience. Making assumptions to quantify resilience can be useful depending on the context and objective of the study, for instance, if the assumption is linked to the perception and knowledge of local actors. However, it is key to consider that diversity or ecosystem services per se do not equal resilience (Reidsma et al. 2020), and that the design of sustainable systems may benefit from experimental studies that explicitly include the temporal component to demonstrate the mechanistic effects of diversity on resilience. In our study, a limited number of indicators addressed socio-economic and ecological resilience (0.6%). Indicators were assessed based on farmers' perceptions (semi-quantitative); field-level changes in crop yields and microclimate in the face of climate change (based on long-term data); and model-based scenarios to assess crop suitability, yields and water regulation in response to climate change.

Methods and general characteristics of the studies
We identified a wide range of indicators that can be combined for quantifying the impacts of diversification on multiple dimensions of system sustainability. Most indicators are not restricted to coffee systems and can be useful to assess the impacts of diversification in other systems. In broader literature, there is increasing theoretical understanding and empirical evidence on the links between biodiversity and ecosystem services (Duru et al. 2015;Isbell et al. 2017b;Dainese et al. 2019), but ecological studies often focus on productivity as a specific measure of ecosystem functioning, overlooking the multidimensional nature of sustainability (Hector et al. 2010;Cardinale et al. 2013;Tilman et al. 2014). Furthermore, most ecological studies that explicitly assessed and quantified biodiversity-mediated mechanisms determining the functioning and sustainability of agroecosystems were conducted on grasslands (Hector et al. 2010;Tilman et al. 2014;Isbell et al. 2015), whereas these mechanisms have been less studied in other types of agricultural systems, especially ones that involve perennial crops such as agroforestry systems. Indeed, biodiversity-mediated mechanisms that regulate ecosystem functioning were often not explicitly evaluated in the studies found in our review, and most articles only addressed one dimension of sustainability, mainly focusing either on ecosystem services or biodiversity (56.7%; Figure 4). The use of a systemic framework associated with specific testable hypotheses (Giller et al. 1997) remains needed to unravel the causal relationships between diversity of organisms (e.g. plants, soil organisms, insects), agricultural management, resilience, socio-economic factors and ecosystem functioning (e.g. sustained soil fertility, nutrient cycling, climate regulation).
As observed in other studies related to the impacts of diversification, researchers often take either a production-oriented, agronomic approach (Hufnagel et al. 2020) or a biodiversity conservation, environmentalist, ecological approach (Haro et al. 2018). Naturally, methods also tend to follow this division. For instance, ecological and agronomic/bioeconomic models related to farming systems assessments are often not integrated (Chopin et al. 2019). Despite the clear division reported in the literature, intersections between both approaches are also observed. For example, the concept of resilience has been used for decades in ecological research (Holling 1973) and is gaining increasing attention in agronomic studies (Ge et al. 2016; Meuwissen et al. 2019). The true integration of both perspectives requires combined efforts to build interdisciplinary research, but it can lead to a better understanding of the role of diversity for the agronomic and socio-economic sustainability of agroecosystems. For instance, Boreux et al. (2013) integrated measures of pollinator abundance and diversity, tree cover and density, crop production and management practices. The integration of agronomic and ecological variables revealed that increasing plant diversity can benefit bee conservation, pollination and crop production, but that combining adequate agronomic management practices remains crucial to achieve optimum production outcomes.
Only 41.4% of the studies focused on two or more dimensions of sustainability and only 19.5% considered the impacts of diversification on input use, socio-economic factors and/or resilience capacity (Figure 4). For instance, few studies consider the impacts of diversification on the use of pesticides and fertilizers (Jezeer et al. 2018) or resilience capacity against climate change (Gomes et al. 2020). Despite the relatively high number of studies on economic performance (e.g. Atallah et al. 2018; Cerda et al. 2020), studies focusing on social benefits associated with diversification remain scarce, which is probably related to the difficulty of quantifying such benefits. The lack of studies that assess the impacts of diversification on input use, socio-economic factors and resilience is also observed in other types of systems (De Beenhouwer et al. 2013; Beillouin et al. 2019; Heckwolf et al. 2021). Therefore, including these dimensions of sustainability may be key to advance the design of diversified systems, since they have a high relevance to farmers (e.g. socio-economic wellbeing) and can inform the potential of diversified systems to cope with economic and environmental stresses (e.g. resilience to climate change).
Most of the reviewed studies focused on comparisons between two or more treatment groups (65.1%) and a limited number of studies considered the correlation among response variables (37.7%) or tested the bi- or multivariate relationship between explanatory and response variables (34.9%; Figure 5). Explanatory variables were either related to system diversity (e.g. tree species diversity) or other system components, such as environmental and socio-economic conditions. In our study, we focus on the impact of diversification on response variables, whereas the impact of other explanatory variables was not specifically assessed. Nevertheless, as most studies focus on comparing values between treatments, multiple dimensions of sustainability were often quantified as independent system components, rather than as a dynamic and complex network, where different components are interconnected and interdependent. In some studies, where the latter approach was taken, authors were able to go beyond the comparison among systems, towards the identification of specific bottlenecks for increasing the benefits of diversification as well as ecological processes and properties that can regulate the provision of ecosystem services valued by farmers and society. For instance, Jezeer et al. (2018) integrate crop yields, economic performance, input use and shade tree cover, to show that lower coffee yields associated with shade tree cover can be compensated through reduced costs and increased potential revenue of non-coffee products. In another example, Atallah et al. (2018) identified optimal tree shade cover levels to improve farm economic performance taking into account pest control services, crop growth services, and timber. Furthermore, specific statistical methods, such as structural equation modelling, can be particularly useful to highlight the interactions among system components. For instance, the use of structural equation modelling in previous studies revealed a causal pathway of agroecological management leading to increased plant diversity and, in turn, maintenance or increase in soil quality and coffee productivity (Teixeira et al. 2021); and that excessive shade cover can have direct, but also indirect negative effects on coffee yields, due to the higher incidence of foliar diseases (Durand-Bessart et al. 2020).
Figure 5. Impact of coffee diversification through agroforestry on ecosystem services, biodiversity attributes, socio-economic factors, input use and system resilience. The number of reported observations for each cluster is based on individual indicators. The responses are displayed as significant positive (green), negative (red) and neutral or variable (grey). Within the grey bar, the proportion of neutral responses is displayed with light grey while the proportion of variable responses is displayed with dark grey. Responses are considered variable if they depend on other factors than diversification, such as season and location. The grey part of the bar is centred on zero to enable a better visualization of the results.
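The response classification used in Figure 5 (positive, negative, neutral, variable) can be turned into shares per indicator cluster with a simple tally. The sketch below is illustrative only: the function names are hypothetical, the example observations are invented, and it does not reproduce the procedure or the data behind the figure.

```python
from collections import Counter

CATEGORIES = ("positive", "negative", "neutral", "variable")


def response_shares(responses):
    """Share of each response category among the reported observations."""
    counts = Counter(responses)
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown response categories: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no observations")
    return {cat: counts[cat] / total for cat in CATEGORIES}


def net_direction(shares):
    """Whether positive responses outnumber negative ones, or vice versa."""
    if shares["positive"] > shares["negative"]:
        return "more positive than negative"
    if shares["positive"] < shares["negative"]:
        return "more negative than positive"
    return "balanced"


# Hypothetical cluster with ten observations.
observations = ["positive"] * 6 + ["negative"] * 2 + ["neutral"] + ["variable"]
shares = response_shares(observations)
print(shares["positive"], net_direction(shares))
```

Keeping neutral and variable responses as separate categories matters for interpretation, since a large variable share points to factors other than diversification, such as season or location.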
Most studies reported quantitative data, and very few used qualitative scores and/or semi-quantitative data (6.5%). Future research may benefit from the use of qualitative or semi-quantitative scores to assess more implicit and abstract concepts, such as cultural services (Boerema et al. 2017), and to assess more dimensions of sustainability and resilience at once (Reidsma et al. 2020). The use of models to obtain estimations as a final outcome was also limited to a small number of studies (14%). Yet, specific studies show how models can be extremely useful to construct scenarios that make it possible to raise the spatial level of analysis from fields to landscapes and to assess system responses to socio-environmental changes. For instance, models were effective in estimating the provision of specific ecosystem services that have a large relevance at higher spatial levels, such as soil erosion control (Yustika et al. 2019). Models were also used to assess how diversification can affect system sustainability (e.g. crop suitability, yields and water regulation) in the face of disturbances, especially climate change (Rahn et al. 2018b; Gidey et al. 2020; Gomes et al. 2020). In addition, models were helpful for assessing economic impacts of diversification (e.g. economic risk) at farm level (Reeves and Lilieholm 1993).

3.2 What is the impact of diversification through coffee agroforestry on multiple dimensions of sustainability?
The impact of diversification on sustainability is generally positive In accordance with broader literature on the impacts of diversification (Dainese et al. 2019; Niether et al. 2020; Wan et al. 2020), we found a general positive over negative impact of coffee agroforestry on most sustainability dimensions, including biodiversity conservation, ecosystem services provision, input use, socio-economic factors and resilience capacity (Figure 5). Although the majority of studies are concentrated in Latin America, most global trends can also be observed at continental level (Appendix D). As expected, more diversified coffee agroforestry systems were associated with a higher diversity of non-plant organisms, as changes in plant species richness, structure and composition are suggested to impact adjacent trophic levels and cascade up to higher trophic levels (Scherber et al. 2010). More diversified systems were also reported to enhance the provision of multiple ecosystem services (Figure 5). The positive effects were especially strong for the ecosystem services climate regulation, soil erosion control, pest control and carbon sequestration (Figure 5), which is in line with other studies that report benefits of diversification strategies (McDaniel et al. 2014; Soto-Pinto and Armijo-Florentino 2014; Gomes et al. 2016; Dainese et al. 2019; Tamburini et al. 2020). For instance, higher plant diversity can benefit trophic interactions among insects, leading to higher presence of natural enemies and increased pest control (Wan et al. 2020). Specific diversification strategies similar to coffee agroforestry have also been reported to increase the provision of ecosystem services. For instance, a recent review showed that cocoa agroforestry outcompeted monocultures in most indicators of sustainability related to total system yield, economic performance, potential for climate change mitigation, and biodiversity conservation (Niether et al. 2020).
More attention is needed to input requirements, socio-economic factors and resilience Despite the limited number of studies, the impact of diversification on input use is suggested to be positive, since more diversified systems are often associated with higher provision of ecosystem services such as nutrient cycling and pest control, which can lead to better use of natural resources and reduced need for external inputs (López-Rodríguez et al. 2015; Cerda et al. 2020; Teixeira et al. 2021). Nevertheless, studies suggest that diversified systems can increase the need for labour input, although this has been poorly tested with empirical studies (Bottazzi et al. 2020), including the ones found in our review. In terms of socio-economic impacts, a lot of attention has been given in the literature to farm economic performance (Braat and De Groot 2012b) as well as food and nutrition security (FAO 2019), which is reflected in the type of socio-economic indicators reported in our review. Although lower yields were often associated with diversified systems in comparison to monocultures or less diversified systems, the impacts of diversification on farm economic performance were generally reported as positive (Figure 5). This is because diversified systems may provide a variety of products beyond the main cash crop, increasing the total system yield and generating additional income (de Souza et al. 2012; Niether et al. 2020). Besides, as previously mentioned, diversified systems are often associated with reduced use of external inputs, which can reduce costs and compensate for lower yields (Gobbi 2000; Jezeer et al. 2018). In terms of social benefits, diversified systems often provide rural families with an increased variety of plant and animal products, which is suggested to increase food and nutrition security (Figure 5; Cerda et al. 2020; Rasmussen et al. 2020). Yet, studies focusing on the impacts of diversification on food and nutrition security were scarce, which indicates the need to better integrate farmers and social benefits in future impact assessments. Finally, despite the low number of studies focusing on resilience, the impacts of diversification on system resilience have been generally reported as positive (Figure 5), which is in line with the 'diversity for resilience' rationale (Mijatović et al. 2013). However, there is still a very limited number of studies that assess the impacts of diversification on resilience (Dardonville et al. 2020) and the assessment of resilience through model-based scenarios is often not validated with experimental data. Therefore, more studies that include a temporal dimension are needed to enhance the understanding of how to optimize the impacts of diversity on ecosystem functioning and different capacities of system resilience. Good examples are reported in grassland experiments; for instance, Isbell et al. (2009) empirically demonstrated how species diversity and interactions can favour biomass productivity and stability over time.
The need to balance trade-offs between yields and other sustainability dimensions Although the overall effect of diversification on system sustainability seems to be positive, there are also nuances that need to be addressed. Considering all dimensions of sustainability, at global level, the impact of diversification was more negative than positive in terms of crop productivity. Indeed, even though a positive effect of diversification on crop yields is broadly reported in scientific literature (Beillouin et al. 2019; Dainese et al. 2019), there is also evidence that yields, especially of the main cash crop, can decline in diversified systems (Letourneau et al. 2011; de Souza et al. 2012). In the end, yield responses are closely linked to the type of system and diversification strategy. For instance, a recent synthesis of 99 meta-analyses showed that agroforestry was the only diversification strategy (among seven) generally associated with yield reductions (Beillouin et al. 2019). Nevertheless, reduced yields of the main cash crop in agroforestry systems can also be compensated by reduced costs as well as other benefits including a greater variety of agricultural products (de Souza et al. 2012; Jezeer et al. 2018). The impact of diversification on yields can also vary according to the type of associated management practices. For instance, diversification practices are suggested to be particularly useful to reduce the organic to conventional yield gap in organic farming systems (Ponisio et al. 2015).
Despite the general association between yield reduction and diversification in our review, we also reported cases where it was possible to maintain or even increase crop productivity in diversified systems compared to less diversified ones (Jaramillo et al. 2013; Nesper et al. 2017; Rahn et al. 2018b; Acosta-Alba et al. 2020; Gidey et al. 2020). Contrary to the global trend, positive effects of diversification on agricultural production were reported more frequently than negative ones in Asia and Africa. This is probably because many studies in Latin America compare coffee yields in intensive full-sun systems with shaded systems, while the few studies in Africa and Asia focused on variables other than coffee yields and considered a gradient of shade tree diversity instead of two (very contrasting) treatments. In general, the reported positive impacts indicate that considering local environmental conditions (e.g. light irradiance) as well as adequate management (e.g. optimal shade tree cover and species number) is key to balance trade-offs, which can lead to satisfactory results in terms of the production of coffee and other crops (Soto-Pinto et al. 2000; De Beenhouwer et al. 2013; Cerda et al. 2017; Durand-Bessart et al. 2020). Farmers are highly aware of trade-offs between crop productivity and other ecosystem services, which influence their choices, for example when selecting trees and applying inputs (Cerdán et al. 2012; Valencia et al. 2015). Therefore, it remains crucial to better engage with farmers and learn from successful cases to inform the design of more sustainable and diversified systems through feasible agronomic practices as well as adequate system configuration and structure.
Drivers not related to diversity and integration level of analysis
Even though the impacts of diversification are often more positive than negative, in some indicator categories, a large proportion of responses are variable or neutral, which means that there are other factors that can interact with responses to diversification. For instance, variable and neutral responses are largely reported for soil chemical and physical quality. Therefore, it is crucial to account for environmental variance among farms/fields, especially for soil-related variables, which are strongly influenced by environmental conditions such as soil type, slope and position in the landscape. The large proportion of variable responses also reinforces the need to consider external factors and their potential interactions with a particular diversification strategy (Atallah et al. 2018). In other words, a particular diversification strategy may work better under some conditions than others, and such interactions need to be considered in designing diversification. Finally, it is key to identify the integration level at which ecosystem services and drivers of ecosystem services interact. For instance, although pollination is suggested to be closely linked to diversification, there were few studies that reported positive impacts of diversification through coffee agroforestry on pollination. One of the issues may be that drivers of insect-mediated ecosystem services, such as pollination, play a more important role at larger spatial levels (i.e. landscape level; Kebede et al. 2019; Coutinho et al. 2021), while studies often focus on drivers at field level. Therefore, the use of landscape-level explanatory variables, such as the proportion of coffee agroforestry in the landscape or the gamma diversity of tree species found in coffee agroforestry systems within different landscapes, can be useful to further understand the impacts of diversification on services such as pollination and pest control.
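As an illustration of the two landscape-level explanatory variables mentioned above, the following minimal sketch derives the area share of coffee agroforestry and the gamma diversity of shade trees per landscape. It assumes that gamma diversity is taken as the pooled species richness across all agroforestry fields in a landscape; the field records, landscape identifiers and the function name are hypothetical and serve only as an example.

```python
from collections import defaultdict

# Hypothetical field records (illustrative values only):
# (landscape id, land use, area in ha, shade tree species observed)
fields = [
    ("A", "agroforestry", 2.0, {"Inga edulis", "Erythrina poeppigiana"}),
    ("A", "full_sun", 6.0, set()),
    ("A", "agroforestry", 2.0, {"Inga edulis", "Cordia alliodora"}),
    ("B", "agroforestry", 1.0, {"Inga edulis"}),
    ("B", "full_sun", 9.0, set()),
]


def landscape_variables(records):
    """Return {landscape: (agroforestry area share, gamma diversity)}.

    Gamma diversity is assumed here to be the number of distinct tree
    species pooled over all agroforestry fields of a landscape.
    """
    total_area = defaultdict(float)
    agroforestry_area = defaultdict(float)
    species_pool = defaultdict(set)
    for landscape, land_use, area_ha, species in records:
        total_area[landscape] += area_ha
        if land_use == "agroforestry":
            agroforestry_area[landscape] += area_ha
            species_pool[landscape] |= species
    return {
        name: (agroforestry_area[name] / total_area[name], len(species_pool[name]))
        for name in total_area
    }


variables = landscape_variables(fields)
for name, (share, gamma) in sorted(variables.items()):
    print(f"landscape {name}: agroforestry share = {share:.2f}, gamma diversity = {gamma}")
```

Variables of this kind could then be used as predictors of field-level measurements of pollination or pest control, alongside the usual field-level drivers.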

General recommendations
Combine a wide range of indicators
We suggest that quantifying multiple dimensions of sustainability with the use of a wide range of integrated indicators can help to inform the design of complex and sustainable agroecosystems. Specific indicators selected based on a wider framework can be systematically applied to monitor the impacts of diversification, inform the development of policies and extension services, help farmers and companies to adopt adequate management practices, and ultimately, enhance our understanding of ecosystem functioning to increase overall system sustainability. It remains important that indicators are linked to well-defined dimensions of sustainability that enable a thorough and systemic understanding of ecosystem functioning. In many cases, it is not possible for a single experimental study to quantify all proposed dimensions of sustainability. Therefore, researchers should consider carefully which specific sustainability dimension(s) they focus on, which sustainability components are missing and how the influence of drivers other than diversification is controlled. Although indicators are embedded in a wider framework, their selection should account for the local context and well-defined challenges pertaining to agriculture (problem definition).
Acknowledge systems in different transition stages
Although the definition of sustainability goals and the quantification of associated indicators is key to inform and promote the transition to diversified systems, the interpretation of results should take the starting point of transition into consideration. For instance, reducing the dependence on external inputs is considered beneficial from a sustainability perspective. However, if the initial stage of transition is a field with very low nutrient availability, providing external inputs can be necessary (especially in the short term) to improve other aspects of system sustainability, such as crop productivity and carbon storage, and thus overall system sustainability. This reinforces the need to acknowledge systems in transition as well as trade-offs among sustainability objectives during processes of transition (Kearney et al. 2019; Dumont et al. 2021).
Focus on underreported groups of indicators and integration of sustainability dimensions
Many groups of indicators feature in a minority of the studies included in our review, including input use, soil biological quality, pollination, water regulation, functional diversity, socio-economic factors and resilience capacity. In addition, most individual studies addressed only one or a few dimensions of sustainability, and did not focus on the potential synergies and trade-offs among variables at system level. These trends are also reported in the wider literature and are not restricted to coffee systems (De Beenhouwer et al. 2013; Beillouin et al. 2019). Therefore, future research should not only include underreported groups of indicators, but also focus on the integration of multiple sustainability dimensions. In addition, it remains relevant to make explicit why and how assumptions are used in the selection of indicators to quantify sustainability dimensions.
Better quantify the links between functional diversity and ecosystem services
Functional diversity is suggested as a key diversity attribute to better explain changes in ecosystem properties and processes (Díaz and Cabido 2001; Wood et al. 2015). Yet, although some studies reported the use of functional traits as indicators, these traits were generally not scaled to community level. Therefore, traits were often not linked to the provision of ecosystem services and functions, although this approach is relatively well-established in natural ecosystems (Bakker et al. 2011; Lohbeck et al. 2012, 2015; Finegan et al. 2015; Teixeira et al. 2020). Besides, the functional diversity approach can help to bridge scientific and local knowledge as farmers use plant traits to make management decisions (e.g. application of fertilizers) as well as to select adequate shade tree species (Isaac et al. 2018). In future studies, the use of functional composition and diversity indices in combination with measures of ecosystem services remains key to better understand and explain the impacts of diversification, which so far remain poorly demonstrated with empirical data in agricultural systems, especially in the tropics.
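To make concrete what scaling traits to community level can involve, the sketch below computes two commonly used indices for a single trait: the community-weighted mean (functional composition) and Rao's quadratic entropy (functional diversity). It assumes relative abundances as weights and the absolute trait difference as the dissimilarity between species, summed over all ordered species pairs; the species labels, abundances and trait values are hypothetical and are not taken from the reviewed studies.

```python
def relative_abundances(abundances):
    """Convert raw abundances (e.g. stem counts) into proportions summing to 1."""
    total = sum(abundances.values())
    return {species: count / total for species, count in abundances.items()}


def community_weighted_mean(abundances, traits):
    """Abundance-weighted mean of one trait across the species in a plot."""
    rel = relative_abundances(abundances)
    return sum(rel[species] * traits[species] for species in rel)


def rao_quadratic_entropy(abundances, traits):
    """Rao's Q for one trait: sum over all ordered species pairs of
    p_i * p_j * d_ij, with d_ij assumed to be the absolute trait difference."""
    rel = relative_abundances(abundances)
    return sum(
        rel[a] * rel[b] * abs(traits[a] - traits[b])
        for a in rel
        for b in rel
    )


# Hypothetical shade tree community in one coffee plot (stem counts)
plot_abundances = {"tree_a": 6, "tree_b": 3, "tree_c": 1}
# Hypothetical leaf nitrogen content (%) per species
leaf_nitrogen = {"tree_a": 3.0, "tree_b": 2.0, "tree_c": 1.0}

cwm = community_weighted_mean(plot_abundances, leaf_nitrogen)
rao_q = rao_quadratic_entropy(plot_abundances, leaf_nitrogen)
print(f"CWM leaf N = {cwm:.2f} %, Rao's Q = {rao_q:.2f}")
```

Plot-level values of this kind can then be related to measured ecosystem services, for instance in a regression of soil or yield indicators on functional composition and diversity.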
Models are useful but need to be properly evaluated and coupled with field data
The combination of different types of models, such as spatially explicit models, biodiversity-based mechanistic models and bioeconomic models, is a promising approach to integrate multidimensional and systemic analyses at higher spatial levels, especially when linked to empirical field data (Chopin et al. 2019). Modelling approaches are particularly useful to fill knowledge gaps found in literature, such as the need to increase the scale of analysis (i.e. from fields to farms and landscapes) and to assess how systems respond to disturbances, such as climate change. Models are also effective for assessing the impacts of diversification on crop production and farm economics, including the analysis of future scenarios. Yet, models need to be properly evaluated with model uncertainty and accuracy assessments (Batista et al. 2019). Besides, field data is required to parameterize biophysical models; otherwise, there is a risk of becoming too dependent on key assumptions that do not necessarily represent the local reality.
Adequate management of system diversity is key to balance trade-offs
Although the impacts of diversification are generally reported as positive, diversification strategies are still not widely adopted. Therefore, more attention is needed regarding potential constraints, such as labour input and crop productivity. Besides, ecosystem properties and structure associated with a specific diversification level will determine whether, and to what extent, system sustainability is related to diversity. Therefore, adequate management of system diversity (e.g. optimal shade tree density) is key to balancing trade-offs and achieving satisfactory results in terms of crop production and other ecosystem services and dimensions of sustainability. For instance, authors suggest that shade levels should not exceed 50% to enable a satisfactory coffee productivity (Soto-Pinto et al. 2000).
Moreover, most studies give similar weight to different sustainability indicators. From the farmer's perspective, however, productivity of the main cash crop may weigh more heavily than, for instance, increases in biodiversity or carbon sequestration. One needs to be aware of how farmers weigh indicators and how they perceive the importance of contrasting ecosystem functions and services over time (e.g. whether they think in terms of long- or short-term benefits). Finally, it remains key that future studies go beyond ecology and agronomy to also consider policies and programs (e.g. payment for ecosystem services), social behaviours and other factors that may influence the adoption of diversification practices.
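The effect of weighting can be illustrated with a simple weighted-sum score, which is only one of several possible multi-criteria approaches and is used here as an assumption for the sake of the example. The indicator scores (normalized to a 0 to 1 scale), the two systems and both sets of weights below are hypothetical; they merely show that the ranking of systems can change once productivity is weighted more heavily.

```python
def weighted_score(scores, weights):
    """Weighted sum of normalized indicator scores; weights are rescaled to sum to 1."""
    total_weight = sum(weights.values())
    return sum(scores[name] * weights[name] / total_weight for name in weights)


# Hypothetical indicator scores, each already normalized to the range 0-1
systems = {
    "full_sun": {"yield": 0.9, "biodiversity": 0.2, "carbon": 0.3},
    "agroforestry": {"yield": 0.6, "biodiversity": 0.8, "carbon": 0.8},
}

# Equal weighting, as implicitly applied when all indicators count the same
equal_weights = {"yield": 1.0, "biodiversity": 1.0, "carbon": 1.0}
# Hypothetical farmer weighting that prioritizes the main cash crop
farmer_weights = {"yield": 0.8, "biodiversity": 0.1, "carbon": 0.1}

equal_ranking = {name: weighted_score(s, equal_weights) for name, s in systems.items()}
farmer_ranking = {name: weighted_score(s, farmer_weights) for name, s in systems.items()}

for label, ranking in (("equal", equal_ranking), ("farmer", farmer_ranking)):
    best = max(ranking, key=ranking.get)
    print(f"{label} weights: {ranking} -> highest score: {best}")
```

In this illustrative case the agroforestry system scores higher under equal weights, whereas the full-sun system scores higher under the yield-oriented weights, which is why eliciting farmers' own weights matters for interpreting indicator results.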

Conclusion
To our knowledge, this is the first review study to provide and operationalize a systemic framework to quantify the impacts of diversification on multiple dimensions of sustainability. We showed that indicators used to quantify the impacts of diversification can be clustered in five main sustainability dimensions (biodiversity, ecosystem services, socio-economic factors, input use and resilience), which together provide a comprehensive and systemic perspective on ecosystem functioning.
In coffee agroecosystems, the impact of diversification was more positive than negative in all dimensions of sustainability, except for crop productivity. Yet, diversified systems can achieve higher yields with appropriate agricultural management, including the number and type of shade trees selected per area. Besides, lower yields may be compensated by reduced economic costs, as diversified systems are often associated with lower use of external inputs. Key to reaping the benefits of diversified systems is that the diversity of elements is carefully integrated considering the multiple interactions among ecosystem services and other system components and sustainability dimensions. A better understanding of synergies and trade-offs remains crucial for the customized design of diverse and sustainable systems for a variety of geo-climatic conditions.
While our review provides insights regarding the impact of diversification on sustainability, current methods and studies lack an integrated and interdisciplinary approach, and the focus should move from comparing contrasting systems (e.g. full-sun vs agroforestry) to considering the interaction among variables and system components. In addition, indicators are mostly measured at field level, whereas a landscape approach is especially relevant for understanding the regulation of ecosystem processes such as pest control, pollination and soil erosion. As a consequence, the mechanisms that regulate ecosystem functioning are still not widely understood. There are also gaps in terms of the location of studies and type of indicators used. Relatively few studies were conducted in important coffee-producing countries such as Vietnam, Indonesia, Ethiopia, Uganda and Peru. At the same time, indicators linked to pollination, water regulation, functional diversity, landscape diversity and integrity, social factors, input use and resilience are often underreported and need to be better explored in future studies.
Results are presented at global and continental level. The number of reported observations for each cluster is indicated by the number in brackets based on individual indicators. The responses are displayed as significant positive (green), negative (red) and neutral (light grey) or variable (dark grey). Responses are considered variable if they depend on other factors than diversification, such as season and location.
Code availability Not applicable.
Consent for publication Not applicable.

Conflict of interest The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.