1 Introduction

With a global population set to exceed 9 billion people by 2050, food production will need to continue expanding within the limitations of increasing natural resource scarcity and challenges posed by climate change (Opio et al. 2013). Global demand for milk is forecasted to grow at a rate of 1.1 percent per year in the coming decades (Opio et al. 2013). However, the dairy sector is a large contributor to global gaseous emissions accounting for approximately 30 percent of greenhouse gas (GHG) emissions from the livestock sector (4.6 gigatons CO2-eq, Opio et al. (2013)). In this context, dairy production faces sustainability challenges, such as an increasingly competitive global market and economic viability, high price volatility, and animal welfare concerns (Gerber et al. 2013), as well as growing social pressure on farmers (Dubois 2016). It is therefore necessary to pay greater attention to resolving sustainability issues. In that regard, it is essential to identify farming practices and technologies that can lead to desired sustainability outcomes. This requires a robust definition of agricultural sustainability, as well as holistic assessment tools able to account for the wide diversity of dairy production systems (DPSs) (Diaz de Otálora et al. 2022) across Europe and globally while maintaining flexibility to avoid favoring certain systems over others.

Over the years, the literature has reached consensus over the three dimensions of the sustainability concept (i.e., economic, environmental, and social). This has contributed to a clearer definition of sustainable agriculture, such as the one proposed by Lewandowski et al. (1999), and adopted in this article, namely, “management and utilisation of the agricultural ecosystem in a way that maintains its biological diversity, productivity, regeneration capacity, vitality, and ability to function, so that it can fulfil – today and in the future – significant ecological, economic and social functions at the local, national and global levels and does not harm other ecosystems” (page 185). Despite advancements in the literature to develop integrated methods for sustainability assessment (FAO 2014; Marchand et al. 2014; Grenz et al. 2009; Zahm et al. 2008), complexity and difficulties in assessing and measuring farm sustainability are still a challenge (Diaz de Otálora et al. 2021; Soulé et al. 2021). Due to the wide scope of the sustainability concept, the combined appraisal of various sustainability aspects is needed. This requires specific methodological approaches, data, and expertise from different scientific fields (Clay et al. 2020; Gésan-Guiziou et al. 2020). When evaluating DPSs, most of the studies have focused mainly on one sustainability dimension (Segerkvist et al. 2020; Lebacq et al. 2013) which does not provide a holistic view of dairy farm sustainability. In fact, some studies examined either economic sustainability (e.g., Borawski et al. (2020)), environmental sustainability (e.g., Acosta-Alba et al. (2012)), or social sustainability (e.g., Boogaard et al. (2011)), while few studies looked at the overall sustainability of dairy farms (Meul et al. 2008; Bélanger et al. 2015) and even fewer with a specific focus on the sustainability of GHG reduction practices. 
Consideration of only one dimension does not display the potential trade-offs and synergies that could occur across the multiple dimensions (Clay et al. 2020) and can thus lead to conclusions that would negatively affect the two others. Additionally, so far, the social dimension has been underrepresented in sustainability assessments and, more generally, in the literature (Lebacq et al. 2013). Most of the time, studies addressing social sustainability only refer to on-farm working conditions (Chopin et al. 2021) and/or animal welfare (Tallentire et al. 2019) and are seldom farmer centered. This highlights the need to rebalance the three dimensions in our understanding and application of the sustainability concept.

The challenge of assessing agricultural sustainability is to represent the many processes involved by comprehensively covering as many time and space scales as possible (Laborte et al. 2007; Lairez et al. 2015). Therefore, assessments have to be multidisciplinary and multifaceted, which requires an approach based on scientific knowledge and the use of objective indicators (Marchand 2010). To identify key factors affecting the sustainability of DPSs and handle multidimensional constraints, multicriteria assessments (MCA) are seen as suitable methodological tools due to their holistic and iterative approach (Craheix et al. 2015; Lairez et al. 2015; Angevin et al. 2017). MCAs decompose problems into a series of quantitative or qualitative indicators to integrate objective measurements and value judgments into decision-making. In this way, they make subjectivity explicit and manageable through organized inputs, allowing groups or individuals to explore complex questions of interest and take more informed decisions (Belton and Stewart 2002). To evaluate themes and sub-themes of a different nature and their interactions, MCAs may require transformations of the different indicators under study through various steps, including weighting and aggregation. The number of themes and sub-themes and how they are transformed can vary by MCA methodology and context, but it is often recommended to organize them hierarchically from general to specific considerations (Lairez et al. 2015).

In this article, we contribute to the literature by developing an ex post MCA model to evaluate the sustainability of DPSs with equal consideration of the three sustainability dimensions. Our model is based on the DEX methodology implemented in the DEXi software (Bohanec 2020) and will be hereinafter referred to as DEXi-Dairy. We focus specifically on the DEX MCA methodology to build our model because (i) it follows a participative and multidisciplinary approach and uses a wide panel of expertise to facilitate the consideration of context specificities and system issues (Angevin et al. 2017; Pelzer et al. 2012); (ii) it is able to transform quantitative into qualitative data, thereby allowing for their aggregation into simple qualitative scores, and it allows for trade-offs between accuracy of input variables and a large number of assessed impacts, thus facilitating the design and evaluation of complex systems; and (iii) it is adaptable and has the capacity to depict farm heterogeneity. Furthermore, the DEX method has been widely applied to develop multi-attribute decision-making models in different agricultural sectors as a means to face multidimensional problems in sustainability assessments. For instance, several models assessing sustainability in agri-food systems and based on the DEX methodology were created for cropping systems (Sadok et al. 2009; Bohanec et al. 2008; Colomb et al. 2013; Craheix et al. 2012; Pelzer et al. 2012; Vasileiadis et al. 2013), viticulture systems (Metral et al. 2015), orchard systems (Mouron et al. 2012), vegetable cropping systems (Rezaei et al. 2018), and aquaculture systems (Le Féon et al. 2021).

The objective of this article is first to present the method followed to build the new DEX-based model for DPSs, i.e., the DEXi-Dairy model, and then describe it in depth. Supporting methodological documents (including sustainability indicator handbook, data collection templates and guide) are available in a public repository (https://entrepot.recherche.data.gouv.fr/dataverse/Eragas_Milkey) to allow practitioners to use, replicate, and expand our model. The DEXi-Dairy model can be used by researchers, agricultural advisers, and technicians to assess the sustainability of any specialized dairy farm. Additionally, it can be used to test the effect of various GHG mitigation options on the three sustainability dimensions for European DPSs. Overall, this highlights the wide range of applications of the DEXi-Dairy model, including the analysis of potential sustainability enhancing solutions before encouraging their farm-level uptake or the comparison of sustainability performance across multiple DPSs. To demonstrate its use, we test the model with data from seven farms distributed across Europe.

2 Materials and methods

2.1 DEX method background

DEXi is a software tool for multi-attribute decision-making. Its objective is to help design qualitative multi-attribute decision models by supporting interactive and participative development. The software displays and assesses the different systems selected. The DEXi software breaks down a decision problem into several sub-problems, which are easier to solve than the primary one (Bohanec 2020). It associates a hierarchical decision model with an expert perspective based on qualitative attributes. The attribute values are discrete and expressed with qualitative statements such as “low, medium, high” rather than numbers (Craheix et al. 2015). These attributes are organized in a tree-shaped structure, building dependencies between attributes of different levels. Each decision is represented by a set of attributes that are assessed individually and then aggregated by the model up to the next levels. For the sake of clarity, each dimension (environmental, economic, or social) is divided into themes (third level), further divided into sub-themes (intermediary level), which are themselves assessed by indicators (lowest level). Indicators are directly assessed by the users through models (complex or operational) or direct measures (Lairez et al. 2015). In this way, each dimension gathers several themes, and their respective sub-themes and indicators (Fig. 1), except for the social dimension, which was only composed of themes and indicators in this article.

Fig. 1.

Organization of a decision rule-based, hierarchical multi-attribute model for a theoretical example of one theme (adapted from Bohanec (2020) and Craheix et al. (2015)) with x1..n, list of inputs to calculate the indicators; X1..n, indicators; F(X,Y), the utility function; Y1..n, sub-themes; and Z1..n, themes. Between each level, a weight in percentage is attributed to each sub-attribute

A five-step conceptual framework for building a DEXi model was proposed by Craheix et al. (2015) and recently refined by Le Féon et al. (2021):

  (i) First, an analysis and design plan is established. It constitutes one of the most important steps of the MCA model, where assessment boundaries, objectives, actors, and research institutes involved are thoroughly defined (Bohanec 2020).

  (ii) In the second step, a hierarchical tree is built. This can be done with two different approaches, i.e., top-down and bottom-up. In the top-down approach, the design starts from the root of the tree (the upper part: the “dimension” level) and proceeds to the leaves (the lowest part: the “indicator” level). The tree is built by a panel of experts based on the objectives of the assessment and the context of the targeted system defined in step (i). Alternatively, the bottom-up approach starts from the selection and design of indicators and aggregates them to the superior levels (sub-themes, themes, and then dimensions). This approach is participative, involving strong interactions among different stakeholders.

  (iii) Next, indicators have to be selected and/or designed for each sub-theme. The quantitative indicators, based on data collection, calculation, or model estimations, are transformed into qualitative values (named scales), such as “high, medium, low” (Fig. 1). In the DEX methodology, this step is named “scaling” (Bohanec 2020). It is usually done using threshold values from the academic and policy literature. Since these thresholds highly influence the qualitative value taken by indicators, one of the main challenges is to find the best compromise between result accuracy and simplicity of model use (Craheix et al. 2015). It is not recommended to associate too many indicators with too many qualitative scales. For each indicator, the number of classes on the scales can go up to seven, but it is advised to keep this number as low as possible. It is then recommended to increase the number of classes in the superior levels (Bohanec 2020).

  (iv) The next step is the parametrization process, i.e., building the decision rules between lower levels (X, Y) and their corresponding upper levels (Y, Z) (Fig. 1). This is presented as a key step in the DEX methodology, where the attributes are aggregated to the next attribute level. The aggregation is done by a utility function (F(X,Y)) and a weighting procedure (Fig. 1). Concretely, a weight is attributed to each attribute, and a utility function is defined. This function takes the shape of a lookup table using “IF-THEN” decision rules, which fix the value of an aggregated attribute depending on the values of its immediate descendant attributes (e.g., IF indicator X1 is “high” AND indicator X2 is “low” THEN sub-theme Y1 is “medium”). The maximum number of indicators to aggregate is four in the DEXi software, but it is advised not to combine more than three indicators (Bohanec 2020).

  (v) The last step consists of the evaluation and interpretation of the results.
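Steps (iii) and (iv) can be illustrated with a minimal sketch. The thresholds, class labels, and rule table below are hypothetical and are not taken from DEXi-Dairy or from the DEXi software; the sketch only shows how a quantitative indicator could be scaled to a qualitative class and how an “IF-THEN” lookup table could then aggregate two indicators into a sub-theme. It also assumes that a value equal to a threshold falls into the upper class and that larger values map to “high”, which would need to be reversed for indicators where lower values are preferable.

```python
from bisect import bisect_right

# Hypothetical three-class scale, ordered from lowest to highest.
CLASSES = ["low", "medium", "high"]


def scale(value, thresholds, classes=CLASSES):
    """Scaling step: map a quantitative value to a qualitative class.

    `thresholds` must be sorted in ascending order and contain
    len(classes) - 1 values. Assumption: a value equal to a threshold
    is assigned to the upper class.
    """
    if len(thresholds) != len(classes) - 1:
        raise ValueError("need exactly len(classes) - 1 thresholds")
    return classes[bisect_right(thresholds, value)]


# Hypothetical utility function written as a lookup table:
# (class of X1, class of X2) -> class of sub-theme Y1.
RULES = {
    ("low", "low"): "low",
    ("low", "medium"): "low",
    ("low", "high"): "medium",
    ("medium", "low"): "low",
    ("medium", "medium"): "medium",
    ("medium", "high"): "high",
    ("high", "low"): "medium",
    ("high", "medium"): "high",
    ("high", "high"): "high",
}


def aggregate(x1_class, x2_class):
    """Parametrization step: apply the IF-THEN rule for (X1, X2)."""
    return RULES[(x1_class, x2_class)]


# Example with made-up indicator values and thresholds.
x1 = scale(25.0, thresholds=[10.0, 20.0])  # "high"
x2 = scale(5.0, thresholds=[10.0, 20.0])   # "low"
y1 = aggregate(x1, x2)                     # "medium", as in the rule quoted in step (iv)
```

Writing the utility function as an explicit table keeps every rule visible, which is what allows experts to check and correct individual rules.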

The development of DEXi-Dairy is detailed in the next sections.

2.2 Development of the DEXi-Dairy model

Figure 2 details the framework used to develop the DEXi-Dairy model. It presents the different steps followed in this study and situates them in relation to the general DEX method described in Section 2.1.

2.2.1 Definition of the working groups

The DEXi-Dairy hierarchical tree was created following a bottom-up approach through eighteen participative workshops. Different researchers and agricultural advisers in dairy production, GHG emissions, economy, sociology, and environmental sciences were involved in these workshops. They were selected from a wider consortium based on their expertise. Three groups were defined to work on each specific dimension of the dairy sustainability assessment (Fig. 2, (1)). Each group was composed of 8–12 experts; 3 of them attended all the workshops to check the overall consistency of the tree. The groups defined indicators, sub-themes, and themes for each dimension and developed the parameters used to connect them, as described in more detail below. At each workshop and for each dimension, the latest version of the tree was presented. Discussions on potential modifications were held, until an agreement was reached. Then, objectives for the next workshop were determined.

Fig. 2.

Framework to design the DEXi-Dairy model: (1) to (9) refer to the different steps followed to design DEXi-Dairy, while the boxes and numbers in red correspond to the conceptual framework proposed by Craheix et al. (2015) and adapted by Le Féon et al. (2021)

2.2.2 Definition of the indicator library

Based on the literature and previous project reports, notably from the IDEA, MASK, and SIMTAP projects (Bergez 2013; Le Féon et al. 2021; Zahm et al. 2008), an initial list of indicators was established for the environmental and economic dimensions. Then, a first version of the hierarchical tree was built taking into account the list of indicators, their inputs, methods, and assessment scales, as well as the study’s objectives (Fig. 2, (2)). This preliminary hierarchical tree was proposed to the working groups for discussion. The sources used for each dimension are detailed below.

Environmental indicators were retrieved from technical reports (Koch and Salou 2015; Pellerin et al. 2013) and the life cycle assessment (LCA) methodology (ISO 2006). LCA was chosen to be consistent with the recommendations of the Food and Agriculture Organization of the United Nations (FAO) (FAO 2016) and the European Union (EU) (European Commission 2013) concerning the environmental impacts of livestock production systems. The approach was then completed by the BIOTEX method, proposed by Manneville et al. (2014), to cover impacts on biodiversity.

Economic indicators were retrieved from a combination of sources including EU Farm Accountancy Data Network (FADN) reports and technical documents from the European Commission (European Commission 2018a, b, 2020) and the Teagasc National Farm Survey (NFS) 2019 sustainability report (Buckley and Donnellan 2020).

In the social dimension, indicators were adapted from the results of qualitative research and a literature analysis. Firstly, a literature review regarding the social sustainability of agricultural production systems was conducted to define the theoretical framework of the assessment. As a consequence, the themes were determined using Parsons’ social system theory (Turner 1991), and indicators were derived from Maslow’s concept of needs (Maslow 1943). Needs considered in this study included the following: (1) physiological and security needs, (2) social needs, and (3) esteem and self-actualization needs. Secondly, after selecting the theoretical concepts to be included in the social branch (Baillet et al. 2022a), 15 unstructured in-depth interviews were carried out with farmers from Norway, Poland, and Ireland during fall 2021. These were based on a methodology focused on discovering theory through data, first published by Glaser and Strauss (1967) in a book entitled The Discovery of Grounded Theory and well established in the social sciences. The aim was to examine what the selected concepts entailed for farmers and to identify statements that could represent them. Grounded theory is based on the assumptions that social reality is best understood by the actors involved in it and that the researcher goes into the field without pre-conceptualized categories of analysis and indicators. Purposive sampling was conducted for this part of the research. Specifically, participating farms had to be family owned and operated and specialized in milk production, with more than two-thirds of their standard output coming from the dairy enterprise (Eurostat 2022). In twelve cases, the interviews were conducted with the main owner of the farm. In three cases, married couples managing the farm participated in the interviews. Three women and twelve men participated in the interviews.
According to the grounded theory, research material was collected as part of the interviews and observations in subsequent interactions. After conducting the interviews, the interviewers prepared their summaries, and then the team of sociologists started a series of workshops to develop the content of the indicators. The analytical process of the data collected during the interviews included data reduction, data display, and the derivation/confirmation of conclusions (Silverman 2016). By applying a comparative method covering all in-depth interviews, categories of analysis were generated. In conjunction with the established social framework, affirmative statements were developed and became the basis for the social questionnaire implemented on case study farms. Statements were used to populate the indicators of the social branch. Notably, they measure farmers’ level of (dis)agreement with various social issues identified in the interviews on 5-point Likert scales. They were structured based on Maslow’s concept of needs (Maslow 1943) to generate three social sustainability indicator indexes. Continuous values taken by the indexes were obtained by averaging responses to the set of statements grouped under each of these three need categories.

After indicator identification, the list of all sustainability indicators was refined and restructured by each working group through successive review workshops based on the relevance of each indicator/sub-theme/theme for the goal pursued by the MCA (Fig. 2, (3)). For each dimension, working groups were asked the following questions: (i) “Are the indicators/sub-themes/themes relevant for assessing DPSs and sensitive to the implementation of GHG mitigation options?”. If deemed relevant, indicators/sub-themes/themes were classified into “major issue,” “important,” and “not so important,” with two follow-up questions: (ii) “Are there any additional indicators missing?” and (iii) “Do the names need to be refined?”. After the first workshop, two series of additional questions were added: first, on the tree structure, “Do we validate the hierarchical connections among indicators, sub-themes, and themes?”, and second, on the definition of the indicators, (i) “Are the data, scales, and thresholds easily available for this indicator?” and (ii) “Is the indicator likely to be sensitive to differences among DPSs?”. Then, each branch of the tree was modified accordingly. At the end of the iterative review process, the final tree was validated by all project partners (Fig. 2, (4)).

2.2.3 Definition of number of classes and scales

In order to transform quantitative indicators into qualitative ones (scaling procedure from “(very) low” to “(very) high” (Fig. 1 and Fig. 2, (5))), the number of classes and the lower and upper bounds of each class were determined. Although the number of classes per indicator can go up to seven, we followed recommendations to have as few as possible (Bohanec 2020). In DEXi-Dairy, continuously defined environmental indicators had between three and five classes, economic indicators had four classes, and social indicators had three classes. Of all the sustainability indicators used in this study, only one was qualitative, i.e., participation in an agri-environmental scheme (environmental dimension). For this indicator, two classes were used in accordance with its dichotomous nature. As for the upper levels of the tree (i.e., sub-themes, themes, and dimensions), the number of classes was set to the minimum, as suggested by Craheix et al. (2015). However, it was at least as high as the number of classes of their corresponding lower levels, i.e., the number of classes of sub-theme Y1 cannot be less than the number of classes of the indicators X1, X2, and X3 (Fig. 1). Following the recommendations of Carpani et al. (2012), random values were attributed to the indicators using the Monte Carlo method, and different numbers of classes were assigned to their upper levels. The distribution of random scenarios was observed, and then the best combination of classes was assigned for the hierarchical tree through observation plots and Kolmogorov-Smirnov (KS) normality tests.
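The Monte Carlo check described above can be sketched as follows. This is an illustrative assumption rather than the procedure actually implemented: the function name, the example weights, and the weighted-mean aggregation are hypothetical stand-ins for the DEXi utility functions, and the sketch only reports the share of random scenarios falling into each upper-level class (the KS test is not reproduced).

```python
import random
from collections import Counter


def simulate_class_distribution(n_indicator_classes, n_parent_classes,
                                weights, n_draws=10_000, seed=0):
    """Share of random scenarios falling into each upper-level class.

    n_indicator_classes: number of classes of each indicator, e.g. [3, 3, 4]
                         (each must be at least 2).
    n_parent_classes:    candidate number of classes for the upper level.
    weights:             hypothetical weights of the indicators (sum to 1).
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_draws):
        # Draw one random class index (0 .. k-1) per indicator.
        draws = [rng.randrange(k) for k in n_indicator_classes]
        # Normalise each class index to [0, 1] and take the weighted mean.
        score = sum(w * d / (k - 1)
                    for w, d, k in zip(weights, draws, n_indicator_classes))
        # Cut [0, 1] into equal-width bins, one per upper-level class.
        parent = min(int(score * n_parent_classes), n_parent_classes - 1)
        counts[parent] += 1
    return [counts[c] / n_draws for c in range(n_parent_classes)]


# Compare two candidate numbers of classes for the same sub-theme.
for n_parent in (3, 4):
    shares = simulate_class_distribution([3, 3, 3], n_parent, [0.5, 0.3, 0.2])
    print(n_parent, [round(s, 3) for s in shares])
```

Comparing the resulting shares across candidate class numbers gives a rough idea of whether some upper-level classes would be almost never reached.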

By nature, indicator classes are ordered and bounded by threshold values, derived from literature-based sources (also called reference values, as defined by Lebacq et al. (2013) and Acosta-Alba et al. (2012)) or data-driven sources. Before describing the choice of lower and upper bounds in detail, it is worthwhile to mention that in the context of this research, DEXi-Dairy was implemented to compare farms representative of key national production systems from different European countries. As underlined by Le Féon et al. (2021), comparing production systems at European level requires the use of common scales even if sustainability performance, in particular in the economic and social branches, can differ across regions. Hence, the choice was made to develop common, non-country-specific scales to compare performance among case study farms. Due to the flexibility and adaptability of the DEXi-Dairy tool, indicator scales (including number of classes and/or threshold values) could be adjusted to reflect more specific national or regional conditions in future work. Specifically, our thresholds were defined using a three-step approach. In the first step, reference values were selected for all sustainability indicators. For the environmental indicators, these were derived from the literature, except for one indicator (percentage of cultivated area irrigated). In this latter case, reference values were defined using the data collected on case study farms. For most economic indicators, reference values were based on the distribution of data of specialized dairy farms from the EU FADN public database, where information from all EU Member States was used for the years 2016–2018 (European Commission 2019). For two of the economic indicators (loan repayments per farm gross margin and ratio of dairy and dairy animal sales to total sales), the data needed to scale them through the EU FADN were not available.
In such cases, reference values were defined using the data collected on case study farms. As for the social indicators, reference values were defined using cut-off values determined through expert opinion. Notably, because these indicators were obtained by averaging responses over 5-point Likert scales, where points took values between 1 and 5, two cutoff values were chosen as follows: 1.5 and 3.5. In the second step, the suitability of selected reference values to determine threshold values was assessed by project partners. Based on expert opinion and if/where needed to better reflect the specificities of national production systems, reference values were adjusted to form suitable thresholds. In the third step, the ability of selected threshold values to discriminate farms against each other was tested by populating the DEXi-Dairy model with case study data. A final adjustment to threshold values was made if/where needed to better differentiate farms based on their sustainability performance.

Please note that the list of references for reference values is provided in supplementary material A1. Additionally, more detail about indicator definitions, formulas, and scales (including final threshold values) is available in Baillet et al. (2022a).

2.2.4 DEXi-Dairy parametrization

To build decision rules among levels of the DEXi-Dairy tree in the aggregation step, it was necessary to define weights and utility functions. Weights (aggregation factors) were determined by project partners through online surveys (Fig. 2, (6)) and then adjusted through an optimization process. In a first step related to the elicitation of the survey weights, at least two participants per partner institute responded to the surveys, where they were individually asked to weight tree nodes against each other at each hierarchical level. Weights had to add up to 100% at each level. When the pool of obtained weights was not too divergent, the mean value was selected to be used in the final DEXi-Dairy tool. Conversely, when the weights were divergent, a workshop in each working group was carried out in order to find an agreement over the most suitable value, as suggested by Le Féon et al. (2021). In a second step related to the optimization process, for each aggregation node, the utility functions were calculated by the DEXi software using a multiple linear regression formula (y = a x + b), where the survey weights of each indicator from the previous level (X1, X2, X3 or Y1, Y2) and the number of classes of the following level (Y1 or Z1) were used to calculate a and b (Fig. 1). Then, the DEXi software rounded up the value of y by adjusting the weights of each indicator from the previous level (X1, ...). We used these adjusted weights proposed by the model to generate all the utility functions of the DEXi-Dairy model. The generated matrix was then exported and qualitative classes were transformed into quantitative classes. The mean of each aggregation rule was calculated and ranked. The matrix was then checked line by line for consistency and corrected if the results deviated from expert knowledge as suggested by Le Féon et al. (2021) and Soulé et al. (2023) (Fig. 2, (7)).
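As a rough illustration of how weights can generate a full table of decision rules, the sketch below enumerates every combination of lower-level classes and assigns an upper-level class from a weighted mean. It is a simplified stand-in, not the algorithm used by the DEXi software: the function name, the normalisation, and the half-up rounding are assumptions, and the weights are hypothetical.

```python
import math
from itertools import product


def build_utility_table(weights, n_child_classes, n_parent_classes):
    """Enumerate all decision rules implied by a set of weights.

    weights:          weights of the lower-level attributes (sum to 1).
    n_child_classes:  number of classes of each lower-level attribute
                      (each must be at least 2).
    n_parent_classes: number of classes of the aggregated attribute.
    Classes are coded as integers, 0 being the lowest class.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must add up to 1 (i.e. 100%)")
    table = {}
    for combo in product(*(range(k) for k in n_child_classes)):
        # Weighted mean of the normalised child classes, in [0, 1].
        score = sum(w * c / (k - 1)
                    for w, c, k in zip(weights, combo, n_child_classes))
        # Rescale to the parent scale and round half up (assumption).
        table[combo] = math.floor(score * (n_parent_classes - 1) + 0.5)
    return table


# Two indicators with three classes each, equal hypothetical weights,
# aggregated into a three-class sub-theme.
table = build_utility_table([0.5, 0.5], [3, 3], 3)
for rule, outcome in sorted(table.items()):
    print(rule, "->", outcome)
```

Printing the table line by line mirrors the consistency check described above, where each rule is reviewed and corrected when it deviates from expert knowledge.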

2.2.5 Data collection

Twelve case study farms were identified by project partners to represent key national dairy production systems in Ireland, Germany, Greece, France, Norway, and Poland. To participate in this study, farms had to be family owned and operated, and specialized in milk production. While the environmental and economic data could be collected with the help of multiple farm workers, the farm’s owner and main decision-maker was to answer the social survey. Participation in the study was voluntary and farmers’ consent was obtained at the beginning of the data collection process. The data collection started in the fall of 2021 and was initially concerned with obtaining environmental and economic data for the 2020 accounting year. It ended with the social study carried out in February and March 2022. The data was collected by data recorders from project partner institutes through farm visits and phone interviews. In total, the data collection process took about three hours per farm.

To help data recorders in the collection process, support tools were created, including four data templates in Excel format for the environmental and economic farm data (Fig. 2, (8)) (Baillet et al. 2022b), a data collection guide with a detailed description of all necessary environmental and economic information (Baillet et al. 2022c), and a social questionnaire. The first data template was dedicated to general farm information (infrastructure, manure management, water and energy consumption), crop production (crop management sequence of arable crops and grassland), and biodiversity (habitat diversity). The second template included information regarding dairy herd management practices (herd inventory, diets, milk production, and manure management) and the description of the milking parlor. The third template gathered beef herd management practices. As for the economic data, a specific template was created including information regarding labor, costs, earnings from crop and livestock production, animal product sales and services, other gainful activities, and subsidies. The social questionnaire consisted of eight parts, with seven parts containing statements about the fulfilment of farmers’ needs related to farm work (job security, physical strain, risk of accidents, work support, financial security, social contacts at work, job satisfaction, career plans, investments) and to life outside of the farm (free time, access to services and institutions, social contacts in private life, integration into local community, loneliness, self-actualization), and a part with socio-demographic characteristics. Data recorders were trained through a series of three workshops regarding the collection of environmental and economic data and the principles of social research.

2.2.6 Description of the case studies

Seven contrasting case studies are presented in this study (Table 1) to demonstrate the application of the DEXi-Dairy tool. These case studies represented a wide range of climatic conditions (average temperatures between 3 and 12 °C, and annual precipitation from 490 up to 1614 mm), herd size (from 24 to 185 dairy cows), and length of grazing season (from 0 to 259 days). As for the grassland area, it varied from 38.8 to 100% of the total utilized agricultural area (UAA). A description of selected farms is provided in Table 4 and is completed in Balaine et al. (2023). They were located in the North (FR01) and South-East (FR02) for the French case studies, in the North-West (DE01) and North-East (DE02) for the German case studies, in the South (IR01) and West (IR02) for the Irish case studies, and finally in the Nordland region (NO01) for the Norwegian case study. At the time of the study, the main farm owners were men in six out of seven case studies. Farmers were aged 37 to 69, mostly with formal agricultural education (six out of seven farms). In two cases, the main farm holder had an off-farm job or received a pension.

Table 1 Mean characteristics of the case study farms in the four studied countries (FR France, IR Ireland, DE Germany, NO Norway)

2.2.7 Indicator calculation

Most of the environmental indicators are based on the LCA methodology according to the ISO standards (ISO 2006). The system boundaries considered in this study were from cradle to DPS gate and included the production of capital goods (infrastructure and equipment), the production of chemicals such as fertilizers, the production of feeds and concentrates, farm operations (including local emissions of pollutants), the use of energy sources, and transportation. The functional unit was 1 kg of fat-protein-corrected milk (FPCM, 4% fat and 3.3% protein as recommended by Koch and Salou (2015)). Concerning the life cycle inventories (LCIs), on-farm emission calculations were based on guidelines proposed by Koch and Salou (2015), as implemented in the MEANS IN-OUT online platform (https://pfmeans.inra.fr/means/login.jsp, Auberger et al. (2018)). Emission factors were adapted for each country if relevant. Nitrate leaching was calculated according to the INDIGO® v3 method (Bockstaller et al. 2022). The LCAs at farm scale were performed for each case study farm using the Simapro software. The calculation of impact categories was based on ILCD 2011 Midpoint indicators for climate change and water resource depletion (JRC 2012), CML-IA baseline v3.05 for eutrophication and acidification (Guinee et al. 2002), and CED 1.11 for total energy demand (Frischknecht et al. 2015), as implemented in the Simapro software. The background data came from the Ecoalim database (Wilfart et al. 2016) for feed ingredients, from the Agribalyse® database for agricultural operations, machinery, and inputs, and from ecoinvent v3.8® for other background data (national energy mix, transportation, and infrastructure). For the biodiversity indicators, the Shannon Index (Shannon 1948), adapted from Di Gregorio and Jansen (2000), Herzog et al. (2013), and Walz (2011), was used to assess habitat diversity.
Two other indicators (grassland management and participation in an agri-environmental scheme) were adapted from indicators proposed by Manneville et al. (2014).
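As an illustration of the habitat diversity indicator, the Shannon Index can be computed from the share of farm area occupied by each habitat type, H = -sum(p_i * ln(p_i)). The following is a minimal sketch only: the function name, habitat categories, and areas are hypothetical, and the adaptations made for DEXi-Dairy (following Di Gregorio and Jansen (2000), Herzog et al. (2013), and Walz (2011)) are not reproduced here.

```python
import math


def shannon_index(areas_ha):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over habitat area shares."""
    total = sum(areas_ha.values())
    if total <= 0:
        raise ValueError("total area must be positive")
    h = 0.0
    for area in areas_ha.values():
        if area > 0:  # habitats with zero area contribute nothing
            p = area / total
            h -= p * math.log(p)
    return h


# Hypothetical habitat areas (ha) for one farm; not data from the case studies.
example_farm = {"permanent grassland": 20.0, "cropland": 8.0, "hedgerows": 2.0}
h_example = shannon_index(example_farm)
```

A farm with a single habitat type gives H = 0, and H increases as the area is spread more evenly over more habitat types.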

In the economic dimension, indicator formulas were defined using EU FADN specification documents (European Commission 2020). For each case study, the farm data was directly used to populate the formulas.

Concerning the social indicators, the theoretical concepts were translated into questions in the social questionnaire. These consisted of 23 affirmative statements referring to the three categories of indicators, each of which was analyzed with respect to the sustainability of farm work and of life outside of farming. Responses to the statements were measured on a 5-point Likert scale (strongly disagree, disagree, neither agree nor disagree, agree, and strongly agree). Each social indicator was obtained by averaging farmers’ responses to the statements referring to that specific indicator and theme.
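The averaging step can be sketched as follows. This is an illustrative example, not the project's implementation: the 1 to 5 numeric coding, the statement identifiers, and the responses are assumptions, and any reverse coding of statements is left out.

```python
from statistics import mean

# Assumed numeric coding of the 5-point Likert scale.
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neither agree nor disagree": 3,
    "agree": 4,
    "strongly agree": 5,
}


def indicator_scores(responses, statement_map):
    """Average Likert codes over the statements of each (theme, indicator)."""
    grouped = {}
    for statement_id, answer in responses.items():
        key = statement_map[statement_id]
        grouped.setdefault(key, []).append(LIKERT[answer])
    return {key: mean(values) for key, values in grouped.items()}


# Hypothetical mapping of statements to (theme, indicator) pairs.
statement_map = {
    "s1": ("farm work", "social needs"),
    "s2": ("farm work", "social needs"),
    "s3": ("life outside of farming", "social needs"),
}
# Hypothetical answers from one farmer.
responses = {"s1": "agree", "s2": "neither agree nor disagree", "s3": "disagree"}
scores = indicator_scores(responses, statement_map)
```

The resulting mean per indicator would then be compared with the indicator's threshold values to obtain a qualitative class.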

A detailed description of the values taken by each indicator is available in Balaine et al. (2023).

3 Results and discussion

The method previously described was used to build the DEXi-Dairy tree, which is divided into three main branches related to the three sustainability dimensions. DEXi-Dairy considered each sustainability dimension equally (weighting factors of 33.3% each) in order to capture the whole concept of sustainability. Figure 3 describes the structure of the DEXi-Dairy model including the weighting factors at each node. To describe the complexity and characteristics of DPSs, the model gathered 40 indicators, including 22, 12, and 6 indicators for the environmental, economic, and social branches, respectively. A summary of each branch (including attributes, weighting factors, scales, thresholds, and units) is provided in supplementary material A1. The whole description is available in Baillet et al. (2022a).

Fig. 3 DEXi-Dairy tree. Green boxes represent the environmental dimension, orange boxes represent the economic dimension, and the purple boxes represent the social dimension. The red boxes indicate the weight of the attributes in percentage

3.1 Application to case studies

For each case study, qualitative scores were obtained at each level of the sustainability tree by populating the DEXi-Dairy model with values calculated for each indicator. The overall sustainability scores were obtained through the process of aggregation and are reported in Fig. 4. Three groups of farms can be identified, with sustainability scores ranging from low to medium for NO01; through medium for FR01 and FR02; and up to medium to high for IR01, IR02, DE01, and DE02. Nevertheless, farms with identical scores for the overall sustainability did not perform equally across the three sustainability dimensions. Even though IR01, IR02, and DE01 achieved medium to high scores in the environmental and economic assessments, their medium values for social sustainability could not be compensated. Similarly, DE02 had a medium score for environmental sustainability, which was not compensated by its medium to high scores for the two other dimensions. As expected, FR01 and FR02 obtained a medium sustainability score since the scores for each sustainability dimension were medium, except for FR01’s economic dimension for which the score was medium to high. NO01 obtained the lowest sustainability score of all case studies, i.e., low to medium, remaining on the lower end for each of the three dimensions (low to medium, medium, and low to medium for the environmental, economic, and social dimensions, respectively).
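To make the effect of equal dimension weights concrete, the sketch below aggregates three dimension scores into one overall class. It is a deliberate simplification: DEXi aggregates through qualitative decision rules, whereas this example uses a weighted mean of ordinal ranks rounded to the nearest class. The seven-class scale and the function name are assumptions made for illustration; the dimension scores are those reported above for IR01 and NO01.

```python
import math

# Assumed ordinal scale, from lowest to highest class.
SCALE = ["very low", "low", "low to medium", "medium",
         "medium to high", "high", "very high"]


def aggregate(scores, weights):
    """Weighted mean of ordinal ranks, rounded half up to the nearest class."""
    total_weight = sum(weights[d] for d in scores)
    mean_rank = sum(weights[d] * SCALE.index(scores[d]) for d in scores) / total_weight
    return SCALE[math.floor(mean_rank + 0.5)]


# Equal weights, i.e., 33.3% per dimension.
EQUAL_WEIGHTS = {"environmental": 1, "economic": 1, "social": 1}

ir01 = {"environmental": "medium to high", "economic": "medium to high", "social": "medium"}
no01 = {"environmental": "low to medium", "economic": "medium", "social": "low to medium"}

overall_ir01 = aggregate(ir01, EQUAL_WEIGHTS)  # "medium to high"
overall_no01 = aggregate(no01, EQUAL_WEIGHTS)  # "low to medium"
```

In this simplified setting, one dimension that is a class lower than the other two does not change the overall class, which mirrors the pattern described for IR01, but it should not be read as a description of the actual DEXi-Dairy decision rules.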

Fig. 4 Sustainability score for each case study using the DEXi-Dairy model (FR France, IR Ireland, DE Germany, NO Norway). Triangles correspond to the individual score for each dimension of sustainability and the diagram to the overall sustainability score of each case study. The gradient of color is distributed as follows:

Tables 2, 3, and 4 provide the environmental, economic, and social profiles of the case studies. The case studies differed strongly in their environmental performance, even for the same dimension score (Table 2). For the medium to high group (IR01, IR02, and DE01), IR02 had a high score for the themes “adopting best dairy herd management practices,” “environmental quality,” and “abiotic resources conservation” but a medium score for the theme “biodiversity conservation.” IR01 shared similarities with IR02, except for the theme “adopting best dairy herd management practices” for which it achieved a medium score. DE01 presented a pattern similar to that of IR02, but differences occurred at the sub-theme level, in “soil quality” and “reducing pesticide use,” for which the scores were low to medium but were compensated by a medium to high score in “feed efficiency.”

Although FR01, FR02, and DE02 had an equivalent medium score for environmental sustainability, their profiles were very different. FR01 had mainly medium values for all indicators, with higher scores for “local production” (medium to high) and “by-products used in diet” (high) and lower scores for “contribution to climate change” (i.e., its contribution to climate change was high) and “biodiversity conservation.” FR02 had more contrasting scores, with low values for “feed efficiency,” “by-products used in diet,” “age at first calving,” “water quality,” and “contribution to climate change” and high scores for “soil quality” and “abiotic resources conservation.” For DE02, high scores were encountered for “air quality,” “soil quality,” and “reducing water use.”

The Norwegian case study, NO01, presented the lowest environmental score of all case studies, i.e., low to medium. This could be explained by the lower scores obtained for the themes of “environmental quality” and “biodiversity conservation” (very low and low, respectively) and the medium score achieved for “abiotic resources conservation.” A high score was obtained only for “adopting best dairy herd management practices.” It is important to note that all case study farms had poor scores in “biodiversity conservation,” even if some (DE02, IR01, and IR02) had a high score for “reducing pesticide use.”

Table 2 Assessment of the environmental branch for each case study using the DEXi-Dairy model (FR: France, DE: Germany, IR: Ireland, NO: Norway). The gradient of color is distributed as follows: from dark red for a very low score to dark green for a very high score

In terms of economic sustainability, most of the case studies had a medium to high score, whereas two farms (FR02 and NO01) obtained a medium score (Table 3). These two farms had the lowest score of all case studies for the theme “resilience” (i.e., low) and a medium score for the theme “profitability.” Concerning the other case studies, even if their overall economic sustainability score was identical, their profiles varied strongly. Indeed, the Irish and German farms had high performances in terms of the sub-themes “profit,” “dependence on the market rather than on subsidies,” and “direct production costs per farm gross output,” whereas the strongest score (i.e., very high) for FR01 was for the theme “efficiency.” For this farm, 2 out of 3 “efficiency” sub-themes scored high.

Table 3 Assessment of the economic branch for each case study using the DEXi-Dairy model (FR: France, DE: Germany, IR: Ireland, NO: Norway). The gradient of color is distributed as follows: from dark red for a very low score to dark green for a very high score

Some structural elements could explain the variability observed among the different case studies in terms of environmental and economic profiles. The Norwegian farm, NO01, is the smallest farm in the sample with 38.7 livestock units (LU) and 30 ha of UAA. It is also the farm that was the most intensive user of N fertilization on grassland (325.0 kg organic N.ha−1 and 109.3 kg mineral N.ha−1) using a fan spreader. The management and the rate of fertilization led to the highest indicator values of the sample in terms of climate change, eutrophication, acidification, and erosion risk (see supplementary material in Balaine et al. (2023)). In addition to having an adverse effect on farm costs, this high rate of fertilization resulted in larger reactive N emissions to air and water, principally in the form of ammonia (NH3), nitrates (NO3), and nitrous oxide (N2O) (Oenema et al. 2007). In particular, N losses, in the form of NH3 volatilization and NO3, and the consequent indirect N2O emissions, significantly contribute to climate change, eutrophication, and acidification potential, thus representing a major human health risk (Bowles et al. 2018). Furthermore, FR01 and FR02 had a similar herd size (74 cows on average). However, FR01 had a higher productivity (9,567 l.cow−1), combined with higher N fertilization on grassland (156.6 kg mineral N.ha−1 and 35.0 kg organic N.ha−1). As for FR02, it had a lower productivity (5,433 l.cow−1), with cows mainly reared on grassland (94.6 % of the UAA was grassland) and with a low mineral fertilization rate (6.9 kg mineral N.ha−1). IR01 and IR02 are typical Irish farms, with 100 % of the UAA used for grassland. Nevertheless, these two farms differed in their farm stocking rate (2.02 and 2.70 LU.ha−1, respectively) and their fertilization scheme. Indeed, IR02 applied 2.6 times more kg organic N.ha−1 on grassland than IR01.
DE02 was an organic dairy farm with extensive grasslands (230 ha) and a low farm stocking rate (0.55 LU.ha−1). In contrast, DE01 had a highly productive herd (10,870 l.cow−1), as well as a high farm stocking density (2.57 LU.ha−1).

As for the social sustainability indicators, the overall scores differed strongly among case studies (Table 4). IR02 obtained a high score, whereas NO01 obtained a very low score. For this farm, all the indicators obtained the lowest value of all case studies (i.e., low), except for social needs related to the theme “sustainability of farm work,” where a medium score was reached. FR01, FR02, DE01, and IR01 obtained a medium social sustainability score. A detailed analysis showed that in the theme of “sustainability of farm work,” the lowest scores of all case studies were obtained by farms FR01, IR01, and NO01 (i.e., low to medium, low to medium, and low, respectively). In this theme, DE01 obtained a low score for the indicator “esteem and self-actualization needs,” encompassing statements about the general condition of agriculture in a given country, and personal and farm development. Similarly, a low score for this indicator was obtained by the Norwegian farm, NO01. This farm also scored low for the indicator “physiological and security needs,” measuring the fulfilment of needs related to health, economic security, and social welfare. In addition, the data indicated that most of the respondents were convinced of the negative impact of their work on health, both in the context of increased physical effort and higher accident risk. In both cases, this applies to six out of seven surveyed farmers. The exception was IR02, where this sphere of work was rated highly. Conversely, the other Irish farm, IR01, obtained a low score for the “social needs” indicator, encompassing statements successively concerning social contacts with other farmers, feeling of loneliness at work, and the potential continuation of the farm, for example, after the farm holder retires. High scores for “sustainability of farm work” concerned only IR02, which showed high scores for the “physiological and security needs” and “social needs” indicators.
In this case, the indicator “esteem and self-actualization needs” obtained a medium score. The theme of “sustainability of life outside of farming” also differentiated surveyed farmers. The lowest scores of all case studies were obtained by NO01 (i.e., low) and FR02 and DE01 (i.e., medium). The lowest score of FR02 concerned the fulfilment of social needs in the farmer’s private life (i.e., low), measured by the “social needs” indicator. This indicator is characterized by statements that refer to time for family and potential conflicts with farming tasks, as well as the feeling of loneliness or isolation both in relation to respondents and their closest family members. The lowest level scored for the indicator “esteem and self-actualization needs” was shown by DE01 (i.e., low). It was measured by statements referring to work-related difficulties in realizing one’s desires and interests outside of agriculture. The Irish farm, IR01, scored high on the “physiological and security needs” indicator. This indicator encompasses statements about the fulfilment of needs in the health and safety spheres occurring outside of agricultural work. Notably, these statements measured the ability to take care of health, household financial security, and access to public services and institutions. The data also showed that among the indicators used, the most differentiating farmer responses were the “physiological and security needs” indicators about work on the farm and life outside of it. The main reason for this result was the varied satisfaction with the economic situation of the farm and family. The second category that significantly differentiated responses was the “social needs” indicators. Farmers differed the most in this area in their feelings about combining family responsibilities with professional life and continuing work on the farm. Additional features influencing the variability of responses were worldviews and individual perceptions of the farm's operations.
These may notably explain the differences observed in the responses of the Irish farmers, who operated in similar conditions and yet differed most from each other in their views. However, larger research samples are needed to analyze the drivers of farm scores in greater detail and with statistical methods.

Table 4 Assessment of the social branch for each case study using the DEXi-Dairy model (FR: France, DE: Germany, IR: Ireland, NO: Norway). The gradient of color is distributed as follows: from dark red for a very low score to dark green for a very high score

3.2 Scope and limitations of the DEXi-Dairy model

The framework proposed by DEXi-Dairy focuses on specialized DPSs, with the main goal of assessing sustainability performance and potential GHG mitigation options. To be applied to other production systems, DEXi-Dairy would need to be adapted to reflect system specificities. DEXi-Dairy can be used as a model to support sustainability assessments and design cycles, as defined by Lopez-Ridaura (2005). It can also be used for the benchmarking of DPSs. The final users of DEXi-Dairy can be researchers, policymakers, agricultural advisors, and farmers, notably to provide an ex post assessment prior to the implementation of GHG mitigation options. The model can then be implemented to monitor the consequences of implementing mitigation options on the three sustainability dimensions. Nevertheless, in terms of complexity, DEXi-Dairy can be categorized as a level-three sustainability assessment tool (Coteur et al. 2020), i.e., an expert-based assessment tool (De Olde et al. 2016). This is due to the time-consuming data collection, as well as the presence of LCA-based indicators in the environmental branch. Despite this complexity, DEXi-Dairy can be a useful, visual tool to spark dialogue, thanks to its different aggregation levels. The overall sustainability score can be attractive for farmers and advisors when comparing systems or implementing a new option. The model can give a straightforward indication of whether the farm is overall sustainable by reducing dimensionality issues associated with wide sets of indicators. It also provides farm-specific information about synergies and trade-offs among sustainability aspects, notably when deconstructing the overall sustainability score into its lower levels (Angevin et al. 2017).

Nevertheless, aggregation is widely criticized in the literature (Carpani et al. 2012; Bockstaller et al. 2017) due to the possibility of compensation phenomena among sub-themes and themes. Thus, DEXi-Dairy should be used with caution. As explained by Binder et al. (2012), the single score, without consideration of underlying levels, can lead to simplistic solutions. For this reason, aggregation should be seen as a useful way of clarifying and reducing the complexity of farm sustainability, but comparison among indicators should remain the most important part of the assessment (Chopin et al. 2021). For example, a medium overall sustainability score does not necessarily mean that the analyzed DPS has no large adverse impacts on or within a dimension; indeed, this farm could score very low in some sustainability aspects. Hence, we recommend presenting the DEXi-Dairy results in their entirety, i.e., overall sustainability score and performance of each theme, sub-theme, and indicator, as previously underlined by Bélanger et al. (2012).

Compensation among sustainability aspects can be associated with a lack of model sensitivity. For instance, the probability of getting a medium value is high when several decision rules give average classes, as underlined by our study, as well as Craheix et al. (2015) and De Olde et al. (2016). Furthermore, a lack of sensitivity at the final aggregation step may also come from the qualitative approach of DEXi and the choice of threshold values. Bockstaller et al. (2017) presented fuzzy logic as an alternative approach to mitigate this threshold effect. Nevertheless, this alternative is not yet totally operational.
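The tendency of aggregated scores to concentrate in the central classes can be illustrated by enumerating all combinations of three dimension scores. The sketch below is illustrative only: it assumes a seven-class ordinal scale and replaces the DEXi decision rules with an equally weighted mean of ranks rounded to the nearest class, so the counts describe this simplified rule and not the DEXi-Dairy model itself.

```python
import math
from collections import Counter
from itertools import product

# Assumed ordinal scale, from lowest to highest class.
SCALE = ["very low", "low", "low to medium", "medium",
         "medium to high", "high", "very high"]


def aggregate_ranks(ranks):
    """Equally weighted mean of ranks, rounded half up to the nearest class."""
    return math.floor(sum(ranks) / len(ranks) + 0.5)


# Enumerate all 7 * 7 * 7 = 343 combinations of three dimension ranks.
counts = Counter(
    SCALE[aggregate_ranks(combo)]
    for combo in product(range(len(SCALE)), repeat=3)
)

for label in SCALE:
    print(f"{label:15s} {counts[label]:4d}")
```

Under this rule, 109 of the 343 combinations end up as "medium", whereas only 4 reach "very low" and 4 reach "very high", which is one way to picture why extreme overall scores are rare after aggregation.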

In terms of limitations on the DEXi-Dairy structure itself, it is important to note that the environmental branch (Table 2) was divided into four themes, i.e., “adopting best dairy herd management practices,” “environmental quality,” “abiotic resources conservation,” and “biodiversity conservation.” Since one of the main environmental goals assigned to DEXi-Dairy was to be sensitive to changes in management practices and then to foster reduced GHG emissions from DPSs, the themes “adopting best dairy herd management practices” and “environmental quality” were given weighting factors around 30%, whereas “abiotic resources conservation” and “biodiversity conservation” were assigned factors of 17 and 20%, respectively (Fig. 3). Nevertheless, the theme “adopting best dairy herd management practices” is related to global warming potential, as is the sub-theme “contribution to climate change.” This implies that the effective weight of global warming potential is 56 % in the environmental dimension. This is in line with the GHG goal of DEXi-Dairy but could be considered a limitation for application in other contexts without further weighting adjustments. A critical point is that the theme “adopting best dairy herd management practices” consists of causal indicators (Payraudeau and van der Werf 2005), while the theme “environmental quality” is based on effect indicators. As for the theme “abiotic resources conservation,” it encompasses both effect and causal indicators. This can introduce some discrepancies among impacts. One solution to avoid discrepancies in studies with other overall objectives could be to attribute a weighting factor equal to 0% to these causal indicators. In this way, these indicators could remain in the tree as an explanation for farmers or to alert advisors, while having no impact on the overall sustainability score.
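How the weight of one aspect accumulates across branches can be sketched by multiplying local weights along each path of the tree and summing the leaves related to that aspect. The tree, the names, and all weights below are hypothetical and chosen only for illustration; they are not the DEXi-Dairy weights shown in Fig. 3.

```python
def effective_weights(tree, parent_weight=1.0):
    """Return {leaf name: effective weight}, multiplying local weights down the tree.

    `tree` maps a node name to (local_weight, children), where children is
    either None (leaf) or another mapping of the same form.
    """
    result = {}
    for name, (weight, children) in tree.items():
        w = parent_weight * weight
        if children is None:
            result[name] = w
        else:
            result.update(effective_weights(children, w))
    return result


# Hypothetical branch with two themes; local weights sum to 1 at each level.
hypothetical_branch = {
    "theme_1": (0.4, {"indicator_1a": (0.5, None), "indicator_1b": (0.5, None)}),
    "theme_2": (0.6, {"indicator_2a": (0.25, None), "indicator_2b": (0.75, None)}),
}

weights = effective_weights(hypothetical_branch)

# Hypothetical set of leaves that all relate to the same aspect.
related_to_aspect = {"indicator_1a", "indicator_1b", "indicator_2a"}
aspect_weight = sum(weights[name] for name in related_to_aspect)
```

In this made-up tree, the aspect carries 0.4 + 0.6 * 0.25 = 0.55 of the branch even though no single node is given that weight, which is the kind of accumulation described above.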

More generally, it is important to acknowledge that some practical choices in model development can influence the sustainability scores obtained. The previous paragraph provides an example of how the combined choice of indicators, tree structure, and weighting factors can affect the relative importance given to certain sustainability aspects in the assessment. Indeed, as indicators, sub-themes, themes, and dimensions are weighted against each other, a relatively higher weight for one of the attributes ultimately implies a lower importance elsewhere. Similarly, while DEXi-Dairy gathered 40 indicators, choices had to be made as to which indicators had to and could be included and how to organize and calculate them. Furthermore, the selection of indicators and their threshold values was a particularly delicate stage since the perception of sustainability is context sensitive and thus not universal (Lebacq et al. 2013). This perception can be affected by factors such as advancement in science, socio-economic and political environments, and culture and hence is constantly evolving (Eckert and Kovalevska 2021). A more precise definition and investigation of the sustainability concept is only possible within social science discourse. Sustainable development emphasizes the diversity of social development paths, depending on their specific cultural or political contexts (Vallance et al. 2011; Sachs 1999). It is also considered essential to study the influence of worldviews and social processes as giving meaning to “sustainability” (Becker and Jahn 1999). In this context, it is difficult to free models and analyses from any possible bias. In our study, efforts were made to provide a comprehensive sustainability picture and compare farms of different conditions and sizes without pre-determining the “winner” and favoring certain systems over others.
As a result, the DEXi-Dairy model provides a basis to be expanded in future work as factors such as values, norms, political priorities, and production and market conditions evolve.

3.3 Challenges and recommendations for practitioners and future research

Developing a method to assess the sustainability of DPSs is a challenging process, which can raise some issues. Sustainability assessments require a multidisciplinary framework that brings together representatives of different scientific fields. This study gathered animal scientists, agronomists, economists, environmental scientists, LCA experts, and sociologists from several countries. The first challenge was to find a common scientific language and definitions among the various disciplines. For instance, one of the main issues was to find a common definition of social sustainability and then to characterize it through the design of social indicators. Indeed, many authors highlighted that measuring farm social sustainability is challenging due to complexity in defining the term, as well as in quantifying its different components (Janker et al. 2019; Lebacq et al. 2013). Unlike the environmental and economic dimensions, for which generic sustainability assessment frameworks can effectively cover context-specific issues (Gasso et al. 2015), scientific discourses fail to agree on what constitutes social sustainability and how it can be accurately and exhaustively depicted through standard sustainability assessment tools (Brennan et al. 2021; Gaviglio et al. 2016; Janker and Mann 2020). The current state of the art is mainly criticized because social sustainability measurements often lack theoretical depth, notably because of the use of vague conceptual frameworks (Boström 2012; Janker and Mann 2020). However, the sustainability of agricultural systems in general cannot be explored without a deeper consideration of social issues revolving around agricultural production (Janker et al. 2019). In this study, our social assessment aimed at overcoming common issues pointed out in the literature by integrating the standard DEXi multicriteria assessment tool with a theoretical framework. 
It took a needs-based approach and put it forward as fundamental to social sustainability both in the farmer’s work and in his/her private life. In this way, social sustainability was measured with indexes of related indicator categories derived from literature and completed qualitative research. Thus, the opinions of farmers were key to the social dimension assessment, while the environmental and economic dimensions were based on scientific knowledge only. This explains why the social branch of DEXi-Dairy differed somewhat from our environmental and economic branches and, more broadly, from other social sustainability assessments previously proposed in the literature (Ramos Huarachi et al. 2020; Janker and Mann 2020). For example, interviewed farmers did not bring up animal welfare considerations, which led to their exclusion from the farmer-centered DEXi-Dairy social indicators.

Another challenge when bringing together different scientific fields was finding the right balance between the number of iterations needed to develop and refine the DEXi-Dairy tree, the final number of indicators, and the management of the project timeline. Indeed, the data collection on case study farms was directly linked to the selected indicators and the parameters needed to calculate them. This collaborative process took time. Specifically, eighteen meetings (Fig. 2) were needed to obtain a tree configuration that responded to study objectives, had a reasonable number of indicators, and could be technically implemented in terms of indicator calculations (e.g., availability of the data and reference values for the threshold definition). Furthermore, the wide diversity of farm and farmer characteristics encountered in the case studies (from Norway to Germany) increased the complexity of finding common threshold values in a European context and then of defining scales for each indicator. These aspects should not be neglected in future projects aiming to develop a sustainability assessment tool based on the DEX framework.

Data were collected through templates specially designed to be as detailed as possible and adapted to the different DPSs across Europe. Since LCA is a method adopted by the European Commission (2018c) and the FAO (2016) as the reference method for environmental assessment, most of the indicators in the environmental branch of this study were based on LCA or an LCI developed as part of an LCA. In their review, Curran et al. (2016) underlined a lack of consensus among the models available for LCA in the literature. Crenna et al. (2020) pointed out that existing measures of biodiversity in life cycle impact assessment methods do not capture the complexities of biodiversity and are not fully operational for use by LCA practitioners. Even though some methods perform better than others to assess biodiversity, Damiani et al. (2023) highlighted that no current method captures the whole complexity of biodiversity in LCA and beyond. The integration of biodiversity in farm sustainability assessments remains a challenging task, notably due to lack of consensus in the literature, practicality issues, and time and cost constraints (Damiani et al. 2023).

Perrin et al. (2022) underlined that the environmental assessment by LCA at farm scale should consider at least the non-productive areas of the farm. Thus, in DEXi-Dairy, an effort was made to develop some easy-to-handle biodiversity indicators since LCA did not (yet) cover this aspect. The DEXi-Dairy framework with the indicators included in the sub-theme “agroecosystem Biodiversity” attempted to overcome these limitations of the LCA method. The use of LCA/LCI and the combination with other calculation methods meant that a large level of detail had to be collected on each case study farm. Thus, the issues related to the data collection in DEXi-Dairy are the length and complexity of data collection templates (Baillet et al. 2022b). This complexity had two major consequences. First, difficulties in translating the templates into a farmer-friendly language and in filling out the templates based on farmers’ responses were encountered by all project partners. Second, a large amount of time was needed to check all the data before starting the assessment process. Moreover, the economic and social assessments implied the need to collect some sensitive information (e.g., related to local communities, private life, regularity in payments, debts) that is not always easily shared by farmers. This aspect was already underlined by Constantini et al. (2022). Finally, it is necessary to underline the complexity of interpreting DEXi models: because they present aggregated information, discussion of the case study results is complicated by compensation along the aggregation process and, sometimes, by a lack of sensitivity. A future adaptation of DEXi-Dairy would be to dispense with thresholds by applying the fuzzy logic proposed by Bockstaller et al. (2017), when the method becomes fully operational.

Finally, in our study, sustainability performance was measured on a spectrum, from (very) low to (very) high. We did not adopt an on/off approach nor set threshold values under which farms were considered unsustainable. This is because the literature does not provide a clear indication of what is (un)sustainable, notably due to the heterogeneity in production systems, socio-economic contexts, and agroclimatic conditions. Moreover, our view is that a sustainability gradient is more informative for end-users (such as advisors and farmers) than dichotomous scores. Future work could investigate the possibility of implementing a clear-cut, binary strategy to distinguish farms from each other. However, caution will be needed to study comparable farms (e.g., operating under very similar conditions and in the same region), determine suitable threshold values, and decide whether this strategy should be implemented partially or throughout the whole sustainability tree.

4 Conclusion

In this article, we introduced the DEXi-Dairy framework as a qualitative method to assess the sustainability of dairy production systems across environmental, economic, and social dimensions in an integrated manner. We developed the concept into a tool to be applied at the farm level. We created transparent guidelines for assessing dairy farm sustainability in a European context. Successful application on a test set of seven individual farms representing very different farming conditions proved the feasibility of the approach.

As shown by the results, the wide range of existing farming conditions (e.g., climate, technological advancement, farm size, economic performance, farming tradition) influenced the levels of sustainability reached in case study farms. In several cases, individual sustainability scores in the three sustainability dimensions for a given farm differed widely, indicating trade-offs in their systems. However, we also saw clear cases where a farm was performing exceptionally well, or poorly, in all dimensions. While the small number of farms investigated does not allow for quantitative conclusions, this observation still indicates that some farms do generally well in terms of sustainability performance. Thus, these farms may provide a more general benchmark of optimized conditions.

This multicriteria sustainability assessment not only establishes a solid basis to compare farm sustainability performance but may also serve as a basis to inform different stakeholders about sustainable DPSs. This may lead to better recommendations for policymakers on reducing GHG emissions, while taking into account parameters other than those of traditional environmental assessments. The challenge of data collection from individual farms, however, could limit the general accessibility of the method. A larger dataset and the inclusion of panel data analysis may provide more insights. Further development of DEXi-Dairy towards using more widely available (statistical) data may be useful in this regard.