Assessment of agricultural sustainability in European Union countries: a group-based multivariate trajectory approach

Sustainability of agriculture is difficult to measure and assess because it is a multidimensional concept that involves economic, social and environmental aspects and is subjected to temporal evolution and geographical differences. Existing studies assessing agricultural sustainability in the European Union (EU) are affected by several shortcomings that limit their relevance for policy makers. Specifically, most of them focus on farm level or cover a small set of countries, and the few exceptions covering a broad set of countries consider only a subset of the sustainable dimensions or rely on cross-sectional data. In this paper, we consider yearly data on 12 indicators (5 for the economic, 3 for the social and 4 for the environmental dimension) measured on 26 EU countries in the period 2004–2018 (15 years), and apply group-based multivariate trajectory modeling to identify groups of countries with common trends of sustainable objectives. An expectation-maximization algorithm is proposed to perform maximum likelihood estimation from incomplete data without relying on an explicit imputation procedure. Our results highlight three groups of countries with distinguished strong and weak sustainable objectives. Strong objectives common to all the three groups include improvement of productivity, increase of personal income in rural areas, reduction of poverty in rural areas, increase of production of renewable energy, rise of organic farming and reduction of nitrogen balance. Instead, enhancement of manager turnover and reduction of greenhouse gas emissions are weak objectives common to all the three groups of countries. Our findings represent a valuable resource to formulate new schemes for the attribution of subsidies within the Common Agricultural Policy (CAP).


3 Introduction
The concept of sustainability is increasingly prominent in current agricultural policy debates. After the first definition provided by the Brundtland Report in the late 1980s (WECD 1987), where a sustainable system meets the needs of the present without compromising the ability of future generations to meet their own needs, the term sustainability has often been used to qualify actions aimed at reducing the impact of human activities on the natural environment. However, the agricultural sector has several other important functions besides the conservation of the ecosystem, for instance the provision of safety and healthy food, and the improvement of socio-economic conditions in rural areas. Therefore, agricultural sustainability is a multidimensional concept involving economy, environment and society (Hayati et al. 2010; Food and Agriculture Organization 2014; Antle and Ray 2020). More precisely, it can be identified with a harmonious interconnection between the efficient production of goods and services (economic function), the management of natural resources (environmental function) and the improvement of conditions in rural areas (social function). The European Union (EU) has paid great attention to the principle of sustainable agriculture, which was integrated into the objectives of the Common Agricultural Policy (CAP) and has found a significant place in the EU scientific research program Horizon 2020 (European Commission 2011), mostly through the concept of bioeconomy. Also, agricultural sustainability is a core theme in the 2030 agenda for the Sustainable Development Goals (SDGs) of the United Nations (UN General Assembly 2015).
The recent research on sustainability of EU agriculture involves the measurement of the sustainable dimensions through sets of indicators (see Latruffe et al. 2016 for a review). However, this task is made difficult by the lack of a unified theoretical framework guiding the selection of indicators (Gennari and Navarro 2019) and by practical issues concerning availability, quality and comparability of data (Nowak et al. 2019). Most existing studies focus on farm level (Gómez-Limón and Sanchez-Fernandez 2010;Majewski 2013;Ryan et al. 2016;Gaviglio et al. 2017) or cover a small set of countries (Öhlund et al. 2015;Radovanović and Lior 2017;Mili and Martínez-Vega 2019;Czubak and Pawlowski 2020), and the few exceptions covering a broad set of countries consider only a subset of the sustainable dimensions (Cristache et al. 2018;Czyzewski et al. 2020) or rely on cross-sectional data (Nowak et al. 2019;Cataldo et al. 2020). Focusing on farm level or on a restricted set of countries precludes to draw an exhaustive picture of agricultural sustainability in the EU. Instead, disregarding or considering only partially some sustainable dimensions prevents to understand agricultural sustainability in all of its relevant aspects. Furthermore, relying on cross-sectional data allows to assess the level of sustainability at a specific instant without accounting for its temporal evolution (trend). We believe that, in the assessment of sustainability, the temporal evolution is more relevant than the level, because it informs whether and to which extent the analyzed economies are pursuing sustainable objectives. Nevertheless, collecting longitudinal data at the purpose of assessing the trend of sustainability is challenging due to issues of availability and quality of publicly released statistics. In fact, some indicators may be unavailable, or their time series may have short length, present structural breaks or contain missing values.
In this paper, we consider yearly data on 12 indicators (5 for the economic, 3 for the social and 4 for the environmental dimension) measured on 26 EU countries in the period 2004-2018 (15 years) and apply group-based multivariate trajectory modeling to identify groups of countries with common trends of sustainable objectives. As such, our research overcomes the main limitations of existing studies and provides a detailed monitor of agricultural sustainability in EU countries in support of policy makers. This paper is structured as follows. In Sect. 2, the selection of indicators and the data collection process are described. In Sect. 3, the methodology is detailed. In Sect. 4, the results are reported and discussed. Section 5 contains concluding remarks and purposes for future work.

Selection of indicators and data collection
In this study, the level of agricultural sustainability in EU countries is measured through a set of indicators covering the three sustainable dimensions, i.e., economic, social and environmental dimensions. The selection of indicators was based on theory and guidelines available in the literature on the assessment of agricultural sustainability (Hayati et al. 2010;Food andAgriculture Organization 2013, 2014;Latruffe et al. 2016;Antle and Ray 2020), and data collection relied on publicly available statistics from the Common Monitoring and Evaluation Framework (CMEF, European Commission 2020), Eurostat (European Commission 2022) and Faostat (Food and Agriculture Organization 2022). We selected a set of indicators and a temporal window as large as possible balancing representativeness of the three sustainable dimensions and availability of time series data. In the data collection process, we tolerated at most one third of missing values for each time series, with no more than two consecutive missing values internal to the time series and no more than two missing values at each extreme.
The resulting dataset comprises 12 indicators: 5 for the economic, 3 for the social and 4 for the environmental dimension, measured yearly on 26 EU countries in the period 2004-2018 (15 years). The selected countries include Austria, Belgium, Bulgaria, Cyprus, Czechia, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, Netherlands, Poland, Portugal, Romania, Slovakia, Slovenia, Spain, Sweden, United Kingdom. We excluded Croatia and Malta due to a high occurrence of missing values. Table 1 provides a brief description and the data source for the selected indicators, while a detailed description is contained in Sects. 2.1, 2.2 and 2.3. Table 2 reports the number of missing values by type for each country and indicator. Three different types of missing values are distinguished: internal to the time series (type A), at the beginning of the time series (type B), and at the end of the time series (type C). Basically, the most problematic indicators are the ratio young/elderly for farm managers ( X ECO,3 ) and the production of renewable energy from agriculture ( X ENV,1 ), because their time series include a  A first solution to obtain a complete dataset is to drop the indicators X ECO,3 , X ENV,1 and X ENV,4 , and to let the period of analysis start in 2006 rather than in 2004. However, it is quite undesirable to drop some indicators, because each one provides a valuable and non-substitutable information on sustainable dimensions, and the loss of two time points would shorten significantly the sample size. Thus, imputation of missing values appears a better option. On this point, it should be noted that the three types of missing values are ordered increasingly based on the difficulty of imputation: type A requires interpolation, type B requires forward extrapolation, and type C requires backward extrapolation. Multiple imputation (Rubin 1987) is a widely employed imputation method able to take into account the distribution of missing values in order to correctly assess the uncertainty of the estimated model. Although simple to implement, multiple imputation is often computationally expensive, thus we prefer a third option consisting of performing parameter estimation through the Expectation-Maximization (EM) algorithm (Dempster et al. 1977), which carries out an implicit imputation of missing values. The EM algorithm has proven to be an effective estimation procedure for a broad range of statistical models in case of incomplete data, due to its ability to maximize the expected likelihood with respect to the distribution of missing values. The EM algorithm assumes that data are Missing At Random (MAR), i.e., the probability for a datum to be missing does not depend on its value but only on the observed values (Rubin 1976). Therefore, any missing value for a given statistical unit can be inferred based on the pattern of observed values for that unit. In our case, the MAR assumption allows missing values to be due to (random) measurement errors or loss of data, but excludes them to arise from the refusal of a country to provide the required information. This is a quite reasonable hypothesis considering that the EU international law lays down the obligation for member countries to provide national data according to a precise timetable. Table 3 reports summary statistics for the selected indicators in some different periods. The dataset and the graphics of the time series by country are available as supplementary materials. In overall, it can be seen that most indicators have an average annual change in the period 2004-2018 consistent with sustainability. Indicators with the highest average annual change are the real income of agricultural factors per paid annual work unit ( X ECO,4 , +2.17%), the median equivalised net income in rural areas ( X SOC,1 , +3.74%), the production of renewable energy from agriculture ( X ENV,1 , +19.91%) and the area under organic cultivation ( X ENV,2 , +7.47%). The huge growth rate of indicator X ENV,1 is explainable by the compliance with the 2020 renewable energy targets, according to which the EU has committed to obtain 20% of its energy from renewable sources by 2020. The only indicator with average annual change in 2004-2018 not consistent with sustainability is the ratio young/elderly for farm managers ( X ECO,3 , −3.14%).

Selected economic indicators
The selected economic indicators cover the following objectives: improvement of productivity, maintenance of man-made capital, enhancement of manager turnover and improvement of profitability. Productivity represents the ability of the agricultural sector to efficiently allocate resources in the production of goods and services. For its measurement, we considered the Total Factor Productivity (TFP) index of agriculture with base year 2005 ( X ECO,1 ) sourced from the Common Monitoring and Evaluation Framework (CMEF, European Commission 2020).
The man-made capital of the agricultural sector is represented by capital investments in agriculture. In this study, they are measured by the ratio of net capital stocks to gross value added ( X ECO,2 , source: Faostat). Net capital stocks are preferred to gross fixed capital formation, because they are net of depreciation and indicate the actual value of the capital. The division by the gross value added serves to make the value of the capital comparable across countries.
Manager turnover indicates the propensity of the agricultural sector to renewal the entrepreneurial class in favor of the new generations. Higher levels of manager turnover are typically associated to more dynamic and more profitable entrepreneurial contexts. For the measurement of manager turnover, we considered the ratio young/elderly for farm managers ( X ECO,3 , source: CMEF), where young managers are less than 25-years-old, and elderly managers are more than 55-years-old.
Profitability represents the ability of the agricultural sector to remunerate the production factors (land, labour, capital and entrepreneurship). In this study, it is measured by the real income of agricultural factors per paid annual work unit ( X ECO,4 ) and by the net entrepreneurial income of agriculture per unpaid annual work unit ( X ECO,5 ), both computed by Eurostat as indices with base year 2010.

Selected social indicators
The selected social indicators cover the objective of reducing inequality and poverty in rural areas. They include the median equivalised net income in rural areas ( X SOC,1 ), the at-risk-of-poverty rate in rural areas ( X SOC,2 ) and the unemployment rate in rural areas ( X SOC,3 ), all sourced from Eurostat. The provision of food of good quality is also an objective of the social dimension of sustainability. Unfortunately, there are no hard data (i.e., objectively measured) available on food quality, but our study partially covers this topic through the indicator X ENV,1 (area under organic cultivation), which is employed to measure the environmental dimension of sustainability (see Sect. 2.3).
Further objectives of the social dimension of sustainability like health, working conditions and accessibility to food are not considered in this study. This is a shortcoming independent of our effort because, as confirmed by the review in Latruffe et al. (2016), it is difficult to find hard data of good quality covering the social dimension of sustainability, and, to our knowledge, the selected indicators are the best publicly available measures for EU countries.

Selected environmental indicators
The selected environmental indicators cover two objectives: rise of agricultural practices favoring a positive development of the natural environment and reduction of pressures on the natural environment due to agriculture.
In order to measure the spread of agricultural practices favoring a positive development of the natural environment, we employed two indicators: production of renewable energy from agriculture ( X ENV,1 , source: CMEF) and area under organic cultivation ( X ENV,2 , source: Faostat). The production of renewable energy is at the basis of an environmentally sustainable development, and it is also an important topic in the EU agenda. In fact, based on the 2020 renewable energy targets, the EU has committed to obtain 20% of its energy from renewable sources by 2020. Organic cultivation concerns the use of fertilizers of organic origin, and it is a practice internationally regulated and legally enforced by many countries in the world since the 21st century. Currently, organic cultivation is considered as a positive determinant of environmental conservation and quality of food, thus it also affects the social dimension of sustainability.
In order to measure the pressures on the natural environment due to agriculture, we employed two indicators: greenhouse gas emissions due to agriculture per hectare ( X ENV,1 , source: Faostat) and gross nitrogen balance per hectare ( X ENV,4 , source: Eurostat). Unfortunately, the measurement of the nutrient balance was limited to nitrogen because the time series of the other available nutrients (phosphorus and potassium) contain a large number of missing values. Also, further pressures like soil erosion, deforestation and loss of biodiversity are not considered in our study due to unavailability of data.

Methodology
The methodology employed in this study borrows from group-based trajectory modeling (Nagin 2005), a specialization of finite mixture modeling for longitudinal data where the units in the same group have the same trajectory. In group-based trajectory modeling, the units are assigned to groups based on a single indicator with trajectory defined by a polynomial regression on time within each group. Recently, group-based trajectory modeling was extended to the case of multiple indicators by replacing the univariate polynomial regression within each group with a multivariate one (Nagin et al. 2018). We refer to this methodology as Group-Based Multivariate Trajectory (GBMT) modeling.
The formulation of a GBMT model is provided in Sect. 3.1. In Sect. 3.2, we propose an Expectation-Maximization (EM) algorithm for maximum likelihood estimation that is able to deal with missing values and multiple local maxima. In Sect. 3.3, prediction of trajectories is addressed. Finally, the procedure to select the optimal number of groups and polynomial order is described in Sect. 3.4. Finally, let j = 1, … , J denote the groups, and, for i = 1, … , n , let G i be a latent variable taking value j if unit i belongs to group j.

Group-based multivariate trajectory modeling
A GBMT model with polynomial degree d assumes that the multivariate observation of the indicators on unit i at time t is generated by one multivariate normal (MVN) polynomial regression for each group j = 1, … , J: where: (1) is the (d + 1) × K matrix of regression coefficients in group j, and j is the K × K covariance matrix of the indicators in group j. The likelihood of the model is: is the multivariate normal density of i,t in group j according to formula (2), and j is the prior probability of group j. Note that each indicator should have the same level across units in order to determine the groups entirely based on the trend. In fact, since group-based probability density functions are determined to minimize dissimilarity within groups, if the level dominates the sample variability, then the groups will be mainly determined by the level and will not necessarily be homogeneous in trend (Heggeseth and Jewell 2018).
We define the sample mean vector of unit i as the temporal mean of the multivariate time series of the indicators for unit i: If the sample mean vector is the same for all units, then each indicator has the same level across units. If this is not the case, the sample mean vector can be made constant by normalising the measurements within units. Subtracting the sample mean (centering) is a commonly adopted normalisation method, but it allows comparisons only among the measurements of the same indicator. In order to allow comparisons among the measurements of different indicators, standardization (centering and division by the sample standard deviation) is often applied. Division by the sample mean is a further normalisation method allowing comparisons among different indicators in case of strictly positive measurements, which can be considered as a raw measure of performance (Gursoy et al. 2013;Giusti and Grassini 2013). In this study, we adopt the logarithmic division by the sample mean: The normalisation in (6) has an interpretation similar to the division by the sample mean, in fact it approximates the relative change with respect to the average of the whole period of observation, but, in addition, it brings several valuable advantages to the estimation procedure: numerical stability and convergence are enhanced, and imputed values result positive.
It is important to remark that normalisation provides important benefits for both estimation and interpretation of finite mixture models, especially for multivariate ones, but it may modify the original covariance matrix, thus particular attention should be paid to check whether model assumptions on the dependence structure are supported by the data (Heggeseth and Jewell 2018).

Parameter estimation
For each group j = 1, … , J , the maximum likelihood estimate of parameters j , j and j is obtained through the EM algorithm detailed as Algorithm 1. The algorithm starts from random initial values for missing data and parameters. Afterward, the expectation and the maximization steps (E-step and M-step, respectively) are alternated until convergence of the likelihood (Dempster et al. 1977). The E-step consists of computing the posterior probability of each group j = 1, … , J for each unit i = 1, … , n: where the hat symbol above a parameter denotes the current estimate of that parameter. Instead, the M-step consists of obtaining the maximum likelihood estimate of parameters given posterior probabilities. The maximum likelihood estimate of the prior probability of group j is obtained by averaging the posterior probabilities of group j: Before estimating the other parameters, missing values should be imputed based on their conditional expectation given the observed values and the current estimate of parameters (MAR assumption).
be a partition of a generic multivariate observation into, respectively, the observed and the missing values. Also, let j,t and j be analogously partitioned as:

Assessment of agricultural sustainability in European Union…
It can be shown that the conditional expectation of missing values given that unit i belongs to group j is equal to: Therefore, missing values for each group j = 1, … , J are imputed using formula (10) where j,t and j are replaced by their respective current estimates ̂ j,t =̂ � j 1, t, t 2 , … , t d � and ̂ j . Afterward, the maximum likelihood estimate of parameters j and j for each group j = 1, … , J is obtained by fitting the regression in formula (2) through weighted least squares (Draper and Smith 1981, p. 85-96) with posterior probabilities of group j as weights: where: After convergence of Algorithm 1, each unit i = 1, … , n is assigned to the group with the highest posterior probability: Also, it is possible to compute the expectation of missing values given the observed ones: It is widely known that the likelihood of finite mixture models may have multiple local maxima possibly making the EM algorithm stuck at one of these, rather than to the global one. To overcome this issue, Algorithm 1 is run several times starting from different random initial values, and the estimate with the highest likelihood is retained.
Several software packages exist for the estimation of GBMT models, including traj in SAS (Jones et al. 2001) and Traj in STATA (Jones and Nagin 2013). These two packages employ quasi-Newton optimization (Dennis and Mei 1979;Dennis et al. 1981), implement multiple random restarts, and handle missing values

Prediction
Based on the regression in formula (2), prediction of the multivariate trajectories at time t in group j is obtained as: where ̂ j is the maximum likelihood estimate of j . We denote the predicted value of the k-th indicator at time t in group j, i.e., the k-th element of ̂ j,t , as ̂ j,t,k . The variance of the prediction ̂ j,t,k is equal to: where ̂ 2 j,k is the element in position (k, k) of ̂ j (the maximum likelihood estimate of j ), and matrices and are defined in formula (12). Therefore, (1 − ) ⋅ 100% pointwise prediction intervals for the k-th indicator in group j are obtained as: where t 1− 2 ,n−d−1 is the quantile of order 1 − 2 of a Student's t distribution with n − d − 1 degrees of freedom.
Prediction can also be performed for specific units. Let ŷ i,t,k be the predicted value of the k-th indicator at time t for unit i. It holds: Therefore, (1 − ) ⋅ 100% asymptotic pointwise prediction intervals for the k-th indicator relatively to unit i are obtained as: where z 1− 2 is the quantile of order 1 − 2 of the standard normal distribution.

Model selection
The main challenge in GBMT modeling is the selection of the optimal number of groups. To date, there is no one commonly accepted fit statistic to be exploited at this purpose. Several simulation studies have shown that the performance of currently available fit statistics is highly dependent on data-specific characteristics and that no one of them emerges as superior to the others (see the review in Van der Nest et al. 2020). Therefore, it is important to jointly consider several different fit statistics when establishing the adequate number of groups, bearing in mind that the objective of the model selection process is to summarize the most distinctive features of data through as less complexity as possible (Nagin 2005, p. 77). Fit statistics of common use include information criteria like AIC (Akaike 1974) and BIC (Schwarz 1978): where L is the likelihood, p is the total number of free parameters, and N = nT is the total number of observations. Lower values of an information criterion indicate a better balance between goodness of fit and complexity. AIC has the asymptotic property to select the model with the lowest mean squared error, but it is not a consistent criterion, i.e., it does not identify the true model based on a sample of infinite size. Instead, BIC is consistent but has the tendency to underestimate the optimal complexity in small samples. For these reasons, a correction has been proposed to make AIC consistent (CAIC, Bozdogan 1987) and to reduce the tendency of BIC to underestimate the number of groups (ssBIC, Sclove 1987): A further information criterion worthy of note is the HQIC (Hannan and Quinn 1979): which is consistent like BIC but with a higher convergence rate. Classification criteria represent another family of fit statistics commonly employed to assess the goodness of fit of finite mixture models. The most popular classification criterion is the Average Posterior Probability of Assignment (APPA, Nagin 2005): (22) HQIC = −2 log L + 2p ⋅ log log N where I(⋅) is the indicator function. The APPA value indicates the average posterior probability of the units assigned to each group, thus it can be interpreted as a measure of class separation. However, caution should be taken when interpreting high APPA values as high degree of class separation, because, when the sample size is small, classification criteria tend to take high values even if the degree of class separation is not high as well.
The selection of the GBMT model with the best fit to data requires to establish not only the number of groups, but also the polynomial degree for each indicator within each group. According to Nagin (2005, p. 66-67), the choice of the polynomial degree is less important than the choice of the number of groups. In this perspective, we adopt the following model selection procedure: (i) the optimal number of groups is established based on several fit statistics by keeping fixed a reasonably high polynomial degree (which can be set based on expert opinion, previous studies or visual inspection), then (ii) the best polynomial degree for each indicator within each group is determined through backward elimination of non-significant terms (pruning).

Results and discussion
We normalised the measurements within units as in formula (6) and used Algorithm 1 with 100 random restarts to estimate all the GBMT models with up to six groups ( J = 1, … , 6 ), keeping fixed the polynomial degree to three ( d = 3 ). The maximum number of groups equal to six was chosen in order to have a reasonable number of countries per group (with 26 countries and six groups, we expect four countries per group), while a polynomial degree equal to three, equating to a cubic trend, was deemed sufficient to achieve enough flexibility across 15 time points. The estimated models were then pruned through backward elimination of non-significant polynomial terms for each indicator within each group. The R package gbmt (Magrini 2022) was employed to perform the computations and to obtain the graphics shown in this section. Table 4 reports average number of EM iterations, log likelihood, number of free parameters and values of information criteria for each model after pruning. Models are ranked differently by information criteria, thus we selected the model with the best balance of ranks, which is definitely the one with J = 3 groups, as it is ranked first by BIC, ssBIC and HQIC and second by all the other criteria. APPA values are near one for all groups in all models but, due to the relatively small size of our sample ( n = 26 ), they may not indicate perfect class separation. The groups identified by the selected model ( J = 3 ) are: • Group 1: Austria, Finland, France, Hungary, Italy, Luxembourg, Portugal, Slovenia, Spain; 1 3 • Group 2: Bulgaria, Cyprus, Germany, Greece, Ireland, Latvia, Netherlands, Poland, Sweden, United Kingdom; • Group 3: Belgium, Czechia, Denmark, Estonia, Lithuania, Romania, Slovakia. Figure 1 shows the estimated group trajectories. Parameter estimates and group trajectories in separate graphics are provided in the Appendix as Table 6 and Figs. 3, 4 and 5. All the figures are also available in high resolution as supplementary materials.
It can be seen that, in the selected model, at least one polynomial degree is required in most cases, with only three cases where the trend is constant ( d = 0 ). The inspection of observed trajectories (grey broken lines) in relation to 95% pointwise prediction intervals (dotted curves) in Figs. 3, 4 and 5 points out, for all groups, very few outliers and a small amount of cases where the observed variability is definitely different than the expected one. The most severe inconsistency of this kind involves the production of renewable energy from agriculture ( X ENV,1 ), because the discrepancy between observed and expected variability increases definitely with the trend in all groups, suggesting the unreliability of the assumption of constant covariance matrix within each group. Instead, the other inconsistencies hold only for some groups, like it is the case of agricultural fixed capital ( X ECO,2 ) in Group 1, organic cultivation ( X ENV,2 ) in Group 2, net entrepreneurial income ( X ECO,5 ) and unemployment in rural areas ( X SOC,2 ) in Group 3. Nevertheless, the relatively small number of units in our study is unlikely to allow an effective check of model assumptions on the dependence structure and, above all, the specification of more complex (nonconstant) covariance matrices. Table 5 reports the average annual percentage changes for some different periods implied by group trajectories in the selected model ( J = 3 ). In Table 5, observed average annual percentage changes averaged across countries are shown within brackets to highlight the variability of trajectories within groups. Below, group trajectories are discussed in details.  (6)    Group 1 (Austria, Finland, France, Hungary, Italy, Luxembourg, Portugal, Slovenia, Spain) is characterized by trajectories satisfying sustainability for all indicators excepting the ratio young/elderly for farm managers ( X ECO,3 ), which has a definitely decreasing trend, and greenhouse gas emissions per hectare ( X ENV,3 ), which shows an upward inversion of the trend since 2011. It can be noted that the tendency of unemployment in rural areas ( X SOC,2 ) undergoes an upward inversion inconsistent with sustainability from 2006 to 2014 and that fixed capital stock ( X ECO,2 ) begins to decrease in compliance with sustainability only since 2012. Also, nitrogen balance per hectare ( X ENV,4 ) has a stable trend in the period 2004-2018. We can conclude that countries in Group 1 are characterized by partial achievement of economic and environmental sustainability and by full achievement of social sustainability. Specifically, the weak objectives are enhancement of manager turnover, maintenance of man-made capital and reduction of greenhouse gas emissions, while unemployment in rural areas ( X SOC,2 ) requires attention in the near future to ensure that its trend will not be inverted again. Group 2 (Bulgaria, Cyprus, Germany, Greece, Ireland, Latvia, Netherlands, Poland, Sweden, United Kingdom) is characterized by trajectories satisfying sustainability for all indicators excepting the ratio young/elderly for farm managers ( X ECO,3 ), which has a definitely decreasing tendency, and greenhouse gas emissions per hectare ( X ENV,3 ), which has a definitely increasing trend. It can be noted that, initially, the tendency of unemployment in rural areas ( X SOC,2 ) does not satisfy sustainability but it is inverted in 2013, and nitrogen balance per hectare ( X ENV,4 ) begins to increase slightly since 2015 after a decreasing pattern consistent with sustainability. Also, at-risk-of-poverty in rural areas ( X SOC,3 ) has a stable trend in the period 2004-2018. We can conclude that countries in Group 2 share some weak objectives with countries in Group 1 (i.e., enhancement of manager turnover and reduction of greenhouse gas emissions) but, in general, they are characterized by a stronger growth of sustainability, as shown by the highest absolute annual change for all indicators excepting X SOC,3 (see Table 5). Similarly to what was deduced for Group 1, unemployment in rural areas ( X SOC,2 ) requires attention in the near future to ensure that its trend will not be inverted again. In addition, nitrogen balance per hectare ( X ENV,4 ) should be monitored for the same reason.
Group 3 (Belgium, Czechia, Denmark, Estonia, Lithuania, Romania, Slovakia) is characterized by trajectories not satisfying sustainability for the ratio young/elderly for farm managers ( X ECO,3 ), which has a definitely decreasing tendency, and for the real income of factors ( X ECO,4 ) and the net entrepreneurial income ( X ECO,5 ), which show a downward inversion of the trend since 2013. It can be noted that, initially, the tendency of unemployment in rural areas ( X SOC,2 ) does not satisfy sustainability, but it is inverted in 2013. We can conclude that countries in Group 3 are characterized by full achievement of social and environmental sustainability and by partial achievement of economic sustainability, with enhancement of manager turnover and improvement of profitability as weak objectives. Similarly to what was deduced for Group 1 and Group 2, unemployment in rural areas ( X SOC,2 ) requires attention in the near future to ensure that its change in trend is permanent.
Our approach, besides identifying groups of countries with common trends of sustainable objectives, allows to detect strong and weak objectives for each group, as well as those indicators requiring attention in the near future. Strong objectives common to all the three groups include improvement of productivity, increase of personal income in rural areas, reduction of poverty in rural areas, increase of production of renewable energy, rise of organic farming and reduction of nitrogen balance. Instead, maintenance of man-made capital is a distinguished strong objective of Group 2 (Bulgaria, Cyprus, Germany, Greece, Ireland, Latvia, Netherlands, Poland, Sweden, United Kingdom) and Group 3 (Belgium, Czechia, Denmark, Estonia, Lithuania, Romania, Slovakia), while improvement of profitability is a distinguished strong objective of Group 1 (Austria, Finland, France, Hungary, Italy, Luxembourg, Portugal, Slovenia, Spain) and Group 2 (Bulgaria,Cyprus,Germany,Greece,Ireland,Latvia,Netherlands,Poland,Sweden,United Kingdom). Enhancement of manager turnover is a weak objective common to all the three groups of countries, while reduction of greenhouse gas emissions is a weak objective common to Group 1 (Austria, Finland, France, Hungary, Italy, Luxembourg, Portugal, Slovenia, Spain) and Group 2 (Bulgaria, Cyprus, Germany, Greece, Ireland, Latvia, Netherlands, Poland, Sweden, United Kingdom). Remarkably, greenhouse gas emissions per hectare ( X ENV,3 ) has a stable trend in Group 3 (Belgium, Czechia, Denmark, Estonia, Lithuania, Romania, Slovakia), thus reduction of greenhouse gas emissions is not a strong objective of Group 3 as well. Furthermore, maintenance of man-made capital is a distinguished weak objective of Group 1 (Austria, Finland, France, Hungary, Italy, Luxembourg, Portugal, Slovenia, Spain), while improvement of profitability is a distinguished weak objective of Group 3 (Belgium, Czechia, Denmark, Estonia, Lithuania, Romania, Slovakia). It is worth noting that, in all groups of countries, the trend of unemployment in rural areas ( X SOC,2 ) has begun to satisfy sustainability only in recent years, thus it requires attention in the near future to ensure the persistence of this tendency.
Our findings cannot be properly compared with those of existing studies because our work is the first one in the literature performing a longitudinal assessment on all the three sustainable dimensions for an exhaustive number of EU countries and providing a classification of EU countries based on common trends of sustainability. Among existing studies, the most suited ones for a comparison with our work are Nowak et al. (2019) and Cataldo et al. (2020), because all the three sustainable dimensions and a large number of countries are considered. Unfortunately, these two studies employ cross-sectional data, thus they can rank EU countries only based on the level of sustainability, rather than based on the trend like our study does.
Finally, it is interesting to check the consistency of our results with subsidies attributed by the Common Agricultural Policy (CAP). At this purpose, we downloaded the data on CAP subsidies from the Farm Accountancy Data Network (FADN, European Commission 2020), and computed the distribution of relative changes with respect to country averages for each time point and group (see Fig. 2). It can be seen that the three groups of countries are characterized by distinctive trends of CAP subsidies: in Group 1, they have a definite increasing tendency; in Group 2, they have a stable pattern since 2009 after an initial increase; in Group 3, they have an increasing tendency until 2009, then decreasing until 2015 and increasing again afterward. Also, it is apparent that Group 1 is characterized by the highest growth rate of CAP subsidies, followed by Group 2 and Group 1. Remarkably, this finding is coherent with the average annual changes implied by the estimated group trajectories (see Table 5), where Group 1 is distinguished by the lowest absolute annual change for all indicators excepting X ENV,2 , and Group 3 is characterized by a higher absolute annual change than Group 2 for most indicators. Unfortunately, it is not possible to disentangle available data on CAP subsidies to match the sustainable objectives considered in our study, thus our results cannot be exploited to assess the effectiveness of the CAP. Nevertheless, they represent a valuable resource to formulate new schemes for the attribution of CAP subsidies at the purpose of improving specific weak sustainable objectives of EU member countries. Specifically, our group trajectories emphasize the existence of a trend of the ratio young/elderly for farm managers and of greenhouse gas emissions not consistent with sustainability for all groups of countries, thus we recommend to establish more subsidies for those interventions favoring the turnover of farm managers and preventing pollution due to agriculture.

Concluding remarks
We have proposed the application of group-based multivariate trajectory modeling to the assessment of agricultural sustainability in EU countries. Our research is innovative with respect to existing studies for three main reasons: (i) we selected a broad set of indicators to cover all the three sustainable dimensions (economic, social and environmental dimensions), (ii) we considered a comprehensive set of countries (26 in total) to provide a wide picture of agricultural sustainability in the EU, (iii) we collected data over a long period (15 years) to explore the temporal evolution of agricultural sustainability.
Our approach, besides identifying groups of countries with common trends of sustainable objectives, allows to detect strong and weak objectives for each group, as well as those indicators requiring attention in the near future. Therefore, our results provide a detailed monitor of agricultural sustainability in the EU that can assist policy makers in understanding the dynamics of the various countries and in detecting their specific deficiencies. As emphasized in the discussion of our results, this is of particular relevance to formulate effective schemes for the attribution of CAP subsidies.
Quality and availability of data is a critical issue affecting multidimensional assessments because publicly available time series may have short length, present structural breaks or contain missing values. In our study, we paid particular care to keep the occurrence of missing values under a reasonable threshold, and exploited the expectation-maximization algorithm to perform maximum likelihood estimation from incomplete data without relying on an explicit imputation procedure.
The choice of the normalisation method to apply is another critical issue addressed in our work. Normalising the measurements in finite mixture models serves to determine the groups entirely based on the trend and to allow comparisons among different indicators. We adopted the logarithmic ratio with respect to the average of the whole period of observation, which provides not only a meaningful interpretation, but also computational advantages for parameter estimation.
The main limitation of our work relies in the relatively small number of units (26 countries), which prevented an effective check of the assumption of constant covariance matrix within each group and, above all, the specification of more complex dependence structures. In the future, we plan to identify the sources of the observed inconsistencies with model assumptions (e.g., quality of data, heteroskedasticity, unreliability of the MAR assumption), in view of an effective refinement of model formulation.
Although a valuable feature of our approach is that the various indicators can be monitored separately, future work will be directed toward the combination of groupbased multivariate trajectory modeling and the construction of composite indicators, at the purpose of monitoring a synthetic representation of the three sustainable dimensions. Table 6 shows parameter estimates for the selected model ( J = 3 ), while Figs. 3, 4 and 5 display the estimated group trajectories. Table 6 Estimated parameter values for the selected model ( J = 3 ). Estimated standard errors are shown within brackets. Blank spaces indicate that the term was dropped by the pruning procedure Indicator̂ j,0,k̂ j,1,k̂ j,2,k̂ j,3,k Group 1 ( ̂ 1 = 0.346) Trajectories of countries in Group 2 (Bulgaria, Cyprus, Germany, Greece, Ireland, Latvia, Netherlands, Poland, Sweden, United Kingdom). Values represent relative changes with respect to country averages. The estimated group trajectory is shown in bold, with dotted curves indicating 95% pointwise prediction intervals

Fig. 5
Trajectories for countries in Group 3 (Belgium, Czechia, Denmark, Estonia, Lithuania, Romania, Slovakia). Values represent relative changes with respect to country averages. The estimated group trajectory is shown in bold, with dotted curves indicating 95% pointwise prediction intervals