A flexible Bayesian tool for CoDa mixed models: logistic-normal distribution with Dirichlet covariance

Compositional Data Analysis (CoDa) has gained popularity in recent years. This type of data consists of values from disjoint categories that sum up to a constant. Both Dirichlet regression and logistic-normal regression have become popular CoDa analysis methods. However, fitting this kind of multivariate model presents challenges, especially when structured random effects, such as temporal or spatial effects, are included in the model. To overcome these challenges, we propose the logistic-normal Dirichlet Model (LNDM). We seamlessly incorporate this approach into the R-INLA package, facilitating model fitting, model selection and prediction within the framework of Latent Gaussian Models (LGMs). Moreover, we explore metrics such as the Deviance Information Criterion (DIC), the Watanabe-Akaike information criterion (WAIC), and the cross-validation measure conditional predictive ordinate (CPO) for model selection in R-INLA for CoDa. Illustrating the LNDM through a simple simulated example and an ecological case study on Arabidopsis thaliana in the Iberian Peninsula, we underscore its potential as an effective tool for managing CoDa and large CoDa databases.


Introduction
Compositional Data analysis is an increasingly popular topic for understanding processes that consist of values corresponding to disjoint categories whose sum is a constant. Those values are usually proportions or percentages, in which case the constant is 1 or 100. The data generated from these processes are widely known as Compositional Data (CoDa). For the sake of simplicity and without loss of generality, from now on we assume the constant to be 1. Connor and Mosimann [1] proposed Dirichlet regression to deal with CoDa. Since then, several studies have been conducted using this technique, and most of them have proved that it is a very valuable tool for modelling CoDa; see for example Hijazi and Jernigan [2] and Pirzamanbein et al. [3].
There are other approaches to CoDa analysis. Aitchison [4] presented a unified theory, developing a range of methods based on the idea that "information in compositional vectors is concerned with relative, not absolute magnitudes". With this statement, the notion of ratios among proportions emerged and the concept of logratios arose as the preferred method for dealing with CoDa. Modelling CoDa using the logistic-normal distribution gained ground, and the bases of CoDa were established.
Nevertheless, one of the biggest problems encountered when dealing with CoDa models is performing inference. To do so, different approaches have been proposed; in particular, many R packages have been implemented, not only from the frequentist perspective [17-19], but also from the Bayesian paradigm. Packages such as BayesX [20], Stan [21], BUGS [22] and R-JAGS [23] have tools for dealing with CoDa. These Bayesian packages are mainly based on Markov chain Monte Carlo (MCMC) methods, which construct a Markov chain whose stationary distribution converges to the posterior distribution. However, the computational cost of MCMC can be high. Moreover, the integrated nested Laplace approximation (INLA) methodology [24], which approximates the posterior distribution using the Laplace integration method, has become an alternative to MCMC, guaranteeing higher computational speed for Latent Gaussian Models (LGMs). With the incorporation of new techniques from Bayesian variational inference [25,26] and the optimisation of the computation, which improves its parallel performance [27], a new era is emerging in the INLA software. Hence, incorporating a tool for dealing with CoDa would be a convenient way to tackle the large CoDa databases sometimes encountered.
Nonetheless, in R-INLA it is still a challenge to fit models with a multivariate likelihood such as the ones defined on S^D. There are some approximations for the Dirichlet likelihood that involve converting the original Dirichlet observations into Gaussian pseudo-observations conditioned on the linear predictor [28], or converting a CoDa multivariate response into coordinates using the isometric log-ratio transformation [14] and fitting them independently. However, there is no unified way to fit these models inside R-INLA and take advantage of all its facilities.
In this paper we present the logistic-normal Dirichlet model (LNDM), which uses the logistic-normal distribution with Dirichlet covariance, through the additive log-ratio transformation, as the likelihood. This allows us to integrate it within the R-INLA package in a very simple way. Thus, we benefit from all the other features of R-INLA for model fitting, model selection and prediction within the framework of LGMs. Additionally, we present how measures such as the Deviance Information Criterion [29, DIC], the Watanabe-Akaike information criterion [30, 31, WAIC], and the cross-validation measure conditional predictive ordinate (CPO) for evaluating predictive capacity [32, 33] are computed in R-INLA for CoDa. To show how the method works, a real example in the field of Ecology was implemented, in which we conducted a spatial analysis of the plant Arabidopsis thaliana on the Iberian Peninsula.
The paper is divided into six more sections. Section 2 introduces CoDa, the distributions that can be defined on S^D, and their equivalence. Section 3 presents some fundamentals of the INLA methodology. Section 4 is devoted to introducing logistic-normal regression with Dirichlet covariance. In Section 5, we introduce spatial models as well as model selection measures in CoDa. In Section 6, we provide a real application of this method and, finally, Section 7 concludes and discusses future avenues of research.

CoDa background
This section is devoted to introducing some preliminary concepts for a better understanding of CoDa. In particular, we present some basic and formal definitions of the two main distributions employed when dealing with CoDa.

CoDa. Definitions
Let y be a D × 1 vector satisfying ∑_{d=1}^{D} y_d = 1 and 0 < y_d < 1, d = 1, ..., D. This vector is called a composition, and it pertains to the simplex sample space. The simplex of dimension D, denoted by S^D, is defined as

S^D = { y = (y_1, ..., y_D) : 0 < y_d < 1, ∑_{d=1}^{D} y_d = 1 }.

As in the ordinary real Euclidean space, there is a geometry defined on S^D. It does not follow the usual Euclidean geometry; it was introduced by Pawlowsky-Glahn and Egozcue [34] and Egozcue et al. [35], and is called the Aitchison geometry. The definitions of perturbation and powering are sufficient to obtain a vector space of compositions, and the usual properties such as commutativity, associativity and distributivity hold. With the definition of the Aitchison inner product, the Aitchison norm and the Aitchison distance, a Euclidean linear vector space is obtained [34]. Following the fundamentals proposed by Aitchison [4], log-ratios play an important role in CoDa analysis. They can be constructed in different ways, including the centered log-ratio, the isometric log-ratio or the additive log-ratio, among others [36]. In this work, we focus on the well-known additive log-ratio (alr) transformation because of its straightforward interpretation [37], and because it is a one-to-one mapping from S^D to R^{D-1}. It is defined as:

alr(y) = ( log(y_1/y_D), ..., log(y_{D-1}/y_D) ),

where D is the reference category. In Greenacre et al. [37], the authors depicted some criteria to select the reference category. They recommended choosing as reference the category whose logarithm has low variance, and avoiding a reference with low relative abundance across samples. The new variables generated are called alr-coordinates. The inverse alr transformation, also called alr^{-1}, is

alr^{-1}(z) = ( exp(z_1), ..., exp(z_{D-1}), 1 ) / ( 1 + ∑_{j=1}^{D-1} exp(z_j) ).
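As a quick illustration of the mapping, the alr transformation and its inverse can be sketched in a few lines of Python (the paper works in R-INLA; this is only a language-neutral check that alr and alr^{-1} are mutual inverses):

```python
import math

def alr(y):
    """Additive log-ratio transform S^D -> R^(D-1), last category as reference."""
    return [math.log(v / y[-1]) for v in y[:-1]]

def alr_inv(z):
    """Inverse alr R^(D-1) -> S^D; the reference category is appended last."""
    e = [math.exp(v) for v in z] + [1.0]
    s = sum(e)
    return [v / s for v in e]

y = [0.2, 0.3, 0.5]          # a composition in S^3
z = alr(y)                   # [log(0.4), log(0.6)]
back = alr_inv(z)            # recovers [0.2, 0.3, 0.5]
```

Applying alr_inv after alr returns the original composition, which is the one-to-one property the text relies on.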
In addition to the Aitchison geometry, several probability distributions have been characterised on S^D [38], although here we focus on the normal distribution on the simplex, or logistic-normal distribution, and the Dirichlet distribution.

Logistic-Normal distribution and Dirichlet distribution
The logistic-normal distribution was defined by Aitchison and Shen [39] and studied in depth in Aitchison [4]. A D-dimensional random vector y is said to have a logistic-normal distribution LN(µ, Σ), or alternatively a normal distribution on S^D, if any of its vectors of log-ratio coordinates has a joint (D - 1)-variate normal distribution. This definition can be adapted directly to a CoDa response using alr-coordinates as:

alr(y) ∼ N(µ, Σ),

µ being a (D - 1)-dimensional vector and Σ a (D - 1) × (D - 1) covariance matrix. Alternatively, the Dirichlet distribution was introduced in Connor and Mosimann [1], and it is the generalisation of the widely known beta distribution. A D-dimensional random vector y is said to have a Dirichlet distribution D(α), α = (α_1, ..., α_D), if it has the following probability density:

p(y | α) = [ Γ(∑_{d=1}^{D} α_d) / ∏_{d=1}^{D} Γ(α_d) ] ∏_{d=1}^{D} y_d^{α_d - 1}, y ∈ S^D.

Relation between the two distributions
As pointed out in Aitchison [4] (pp. 126-129), the logistic-normal and the Dirichlet distribution are separate in the sense that they are never exactly equal for any choice of parameters. However, through the Kullback-Leibler divergence (KL), which measures by how much an approximation q misses the target p, the Dirichlet distribution can be approached with the logistic-normal distribution. The KL divergence to be minimised is

KL(p, q) = ∫_{S^D} p(y | α) log[ p(y | α) / q(y | µ, Σ) ] dy,

where p(y | α) represents the density function of the Dirichlet, and q(y | µ, Σ) the logistic-normal density function. It is minimised by matching the moments of the log-ratios, and the solution can be written in terms of the digamma ϕ and trigamma ϕ′ functions as:

µ_d = ϕ(α_d) - ϕ(α_D), d = 1, ..., D - 1,
Σ_dd = ϕ′(α_d) + ϕ′(α_D), Σ_dk = ϕ′(α_D), d ≠ k.

This approach plays an important role in this paper, as it constitutes the basis for defining logistic-normal regression with Dirichlet covariance. But first we introduce the model framework in which this likelihood is included, that is, Latent Gaussian Models [LGMs, 24].
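The moment-matching solution can be checked numerically. The Python sketch below computes the KL-optimal µ from digamma differences (approximated here by finite differences of log-gamma, an assumption made purely to keep the sketch dependency-free) and verifies it against a Monte Carlo estimate of E[log(y_d/y_D)] under the Dirichlet:

```python
import math, random

def digamma(x, h=1e-5):
    # central finite difference of log-gamma; adequate for illustration
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def trigamma(x, h=1e-4):
    return (math.lgamma(x + h) - 2 * math.lgamma(x) + math.lgamma(x - h)) / h**2

alpha = [2.0, 3.0, 4.0]      # illustrative Dirichlet parameters; last = reference
D = len(alpha)
# KL-optimal logistic-normal parameters (moment matching of the log-ratios):
mu = [digamma(alpha[d]) - digamma(alpha[-1]) for d in range(D - 1)]
var = [trigamma(alpha[d]) + trigamma(alpha[-1]) for d in range(D - 1)]
cov_off = trigamma(alpha[-1])            # common off-diagonal term

# Monte Carlo check of E[log(y_d / y_D)] under Dirichlet(alpha), using the
# gamma representation: y = g / sum(g), so log(y_d/y_D) = log(g_d/g_D).
random.seed(1)
n, acc = 20000, [0.0] * (D - 1)
for _ in range(n):
    g = [random.gammavariate(a, 1.0) for a in alpha]
    for d in range(D - 1):
        acc[d] += math.log(g[d] / g[-1])
mc_mu = [a / n for a in acc]
```

The simulated log-ratio means agree with ϕ(α_d) - ϕ(α_D) up to Monte Carlo error, which is the content of the matching above.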

LGMs and INLA
The popularity of INLA lies in the fact that it allows fast approximate inference for LGMs. Furthermore, the INLA software is experiencing a new era, facilitated by the integration of novel techniques from Bayesian variational inference [25,26] and enhanced computational optimisation, leading to improved parallel performance [27]. This section is devoted to briefly introducing the structure of LGMs and how INLA performs inference and prediction with these new advances.

LGMs
In Van Niekerk et al. [26] a new formulation of INLA is presented, and we follow it to introduce the notions of INLA.
LGMs can be seen as three-stage hierarchical Bayesian models in which the observations y (N × 1) are assumed to be conditionally independent given a latent Gaussian random field X and hyperparameters θ_1:

y | X, θ_1 ∼ ∏_{n=1}^{N} p(y_n | X, θ_1).

The versatility of the model class is related to the specification of the latent Gaussian field:

X | θ_2 ∼ N(0, Q^{-1}(θ_2)),

which includes all the latent (non-observable) components of interest, such as fixed effects and random terms, describing the process underlying the data. The hyperparameters θ = {θ_1, θ_2} control the latent Gaussian field and/or the likelihood for the data.
Additionally, LGMs generalise a large number of related variants of additive and generalised models. If η (N × 1) is a column vector representing the linear predictor, then different effects can be added to it:

η = 1β_0 + Xβ + ∑_{l=1}^{L} f_l(u_l),   (10)

where X is the design matrix for the fixed part (including a first column of 1s if an intercept is added to the model), β ((M + 1) × 1) is a column vector for the linear effects of X on η, and {f_l} are unknown functions of U. This formulation covers any model in which each of the f_l(·) terms can be written in matrix form as A_l u_l, so expression (10) can be rewritten as η = AX, with A a sparse design matrix that links the linear predictors to the latent field. When we perform inference, the aim is to estimate X = {β, f}, of dimension (M + 1 + L) × 1, which represents the set of unobserved latent variables (the latent field). If a Gaussian prior is assumed for β and f, the joint prior distribution of X is Gaussian, which yields the latent field X in the hierarchical LGM formulation. The vector of hyperparameters θ contains the non-Gaussian parameters of the likelihood and the model components. These parameters commonly include variance, scale or correlation parameters.
In most cases, the latent field, in addition to being Gaussian, is also a Gaussian Markov random field [GMRF, 40]. A GMRF is a multivariate Gaussian random variable with additional conditional independence properties: x_i and x_j are conditionally independent given the remaining elements if and only if the (i, j) entry of the precision matrix is 0. The implementation of the INLA method uses this property to speed up computation.
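A minimal numerical illustration of this property, assuming an AR(1)-type tridiagonal precision matrix: the covariance Q^{-1} is dense (every pair of variables is marginally dependent), yet the zero pattern of Q encodes conditional independence directly, which is what sparse-matrix computations exploit:

```python
import numpy as np

# Precision matrix of an AR(1)-type GMRF with autocorrelation rho:
# tridiagonal, so x_i and x_j are conditionally independent whenever |i - j| > 1.
n, rho = 6, 0.7
Q = np.zeros((n, n))
for i in range(n):
    Q[i, i] = 1 + rho**2 if 0 < i < n - 1 else 1.0
    if i < n - 1:
        Q[i, i + 1] = Q[i + 1, i] = -rho

Sigma = np.linalg.inv(Q)     # dense covariance: marginal dependence everywhere

# Partial correlation between x_0 and x_2 given the rest, read off the precision:
partial_02 = -Q[0, 2] / np.sqrt(Q[0, 0] * Q[2, 2])
```

The partial correlation between non-neighbours is exactly zero even though their marginal covariance is not, mirroring the "if and only if" statement in the text.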

INLA
The main idea of the INLA approach is to approximate the posteriors of interest: the marginal posteriors for the latent field, p(X_m | y), and the marginal posteriors for the hyperparameters, p(θ_k | y). With the modern formulation [26], the main enhancement is that the latent field is not augmented with the 'noisy' linear predictors. The joint density of the latent field, hyperparameters and data is

p(X, θ, y) ∝ p(θ) p(X | θ) ∏_n p(y_n | X, θ).

Thus, the initial step in approaching the posterior distributions involves determining the mode and the Hessian at the mode of

p(θ | y) ∝ p(X, θ, y) / p_G(X | θ, y),

p_G(X | θ, y) being the Gaussian approximation to p(X | θ, y), computed as depicted in Van Niekerk et al. [26]. The subsequent step involves obtaining the conditional posterior distributions of the elements of X. To achieve this, it suffices to integrate θ out using T integration points θ_t and area weights δ_t defined by some numerical integration scheme:

p(X_m | y) ≈ ∑_{t=1}^{T} p(X_m | θ_t, y) p(θ_t | y) δ_t.

Finally, the recently proposed Variational Bayes correction to Gaussian means by van Niekerk and Rue [25] is used to efficiently calculate an improved mean for the marginal posterior of the latent field. All this methodology can be used through R with the R-INLA package. For more details about R-INLA we refer the reader to Van Niekerk et al. [26], Blangiardo and Cameletti [41], Zuur et al. [42], Wang et al. [43], Krainski et al. [44], Moraga [45] and Gómez-Rubio [46], where practical examples and code guidelines are provided.
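The mode-plus-curvature step can be mimicked in one dimension. The following toy Python sketch (not R-INLA code; a hypothetical Poisson count with a Gaussian latent effect) finds the mode of the log joint density by Newton iterations, builds the Gaussian approximation from the curvature at the mode, and compares the resulting Laplace estimate of the evidence with brute-force quadrature:

```python
import math

# Toy model: y | x ~ Poisson(exp(x)), x ~ N(0, 1). Laplace-approximate p(y).
y = 4

def logjoint(x):
    # log p(x, y) = log Poisson(y | exp(x)) + log N(x | 0, 1)
    return (y * x - math.exp(x) - math.lgamma(y + 1)
            - 0.5 * x * x - 0.5 * math.log(2 * math.pi))

# Newton iterations for the mode of log p(x, y)
x = 0.0
for _ in range(50):
    grad = y - math.exp(x) - x
    hess = -math.exp(x) - 1.0
    x -= grad / hess
mode, curv = x, math.exp(x) + 1.0        # curv = -(d^2/dx^2) log p at the mode

# Laplace estimate of the evidence: p(y) ~ p(mode, y) * sqrt(2*pi / curv)
laplace = math.exp(logjoint(mode)) * math.sqrt(2 * math.pi / curv)

# Reference value by trapezoidal quadrature over a wide grid
grid = [-10 + 20 * i / 4000 for i in range(4001)]
vals = [math.exp(logjoint(g)) for g in grid]
step = grid[1] - grid[0]
quad = sum(0.5 * (vals[i] + vals[i + 1]) * step for i in range(4000))
```

Even in this skewed non-Gaussian example the Laplace estimate lands within a few percent of the quadrature value, which is the accuracy property INLA builds on (and then improves with the corrections cited above).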

INLA for fitting logistic-normal regression with Dirichlet covariance
In this part, we present our approximation for fitting CoDa.

Bayesian logistic-normal regression with Dirichlet covariance
To define the likelihood we will need the logistic-normal distribution and the structure of the variance-covariance matrix presented in (7).
Definition 1. y ∈ S^D follows a logistic-normal distribution with Dirichlet covariance, LN D(µ, Σ), if and only if alr(y) ∼ N(µ, Σ) with

Σ_dd = σ_d² + γ, d = 1, ..., D - 1,   Σ_dk = γ, d ≠ k,

where σ_d² + γ represents the variance of each log-ratio and γ is the covariance between log-ratios.
From now on we refer to N D(µ, Σ) as the multivariate normal with the Dirichlet covariance structure depicted in Definition 1. Let y be a multivariate random variable such that y ∼ LN D(µ, Σ), which by definition is equivalent to alr(y) ∼ N D(µ, Σ). Because of its easy interpretability in terms of log-ratios with the reference category, we focus on modelling alr(y) as N D(µ, Σ).
Let µ^{(d)} (N × 1) be the column vector representing the linear predictor of the N observations in the d-th alr-coordinate, and X^{(d)}, with dimension N × (M^{(d)} + 1), d = 1, ..., D - 1, the design matrix, which can be different for each alr-coordinate; in other words, each alr-coordinate can be explained by different covariates. Let f^{(d)} be a set of L^{(d)} unknown functions of U that can also vary depending on the alr-coordinate. For the sake of simplicity, and without loss of generality, we assume M^{(d)} = M and L^{(d)} = L, fixing the number of covariates and the number of functions to be the same in each linear predictor. Finally, we define β^{(d)} ((M + 1) × 1) as the (M + 1)-dimensional column vector containing the parameters corresponding to the fixed effects, including the intercept.
Then, the logistic-normal Dirichlet model (LNDM) can be expressed as follows:

alr(y_n) ∼ N D(µ_n, Σ), n = 1, ..., N,
µ^{(d)} = X^{(d)} β^{(d)} + ∑_{l=1}^{L} f_l^{(d)}(u_l), d = 1, ..., D - 1.   (16)

LNDM in R-INLA
R-INLA was implemented under the assumption that each data item is linked to one element of the Gaussian field. Although in the new INLA era this condition disappears [26], it is still a challenge to fit models with multivariate likelihoods. Some approximations exist for the Multinomial likelihood using the Poisson-Laplace trick [47], or for the Dirichlet likelihood by converting the original Dirichlet observations into Gaussian pseudo-observations conditioned on the linear predictor [28]. In our case, the main challenge is to estimate the variance-covariance matrix of the N D(µ, Σ) distribution; in particular, p(γ | y). To do so, we adopt the strategy of modelling each alr-coordinate as if we were modelling multiple likelihoods [44], and the covariance hyperparameter is estimated through a shared random effect, using the following well-known proposition.
Proposition 1. Let z_d, d = 1, ..., D - 1, be independent Gaussian random variables with means µ_d and variances σ_d², and let u ∼ N(0, γ) be independent of them. Then the multivariate random variable y defined as

y_d = z_d + u, d = 1, ..., D - 1,

follows a multivariate Gaussian distribution with mean µ and covariance matrix Σ whose elements are

Σ_dd = σ_d² + γ,   Σ_dk = γ, d ≠ k.

This proposition is simple but powerful: with independent Gaussian distributions and a random effect shared between predictors, p(γ | y) can be easily estimated, and this structure fits perfectly in the context of LGMs. Thus, to estimate the LNDM in R-INLA, we only need to add an individual random effect shared between the linear predictors corresponding to the different alr-coordinates.
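Proposition 1 is easy to verify by simulation. The Python sketch below (with arbitrary illustrative values for σ_d² and γ, not taken from the paper) adds a shared effect u to independent Gaussians and recovers the stated covariance structure empirically:

```python
import random

# Monte Carlo check of Proposition 1: independent Gaussians z_d plus a shared
# effect u ~ N(0, gamma) yield Var = sigma_d^2 + gamma and Cov = gamma.
random.seed(2)
sigma2, gamma, n = [1.0, 0.25], 0.5, 50000
y1, y2 = [], []
for _ in range(n):
    u = random.gauss(0.0, gamma**0.5)
    y1.append(random.gauss(0.0, sigma2[0]**0.5) + u)
    y2.append(random.gauss(0.0, sigma2[1]**0.5) + u)

m1, m2 = sum(y1) / n, sum(y2) / n
var1 = sum((a - m1) ** 2 for a in y1) / (n - 1)              # -> sigma2[0] + gamma = 1.5
var2 = sum((b - m2) ** 2 for b in y2) / (n - 1)              # -> sigma2[1] + gamma = 0.75
cov12 = sum((a - m1) * (b - m2) for a, b in zip(y1, y2)) / (n - 1)   # -> gamma = 0.5
```

The shared effect alone generates the off-diagonal term γ, which is exactly why a single copied random effect across linear predictors suffices to estimate it in R-INLA.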

A simulated example
In this section we exemplify, using a simulated scenario, the process of fitting CoDa with R-INLA. We start with a simple case featuring only three categories and one covariate, and we assume that the effect of this covariate differs for each predictor; we refer to this model as a Type II model. The model structure used in this example is:

alr(y_n) ∼ N D(µ_n, Σ),   µ_n^{(d)} = β_0^{(d)} + β_1^{(d)} x_n, d = 1, 2,

where X (N × 2) is a matrix with ones in the first column and, in the second, values of the covariate simulated from a Uniform distribution between -0.5 and 0.5. Four different parameters compose the model, and they form the latent field: X = {β_0^{(1)}, β_1^{(1)}, β_0^{(2)}, β_1^{(2)}}. Moreover, three different hyperparameters are included in the model, and they form the set of hyperparameters θ = {σ_1², σ_2², γ}.
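The data-generating mechanism of this scenario can be sketched as follows (in Python rather than R, and with assumed parameter values that are not necessarily those used in the paper's simulation):

```python
import math, random

# Sketch of the Type II simulated scenario: three categories, one covariate
# whose effect differs per alr-coordinate, and a shared effect with variance gamma.
# All parameter values below are illustrative assumptions.
random.seed(3)
N = 200
beta0 = [0.5, -0.3]          # intercepts, one per alr-coordinate
beta1 = [1.2, -0.8]          # Type II: a different slope in each predictor
sigma = [0.4, 0.6]           # independent-noise standard deviations
gamma = 0.2                  # variance of the shared effect

Y = []
for _ in range(N):
    x = random.uniform(-0.5, 0.5)
    u = random.gauss(0.0, gamma**0.5)                        # shared effect
    z = [beta0[d] + beta1[d] * x + random.gauss(0.0, sigma[d]) + u
         for d in range(2)]                                  # alr-coordinates
    e = [math.exp(v) for v in z] + [1.0]
    s = sum(e)
    Y.append([v / s for v in e])                             # alr^{-1}: back to S^3
```

Each simulated row lies in S^3 by construction, and the two alr-coordinates carry the γ covariance induced by the shared term u.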

Fitting the model
To fit the model, priors must be defined for the parameters and hyperparameters. The priors considered for the parameters are the default ones used in R-INLA. However, PC-priors [48] are considered for the standard deviations and the square root of the covariance parameter γ; in particular, PC-prior(1, 0.01) was used for σ_1, σ_2 and √γ. In Figures 2 and 3, the marginal posterior distributions are depicted jointly with the simulated values, showing that we were able to recover the original values.

Spatial LNDM and model selection
Once the LNDM is defined, a particular focus lies on how more intricate structures within the linear predictor can be accommodated within the R-INLA framework. Another issue is model selection. Hence, this section is dedicated to spatial LNDMs and to the use of measures such as DIC, WAIC and LCPO for model selection.

Fig. 2 Marginal posterior distributions for the fixed effects. The simulated original value is also depicted.

Spatial LNDMs
Of particular interest are LNDMs in the spatial context. The analysis of spatial processes refers to the analysis of data collected in space, which can be indexed over a discrete or a continuous domain. Spatial statistics is thus traditionally divided into three main areas depending on the type of problem and data: lattice data, geostatistics and point patterns. For a review of models for different types of spatial data, see Haining and Haining [49] and Cressie and Wikle [50]. When a spatial effect has to be included in the model, it is common to formulate mixed-effects regression models in which the linear predictor is made up of a trend plus a spatial variation, the spatial effect being modelled with correlated random effects, matching perfectly the structure presented in Equation (16). R-INLA provides many options for implementing Gaussian latent spatial effects [46], including intrinsic conditional autoregressive (iCAR) or conditional autoregressive (CAR) models for areal data [51], or spatial effects with a Matérn covariance function for continuous processes [52]. These effects can easily be included in the LNDM. As we adopt a multiple-likelihood modelling strategy, we make use of the features that R-INLA provides for fitting multiple likelihoods jointly, such as copy or replicate, which are necessary to incorporate shared and replicated random effects in the modelling. For details about their implementation, we refer the reader to the website r-inla.org and the books by Krainski et al. [44] and Gómez-Rubio [46]. The example presented in this manuscript involves geostatistical data, but the approach is easily applicable to other latent Gaussian effects.

Model selection and validation
Regarding the model selection process, there are sometimes a large number of candidate models resulting from all the possible combinations of covariates, and combining them with the possible latent effects increases the number of possibilities exponentially. R-INLA has proved to be fast enough to compute huge numbers of models, as well as different measures that make the model selection process feasible. Such measures include the Deviance Information Criterion [29, DIC], defined as a hierarchical-modelling generalisation of the Akaike information criterion (AIC); the Watanabe-Akaike information criterion [30, 31, WAIC], which is the sum of two components, one quantifying the model fit and the other evaluating the model complexity; and the cross-validation measure conditional predictive ordinate (CPO) for evaluating the predictive capacity, summarised through its log-score [32, 33, LCPO]. The models with the lowest values of DIC, WAIC or LCPO are preferred over the rest. However, R-INLA is programmed to handle univariate likelihoods, and the variability added by the inclusion of the new random effect is not considered when the deviance is computed. This affects the computation of the DIC and WAIC, so an additional process is needed to calculate them when the response variable follows a multivariate normal distribution. This process must incorporate the elements off the diagonal of the variance-covariance matrix. To achieve this, a post-processing of the model is performed: samples of the joint posterior distribution are obtained with the inla.posterior.sample function, and the likelihood of the multivariate normal distribution is calculated. The remaining calculations for DIC and WAIC are straightforward applications of their respective formulas. Both computations have been implemented in a new R function.
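As a sketch of this post-processing step, the WAIC computation from a matrix of pointwise log-likelihoods is shown below in generic Python (in practice the log-likelihoods would come from inla.posterior.sample evaluated under the multivariate normal density; the matrix here is a stand-in):

```python
import math, statistics

def waic(loglik):
    """WAIC from an S x N matrix of pointwise log-likelihoods
    (S posterior samples, N observations): -2 * (lppd - p_waic)."""
    S, N = len(loglik), len(loglik[0])
    lppd, p_waic = 0.0, 0.0
    for i in range(N):
        col = [loglik[s][i] for s in range(S)]
        m = max(col)
        # log of the posterior-mean likelihood, via log-sum-exp for stability
        lppd += m + math.log(sum(math.exp(v - m) for v in col) / S)
        # effective-complexity penalty: posterior variance of the log-likelihood
        p_waic += statistics.variance(col)
    return -2.0 * (lppd - p_waic)

# Degenerate check: with identical samples the penalty is zero and
# WAIC reduces to -2 * (sum of the pointwise log-likelihoods).
ll = [[-1.2, -0.7, -2.1]] * 4
```

The fit term (lppd) and the complexity term (p_waic) are exactly the "two components" mentioned in the text; the DIC post-processing follows the same sampling pattern with the deviance instead.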
The same does not apply to the CPO, as it is based on the posterior predictive distribution. In Appendix A, we prove that the CPO is not affected by the approach we propose here. However, we believe that the CPO cannot be calculated in the same way when dealing with CoDa, and we therefore propose a new definition.

CPO
In the context of a CoDa cross-validation process, excluding a single category from a CoDa point may not make sense, since CoDa have a constraint: their sum must be 1. This implies that the remaining categories provide valuable information about the category being excluded. One might think that working in log-ratio coordinates could alleviate this issue, but that is not the case: the reference category is present in all the log-ratios, so we encounter a similar situation, with the remaining log-ratio coordinates providing information about the category removed during cross-validation. We therefore propose that cross-validation with CoDa be carried out by excluding all categories of a particular individual, and we define the CPO for the i-th data point as

CPO_i = p(alr(y)_i | alr(y)_{-i}),

where alr(y)_i is the observed vector for the i-th data point after the alr transformation, with D - 1 components, and alr(y)_{-i} represents the observed data in alr-coordinates (N - 1 data points, each with D - 1 components) excluding alr(y)_i.
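In practice this quantity can be approximated from posterior samples with the usual harmonic-mean identity used for CPO, CPO_i = E[1/p(y_i | θ) | y]^{-1}, applied here to the full alr vector of the i-th individual. A Python sketch, with an illustrative mean and a Dirichlet-covariance-structured Σ (both assumed values, not fitted ones):

```python
import numpy as np

def mvn_logpdf(x, mu, Sigma):
    """Log-density of a multivariate normal, evaluated at the full alr vector."""
    d = len(x)
    diff = np.asarray(x, float) - np.asarray(mu, float)
    _, logdet = np.linalg.slogdet(Sigma)
    q = diff @ np.linalg.solve(Sigma, diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + q)

def cpo(y_i, samples):
    """CPO_i ~ 1 / mean_s[ 1 / p(y_i | mu_s, Sigma_s) ] over posterior samples,
    i.e. the harmonic-mean estimator applied to the whole composition at once."""
    inv = [np.exp(-mvn_logpdf(y_i, mu, Sig)) for mu, Sig in samples]
    return 1.0 / np.mean(inv)

# With a single posterior sample the estimator reduces to the plain density:
mu = [0.0, 0.0]
Sigma = np.array([[1.5, 0.5],      # sigma_1^2 + gamma, gamma
                  [0.5, 0.75]])    # gamma, sigma_2^2 + gamma
val = cpo([0.2, -0.1], [(mu, Sigma)])
```

Evaluating the density on the whole (D - 1)-vector, rather than coordinate by coordinate, is what implements the "exclude all categories of an individual" scheme proposed above.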

Continuous Spatial example: the case of Arabidopsis thaliana
This section is devoted to showing an application of continuous spatial LNDMs in a real setting.

The data
We worked with a collection of 301 accessions of the annual plant Arabidopsis thaliana on the Iberian Peninsula. For each accession, the probability of belonging to each of the 4 genetic clusters (GC) inferred in Martínez-Minaya et al. [53], namely GC1, GC2, GC3 and GC4, was available (Fig. 4), their sum being 1. We were interested in estimating the probability of membership, which in this particular context can be thought of as the habitat suitability of each genetic cluster. To do so, we employed LNDMs including climate covariates and spatial terms in the linear predictor. In particular, two bioclimatic variables were used to define the climatic part: annual mean temperature (BIO1) and annual precipitation (BIO12).

Fig. 4 Probability of membership of GC1, GC2, GC3 and GC4 on the Iberian Peninsula.
Fig. 5 Additive log-ratio transformation of the proportion of GC1, GC2, GC3 and GC4 on the Iberian Peninsula, using GC4 as the reference category.

The model
As mentioned earlier, four categories were employed in this problem: GC1, GC2, GC3 and GC4, so we dealt with proportions in S^4. To produce the LNDM, we selected GC4 as the reference category because it was the one whose logarithm had the lowest variance. We were thus dealing with a three-dimensional N D(µ, Σ). The transformed data are shown in Fig. 5, and the models that we used to solve the problem had the structure presented in Equation (16), combining the climatic covariates with a spatial effect in each alr-coordinate. Therefore, in all three cases, we presumed the covariate to be relevant and proceeded to interpret the coefficients (Fig. 6). We observed that the ratio between the probability of belonging to GC1 and the probability of belonging to GC4 decreased by approximately 20% when the scaled covariate annual mean temperature increased by one unit. The ratio between the probability of belonging to GC2 and GC4 decreased by 32% when the scaled annual mean temperature increased by one unit. Finally, the ratio between the probability of belonging to GC3 and GC4 decreased by 50% when the covariate annual mean temperature increased by one unit.
Focusing on the other covariate present in the model, BIO12 (annual precipitation), we noted that, in the presence of BIO1, it is relevant in the first alr-coordinate, with a probability of 0.72 of the coefficient being lower than 0. The same does not hold for the second and third alr-coordinates, where the probabilities of the coefficient being lower than 0 are 0.43 and 0.46, respectively. As a result, we assumed the covariate to be relevant in the first alr-coordinate and proceeded to interpret its coefficient (Fig. 6): the ratio between the probability of belonging to GC1 and the probability of belonging to GC4 decreases by approximately 6% when the scaled covariate BIO12 increases by one unit and BIO1 remains constant.
With the method implemented here, we are able to make predictions not only on the alr-coordinate scale (Fig. 7), but also on the original scale (Fig. 8). If we focus on Fig. 7, we observe that in the north-west of Spain the ratio between the probability of belonging to GC1 and GC4 reached 12, meaning that at those points the probability of belonging to GC1 is 12 times greater than the probability of belonging to GC4. Something similar happened in the north-east of the Iberian Peninsula, where the probability of belonging to GC2 is 12 times greater than the probability of belonging to GC4. The case of the third alr-coordinate seems somewhat different, with the greatest difference between the probabilities of belonging to GC3 and GC4 found in the centre of the Iberian Peninsula.
Finally, it is straightforward to compute the marginal posterior distributions of the hyperparameters and, consequently, of the covariance parameter between the alr-coordinates (Fig. 9).

Conclusions and future work
CoDa are becoming more and more common, especially in the context of genomics, and require increasingly powerful computational tools to be analysed. Thus, we believe that finding a way to include a likelihood that can deal with CoDa in the context of LGMs can facilitate inference and prediction. That is why in this manuscript we have introduced a different way to perform inference in Bayesian CoDa analysis. By doing so, we include it in the context of LGMs, thereby making the range of possibilities that R-INLA offers available to the logistic-normal distribution with Dirichlet covariance likelihood.

Fig. 8 Mean and standard deviation of the posterior predictive distribution for the probability of belonging to GC1, GC2, GC3 and GC4.
The main idea underlying the proposed method is to approximate the multivariate likelihood with univariate Gaussian likelihoods sharing a random effect, which can be fitted by R-INLA. This idea is similar to the one proposed for modelling the Multinomial likelihood in R-INLA, where the Poisson trick [47] is used to reparameterise the model so that independent Poisson observations are fitted, or the one proposed in [28] to approximate the Dirichlet likelihood using conditionally independent Gaussians. Simpson et al. [55] also used a similar strategy, constructing a Poisson approximation to the true log-Gaussian Cox process likelihood, making it possible to carry out inference on a regular lattice over the observation window by counting the number of points in each cell. This work does not intend to be a substitute for the dirinla package [28] or for the Bayesian ilr approach [14]: it is simply a viable alternative when dealing with CoDa that allows the estimation and prediction of very complex models. Furthermore, functions are provided for the computation of DIC and WAIC within the framework of R-INLA, accompanied by a definition of the CPO for CoDa. We have reported an example in the field of Ecology, showing the potential of R-INLA when continuous spatial effects are added to the linear predictor. We have exploited the options that R-INLA makes available in the context of multiple likelihoods, such as copy or replicate [46]. With them, our aim was to show practitioners the range of models that can be fitted in this context. Although here we have focused mainly on spatial processes, this tool can easily be applied in other contexts (temporal, spatiotemporal, etc.), as long as we express the model in the context of LGMs.
is Gaussian with mean µ_{β_0} and variance σ²_{β_0} + σ²_y. Model II: Regarding model II, depicted in Equation (A3), R is also a diagonal matrix in R^{N×N} whose diagonal elements are σ²_ϵ. V, again, is a column matrix in R^{N×1} whose elements are 1, and U is an identity matrix in R^{N×N}. Then C = (V, U). m is a column matrix in R^{(N+1)×1} whose elements are 0. Finally, B is a diagonal matrix in R^{(N+1)×(N+1)} whose first diagonal element is 1. Something similar happens with µ_{β_0}, the first element of the resulting matrix E(β_0, ω | y). Finally, coming back to Equation (A11), we obtain that the posterior predictive distribution of y′ | y is Gaussian, with mean µ_{β_0} and variance σ²_{β_0} + σ²_ω + σ²_ϵ. As a consequence, the two models have the same posterior predictive distributions, and hence the CPO is equal for both.

Fig. 1 Top: simulated CoDa represented in the simplex. Bottom: alr-coordinates in terms of the generated covariate x.

Fig. 3 Marginal posterior distributions for the hyperparameters. The simulated original value is also depicted.

Fig. 6 Fig. 7
Fig. 6 Marginal posterior distributions for the parameters corresponding to the fixed effects for each of the alr-coordinates: BIO1 and BIO12.