1 Introduction

The smoothed Dirichlet distribution has the same parametric form as the Dirichlet distribution but forces neighboring cells to be closer to each other through an added penalty function. Currently, many researchers force neighboring cells to be closer to each other by choosing suitable parameter values (\(\varvec{\alpha }\)) for the Dirichlet distribution. A distribution designed for this purpose has been lacking, and our proposed distribution fills that gap, allowing for efficient modeling of such data.

A new parametric family of distributions was introduced and developed with some extensions to the Dirichlet distribution to suit different purposes. [11] introduced the generalized Dirichlet distribution, which has a more general covariance structure than the Dirichlet distribution. The grouped Dirichlet distribution (GDD), first described by [5], is a multivariate generalization of the Dirichlet distribution used to model incomplete categorical data. [6] likewise introduced the nested Dirichlet distribution to model incomplete categorical data. The spherical-Dirichlet distribution, introduced by [1], is obtained by transforming the Dirichlet distribution on the simplex to the corresponding space on the hyper-sphere.

[4] proposed a different version of the smoothed Dirichlet distribution in the context of smoothed language model representations of documents. That smoothed Dirichlet distribution is the same as the Dirichlet distribution except that its domain is restricted to include only smoothed language models. [3] used the above-mentioned smoothed Dirichlet distribution for image categorization on a smoothed simplex. The above-proposed smoothed Dirichlet distribution uses a smoothed domain and focuses mainly on multimedia data, whereas our proposed smoothed Dirichlet distribution can be applied to data from any context. [7] proposed the modified Dirichlet distribution, which simultaneously performs smoothing by allowing the parameters (\(\varvec{\alpha }\)) to be negative.

2 Basic Properties

In this section, we introduce the proposed smoothed Dirichlet distribution and its basic properties.

2.1 Probability Density Function

The smoothed Dirichlet distribution was suggested by [2] as a variation of the Dirichlet distribution that forces successive cell categories to be closer to each other. Let \({\textbf{x}}=(x_{1}, x_{2}, \ldots , x_{K})^{t}\) be a vector with K components where \(x_{j} \ge 0\) for \(j=1, 2, \ldots , K\) and \(\sum _{j=1}^K x_{j} = 1\). Also, let \(\varvec{\alpha } = (\alpha _1, \alpha _2, \ldots , \alpha _K)^{t}\), where \(\alpha _j >0\) for each j, and let \(\delta >0\). The smoothed Dirichlet (SD) probability density function is

$$\begin{aligned} f({\textbf{x}}|\varvec{\alpha }, \delta ) = C(\varvec{\alpha }, \delta ) \prod _{j=1}^K x_{j}^{\alpha _j -1} \text {exp}(-\delta \Delta ({\textbf{x}})), \end{aligned}$$

where \(\text {exp}(-\delta \Delta ({\textbf{x}}))\) is a penalty function that forces successive \(x_{j}\)’s to be close to each other with higher probability than under the standard Dirichlet distribution satisfying \(\sum _{j=1}^K x_{j} = 1\). The parameter \(\delta\) dictates the extent to which the neighboring \(x_{j}\)’s are forced to be close. For instance, a large value of \(\delta\) forces realizations of \({\textbf{x}}\) to have small values of \(\Delta ({\textbf{x}})\). The constant \(C(\varvec{\alpha }, \delta )\) can be written as

$$\begin{aligned} C(\varvec{\alpha }, \delta )&= \dfrac{D(\varvec{\alpha })}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{x}}))]} \text { with } {\textbf{x}}\sim \text {Dirichlet}(\varvec{\alpha }) \text { and } D(\varvec{\alpha }) = \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)}. \end{aligned}$$
(1)

From now on, we write \({\textbf{X}} = (X_1, X_2, \ldots , X_K)^{t}\) \(\sim \text {SDD} (\varvec{\alpha }, \delta , \Delta )\) for the smoothed Dirichlet distribution (SDD). Here \(\Delta = \Delta ({\textbf{X}})\), and its importance is discussed later.
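To make (1) concrete, below is a minimal Monte Carlo sketch (not the authors' code) of the normalizing constant: draws from Dirichlet(\(\varvec{\alpha }\)) approximate \(E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]\). It assumes the squared-difference penalty \(\Delta ({\textbf{x}}) = \sum _{j=1}^{K-1} (x_{j+1} - x_{j})^2\) introduced in Sect. 2.4, and all function names are illustrative.

```python
import numpy as np
from scipy.special import gammaln

def penalty(X):
    """Delta(x) = sum_j (x_{j+1} - x_j)^2, applied row-wise to an (M, K) array."""
    return np.sum(np.diff(X, axis=1) ** 2, axis=1)

def log_norm_const(alpha, delta, M=100_000, seed=0):
    """log C(alpha, delta) = log D(alpha) - log E_alpha[exp(-delta Delta(X))], per (1)."""
    rng = np.random.default_rng(seed)
    X = rng.dirichlet(alpha, size=M)                # Monte Carlo draws from Dirichlet(alpha)
    log_D = gammaln(np.sum(alpha)) - np.sum(gammaln(alpha))
    w = -delta * penalty(X)
    log_E = np.logaddexp.reduce(w) - np.log(M)      # log-mean-exp, numerically stable
    return log_D - log_E

alpha = np.array([5.0, 5.0, 5.0])
print(log_norm_const(alpha, delta=10.0))
```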

2.2 Moments

In this sub-section, we compute the first, second, and \(n{\text {th}}\)-order moments, variance, and covariance of the smoothed Dirichlet distribution. First, the mean of the smoothed Dirichlet distribution is

$$\begin{aligned} E(X_i)&= C(\varvec{\alpha }, \delta ) \int \ldots \int x_{i}^{\alpha _i} \prod _{j \ne i} x_{j}^{\alpha _j -1} \text {exp}(-\delta \Delta ({\textbf{x}})) dx_1 dx_2 \ldots dx_K. \end{aligned}$$

Let \(\varvec{\gamma }_i = (\gamma _{i1}, \gamma _{i2}, \ldots , \gamma _{iK})^{t}\), where \(\gamma _{ii} = 1\) and \(\gamma _{il} = 0\) for all \(l \ne i\). Then

$$\begin{aligned} E(X_i)&= \dfrac{C(\varvec{\alpha }, \delta )}{C(\varvec{\alpha +\gamma _{i}}, \delta )} \int \ldots \int C(\varvec{\alpha +\gamma _{i}}, \delta ) \prod _{j=1}^K x_{j}^{\alpha _{j} + \gamma _{ij} -1} \text {exp}(-\delta \Delta ({\textbf{x}})) dx_1 dx_2 \ldots dx_K,\\&= \dfrac{C(\varvec{\alpha }, \delta )}{C(\varvec{\alpha +\gamma _{i}}, \delta )}, \end{aligned}$$

where \(C(\varvec{\alpha +\gamma _{i}}, \delta ) = \dfrac{1}{\int \ldots \int \prod _{j=1}^K x_{j}^{\alpha _{j} + \gamma _{ij} - 1} \text {exp}(-\delta \Delta ({\textbf{x}})) dx_1 dx_2 \ldots dx_K}\).

Using (1),

$$\begin{aligned} E(X_i)&= \dfrac{\dfrac{D(\varvec{\alpha })}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}}{\dfrac{D(\varvec{\alpha +\gamma _{i}})}{E_{\varvec{\alpha +\gamma _{i}}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}}, \\&= \dfrac{D(\varvec{\alpha })}{D(\varvec{\alpha +\gamma _{i}})} \times \dfrac{E_{\varvec{\alpha +\gamma _{i}}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}. \end{aligned}$$

Here

$$\begin{aligned} D(\varvec{\alpha +\gamma _{i}})&= \dfrac{\Gamma (\sum _{j=1}^K (\alpha _{j} + \gamma _{ij}))}{\prod _{j=1}^K \Gamma (\alpha _{j} + \gamma _{ij})},\nonumber \\&= \dfrac{\Gamma (\sum _{j=1}^K \alpha _{j}+1)}{\prod _{j \ne i} \Gamma (\alpha _{j})\Gamma (\alpha _{i}+1)},\nonumber \\&= \dfrac{\Gamma (\sum _{j=1}^K \alpha _{j})\sum _{j=1}^K \alpha _{j}}{\prod _{j \ne i} \Gamma (\alpha _{j})\Gamma (\alpha _{i})\alpha _{i}},\nonumber \\&= \dfrac{\sum _{j=1}^K \alpha _{j}}{\alpha _{i}} \times \dfrac{\Gamma (\sum _{j=1}^K \alpha _{j})}{\prod _{j=1}^K \Gamma (\alpha _{j})},\nonumber \\&= \dfrac{\sum _{j=1}^K \alpha _{j}}{\alpha _{i}} D(\varvec{\alpha }). \end{aligned}$$
(2)

Substituting (2),

$$\begin{aligned} E(X_i)&= \dfrac{\alpha _{i}}{\sum _{j=1}^K \alpha _{j}} \times \dfrac{E_{\varvec{\alpha +\gamma _{i}}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}. \end{aligned}$$
(3)

Similarly, we compute the expected value for \(X_{i}^n\) as

$$\begin{aligned} E(X_{i}^n)&= \int \ldots \int x_{i}^n f({\textbf{x}}|\varvec{\alpha }, \delta ) dx_1 dx_2 \ldots dx_K,\\&= \int \ldots \int x_{i}^n C(\varvec{\alpha }, \delta ) \prod _{j=1}^K x_{j}^{\alpha _j -1} \text {exp}(-\delta \Delta ({\textbf{x}})) dx_1 dx_2 \ldots dx_K,\\&= C(\varvec{\alpha }, \delta ) \int \ldots \int x_{i}^{\alpha _i+n-1} \prod _{j \ne i} x_{j}^{\alpha _j -1} \text {exp}(-\delta \Delta ({\textbf{x}})) dx_1 dx_2 \ldots dx_K. \end{aligned}$$

Then

$$\begin{aligned} E(X_{i}^n)&= \dfrac{C(\varvec{\alpha }, \delta )}{C(\varvec{\alpha }+n\varvec{\gamma }_{i}, \delta )} \int \ldots \int C(\varvec{\alpha }+n\varvec{\gamma }_{i}, \delta ) \prod _{j=1}^K x_{j}^{\alpha _{j} +n\gamma _{ij} -1} \text {exp}(-\delta \Delta ({\textbf{x}})) dx_1 dx_2 \ldots dx_K,\\&= \dfrac{C(\varvec{\alpha }, \delta )}{C(\varvec{\alpha }+n\varvec{\gamma }_{i}, \delta )}, \end{aligned}$$

where \(C(\varvec{\alpha }+n\varvec{\gamma }_{i}, \delta ) = \dfrac{1}{\int \ldots \int \prod _{j=1}^K x_{j}^{\alpha _{j} + n\gamma _{ij} -1} \text {exp}(-\delta \Delta ({\textbf{x}})) dx_1 dx_2 \ldots dx_K}\).

Using (1),

$$\begin{aligned} E(X_{i}^n)&= \dfrac{\dfrac{D(\varvec{\alpha })}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}}{\dfrac{D(\varvec{\alpha }+n\varvec{\gamma }_{i})}{E_{\varvec{\alpha }+n\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}},\\&= \dfrac{D(\varvec{\alpha })}{D(\varvec{\alpha }+n\varvec{\gamma }_{i})} \times \dfrac{E_{\varvec{\alpha }+n\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}. \end{aligned}$$

Here

$$\begin{aligned} D(\varvec{\alpha }+n\varvec{\gamma }_{i})&= \dfrac{\Gamma (\sum _{j=1}^K (\alpha _{j} + n\gamma _{ij}))}{\prod _{j=1}^K \Gamma (\alpha _{j} + n\gamma _{ij})},\nonumber \\&= \dfrac{\Gamma (\sum _{j=1}^K \alpha _{j}+n)}{\prod _{j \ne i} \Gamma (\alpha _{j})\Gamma (\alpha _{i}+n)},\nonumber \\&= \dfrac{\Gamma (\sum _{j=1}^K \alpha _{j}+n)}{\prod _{j \ne i} \Gamma (\alpha _{j})\Gamma (\alpha _{i}+n)} \times \dfrac{\Gamma (\alpha _{i})}{\Gamma (\alpha _{i})} \times \dfrac{\Gamma (\sum _{j=1}^K \alpha _{j})}{\Gamma (\sum _{j=1}^K \alpha _{j})},\nonumber \\&= \dfrac{\Gamma (\sum _{j=1}^K \alpha _{j}+n)}{\Gamma (\sum _{j=1}^K \alpha _{j})} \times \dfrac{\Gamma (\alpha _{i})}{\Gamma (\alpha _{i} +n)} \times \dfrac{\Gamma (\sum _{j=1}^K \alpha _{j})}{\prod _{j=1}^K \Gamma (\alpha _{j})},\nonumber \\&= \dfrac{\Gamma (\sum _{j=1}^K \alpha _{j}+n)}{\Gamma (\sum _{j=1}^K \alpha _{j})} \times \dfrac{\Gamma (\alpha _{i})}{\Gamma (\alpha _{i} +n)} \times D(\varvec{\alpha }). \end{aligned}$$
(4)

Substituting (4),

$$\begin{aligned} E(X_{i}^n)&= \dfrac{\Gamma (\sum _{j=1}^K \alpha _{j})}{\Gamma (\sum _{j=1}^K \alpha _{j}+n)} \times \dfrac{\Gamma (\alpha _{i} +n)}{\Gamma (\alpha _{i})} \times \dfrac{E_{\varvec{\alpha }+n\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}. \end{aligned}$$
(5)

More generally, the product moments of smoothed Dirichlet distribution random variables can be expressed as

$$\begin{aligned} E\left( \prod _{i=1}^K X_{i}^{n_i}\right)&= \dfrac{\Gamma (\sum _{j=1}^K \alpha _{j})}{\Gamma (\sum _{j=1}^K (\alpha _{j}+n_j))} \times \left[ \prod _{i=1}^K\dfrac{\Gamma (\alpha _{i} +n_i)}{\Gamma (\alpha _{i})} \right] \times \dfrac{E_{\varvec{\alpha }+{\varvec{n}}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}, \end{aligned}$$
(6)

where \(\varvec{\alpha }+{\varvec{n}} = (\alpha _1+n_1, \alpha _2+n_2, \ldots , \alpha _K+n_K)^{t}\). We can easily compute the second moment by plugging \(n=2\) into (5),

$$\begin{aligned} E(X_{i}^2)&= \dfrac{\Gamma (\sum _{j=1}^K \alpha _{j})}{\Gamma (\sum _{j=1}^K \alpha _{j}+2)} \times \dfrac{\Gamma (\alpha _{i} +2)}{\Gamma (\alpha _{i})} \times \dfrac{E_{\varvec{\alpha }+2\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}, \nonumber \\&=\dfrac{\alpha _{i}}{\sum _{j=1}^K \alpha _{j}} \times \dfrac{\alpha _{i}+1}{\sum _{j=1}^K \alpha _{j}+1} \times \dfrac{E_{\varvec{\alpha }+2\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}. \end{aligned}$$
(7)

Then the variance of the smoothed Dirichlet distribution is

$$\begin{aligned} \text {Var} (X_i)&= E(X_{i}^2) - (E(X_{i}))^2, \nonumber \\&= \dfrac{\alpha _{i}}{\sum _{j=1}^K \alpha _{j}} \times \dfrac{\alpha _{i}+1}{\sum _{j=1}^K \alpha _{j}+1} \times \dfrac{E_{\varvec{\alpha }+2\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]} \nonumber \\&\quad - \left( \dfrac{\alpha _{i}}{\sum _{j=1}^K \alpha _{j}} \times \dfrac{E_{\varvec{\alpha }+\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]} \right) ^2, \nonumber \\&= \dfrac{\alpha _{i}}{\left( \sum _{j=1}^K \alpha _{j}\right) ^2 \left( \sum _{j=1}^K \alpha _{j} +1\right) E_{\varvec{\alpha }}^2 [\text {exp}(-\delta \Delta ({\textbf{X}}))]} \nonumber \\&\times \left( (\alpha _i +1)\left( \sum _{j=1}^K \alpha _{j}\right) E_{\varvec{\alpha }+2\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))] E_{\varvec{\alpha }} [\text {exp}(-\delta \Delta ({\textbf{X}}))] \right. \nonumber \\&\quad \left. - \alpha _i \left( \sum _{j=1}^K \alpha _{j}+1\right) E_{\varvec{\alpha }+\varvec{\gamma }_{i}}^2 [\text {exp}(-\delta \Delta ({\textbf{X}}))]\right) . \end{aligned}$$
(8)

Next, we compute the covariance of \(X_i\) and \(X_l\) (\(\text {Cov} (X_i, X_l)\)), \(i, l =1, 2, \ldots , K\), \(i \ne l\). Using (6),

$$\begin{aligned} E (X_i X_l)&= \dfrac{\Gamma (\sum _{j=1}^K \alpha _{j})}{\Gamma (\sum _{j=1}^K \alpha _{j}+2)} \times \left[ \dfrac{\Gamma (\alpha _{i}+1)}{\Gamma (\alpha _{i})} \right] \times \left[ \dfrac{\Gamma (\alpha _{l}+1)}{\Gamma (\alpha _{l})} \right] \times \dfrac{E_{\varvec{\alpha }+\varvec{\gamma }_i + \varvec{\gamma }_l}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}\\&= \dfrac{\alpha _i \alpha _l}{\sum _{j=1}^K \alpha _{j} \left( \sum _{j=1}^K \alpha _{j}+1\right) } \times \dfrac{E_{\varvec{\alpha }+\varvec{\gamma }_i + \varvec{\gamma }_l}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}. \end{aligned}$$

Then

$$\begin{aligned} \text {Cov} (X_i, X_l)&= E (X_i X_l) - E (X_i) E(X_l)\\&= \dfrac{\alpha _i \alpha _l}{\sum _{j=1}^K \alpha _{j} \left( \sum _{j=1}^K \alpha _{j}+1\right) } \times \dfrac{E_{\varvec{\alpha }+\varvec{\gamma }_i + \varvec{\gamma }_l}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]} \\&- \dfrac{\alpha _{i}}{\sum _{j=1}^K \alpha _{j}} \times \dfrac{E_{\varvec{\alpha }+\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]} \times \dfrac{\alpha _{l}}{\sum _{j=1}^K \alpha _{j}} \times \dfrac{E_{\varvec{\alpha }+\varvec{\gamma }_{l}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}\\&= \dfrac{\alpha _{i}\alpha _{l}}{\left( \sum _{j=1}^K \alpha _{j}\right) ^2 \left( \sum _{j=1}^K \alpha _{j} +1\right) E_{\varvec{\alpha }}^2 [\text {exp}(-\delta \Delta ({\textbf{X}}))]}\\&\times \left( \scriptstyle \left( \sum _{j=1}^K \alpha _{j}\right) E_{\varvec{\alpha }+\varvec{\gamma }_{i}+\varvec{\gamma }_{l}}[\text {exp}(-\delta \Delta ({\textbf{X}}))] E_{\varvec{\alpha }} [\text {exp}(-\delta \Delta ({\textbf{X}}))] \right. \\&\quad \left. - \left( \sum _{j=1}^K \alpha _{j}+1\right) E_{\varvec{\alpha }+\varvec{\gamma }_{i}} [\text {exp}(-\delta \Delta ({\textbf{X}}))] E_{\varvec{\alpha }+\varvec{\gamma }_{l}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]\right) . \end{aligned}$$
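The moment formulas above are ratios of expectations of \(\text {exp}(-\delta \Delta ({\textbf{X}}))\) under shifted Dirichlet parameters, so they can be evaluated numerically. Below is a sketch (assuming the squared-difference penalty \(\Delta _1\) of Sect. 2.4; function names are illustrative) that computes \(E(X_i)\) and \(\text {Var}(X_i)\) from (3), (7), and (8) by Monte Carlo.

```python
import numpy as np

def penalty(X):
    # Delta_1 applied row-wise to an (M, K) array of compositions
    return np.sum(np.diff(X, axis=1) ** 2, axis=1)

def E_exp(beta, delta, M=200_000, seed=0):
    # Monte Carlo estimate of E_beta[exp(-delta Delta(X))], X ~ Dirichlet(beta)
    rng = np.random.default_rng(seed)
    X = rng.dirichlet(beta, size=M)
    return np.mean(np.exp(-delta * penalty(X)))

def sd_mean_var(alpha, delta, i):
    alpha = np.asarray(alpha, dtype=float)
    A = alpha.sum()
    g = np.zeros_like(alpha)
    g[i] = 1.0                                  # the indicator vector gamma_i
    E0 = E_exp(alpha, delta)
    E1 = E_exp(alpha + g, delta)
    E2 = E_exp(alpha + 2 * g, delta)
    mean = alpha[i] / A * E1 / E0                               # equation (3)
    second = alpha[i] / A * (alpha[i] + 1) / (A + 1) * E2 / E0  # equation (7)
    return mean, second - mean ** 2                             # equation (8)

print(sd_mean_var([5, 5, 5], delta=10.0, i=0))
```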

2.3 Sample Generation

We use the acceptance-rejection method to generate a random sample from the smoothed Dirichlet distribution with parameters \(\varvec{\alpha }, \delta\), and \(\Delta\). In the acceptance-rejection method, we generate candidate values from a proposal distribution \({\varvec{Y}}\) with density g and accept a candidate \({\varvec{y}}\) with probability \(f({\varvec{y}})/(C g({\varvec{y}}))\), where f is the density of the target distribution \({\varvec{X}}\) and C is a constant satisfying \(f({\varvec{y}}) \le C g({\varvec{y}})\) for all \({\varvec{y}}\). We consider the Dirichlet distribution as the proposal distribution; since \(\text {exp}(-\delta \Delta ({\textbf{x}})) \le 1\), we may take \(C = 1/E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]\), and the acceptance ratio simplifies to \(f({\varvec{y}})/(C g({\varvec{y}})) = \text {exp}(-\delta \Delta ({\varvec{y}}))\). Here are the steps of the acceptance-rejection algorithm.

1. Generate a set of random samples \({\varvec{y}}_1, {\varvec{y}}_2, \ldots , {\varvec{y}}_M\) from the Dirichlet distribution with the given \(\varvec{\alpha }\).

2. Compute \(E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{Y}}))]\) using all the generated random samples; by (1), this yields the envelope constant C and the normalizing constant \(C(\varvec{\alpha }, \delta )\).

3. Set \(i=1\).

4. Generate a random number u from Uniform(0,1). If \(u < \text {exp}(-\delta \Delta ({\textbf{y}}_i))\), accept \({\varvec{y}}_i\); otherwise reject \({\varvec{y}}_i\). Set \(i=i+1\) and repeat this step until \(i=M\).
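A minimal sketch of this sampler follows, assuming the squared-difference penalty \(\Delta _1\) of Sect. 2.4; with the Dirichlet(\(\varvec{\alpha }\)) proposal, the acceptance probability for each candidate reduces to \(\text {exp}(-\delta \Delta ({\varvec{y}}))\), as noted above. The function name rsd is illustrative.

```python
import numpy as np

def penalty(y):
    # Delta_1(y) = sum of squared successive differences
    return np.sum(np.diff(y) ** 2)

def rsd(n, alpha, delta, seed=0):
    """Draw n vectors from SDD(alpha, delta, Delta_1) by acceptance-rejection."""
    rng = np.random.default_rng(seed)
    out = []
    while len(out) < n:
        y = rng.dirichlet(alpha)               # candidate from the Dirichlet proposal
        u = rng.uniform()
        if u < np.exp(-delta * penalty(y)):    # accept with probability exp(-delta Delta(y))
            out.append(y)
    return np.array(out)

sample = rsd(1000, alpha=[5, 5, 5], delta=10.0)
print(sample.mean(axis=0))
```

Note that the average acceptance rate equals \(E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]\), so the sampler slows down as \(\delta\) grows.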

2.4 Role of the Penalty Term (\(\text {exp}(-\delta \Delta ({\textbf{x}}))\))

As mentioned before, the penalty function forces successive \(x_{j}\)’s to be close to each other with higher probability than under the standard Dirichlet distribution satisfying \(\sum _{j=1}^K x_{j} = 1\). The roles of \(\delta\) and \(\Delta\) are very important when constructing the smoothed Dirichlet distribution. The penalty parameter (\(\delta\)) dictates the extent to which cell probabilities of neighboring categories have to be similar. Some examples of \(\Delta\) functions that we consider are

$$\begin{aligned} \Delta _1 = \sum _{j=1}^{K-1} (x_{j+1} - x_j)^2, \Delta _2 = \sum _{j=1}^{K-1} (\text {log } x_{j+1} - \text {log } x_j )^2, \text { and }\Delta _3 = \sum _{j=2}^{K-1} (x_{j+1} - 2x_j + x_{j-1})^2. \end{aligned}$$

From now on, for our analysis, we use \(\Delta = \sum _{j=1}^{K-1} (x_{j+1} - x_{j})^2\). When \(x_1 = x_2 = \ldots = x_K = \frac{1}{K}\), then

$$\begin{aligned} \Delta = \left( \frac{1}{K} - \frac{1}{K}\right) ^2 +\left( \frac{1}{K} - \frac{1}{K}\right) ^2 + \ldots + \left( \frac{1}{K} - \frac{1}{K}\right) ^2 = 0. \end{aligned}$$

This is the minimum value of \(\Delta = \sum _{j=1}^{K-1} (x_{j+1} - x_{j})^2\). Also, the maximum value of \(\Delta\) is attained when \(x_j=1\) for some interior index j, \(2 \le j \le K-1\). The maximum value of \(\Delta\) then satisfies

$$\begin{aligned} \Delta&= \sum _{j=1}^{K-1} (x_{j+1} - x_{j})^2,\\&\le \sum _{j=1}^{K-1} x_{j+1}^2 + \sum _{j=1}^{K-1} x_{j}^2,\\&\le \sum _{j=1}^{K-1} x_{j+1} + \sum _{j=1}^{K-1} x_{j} \le 2. \end{aligned}$$
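The three penalty functions, and the bounds just derived, can be checked numerically. The sketch below (a quick check with illustrative names, not the authors' code) evaluates \(\Delta _1\), \(\Delta _2\), and \(\Delta _3\), and confirms empirically that \(\Delta _1\) stays in [0, 2] on the simplex.

```python
import numpy as np

def d1(x):  # squared first differences
    return np.sum(np.diff(x) ** 2)

def d2(x):  # squared first differences on the log scale
    return np.sum(np.diff(np.log(x)) ** 2)

def d3(x):  # squared second differences
    return np.sum(np.diff(x, n=2) ** 2)

x_flat = np.full(4, 0.25)                  # uniform vector: the minimum of Delta_1
x_spike = np.array([0.0, 1.0, 0.0, 0.0])   # interior spike: the maximum of Delta_1
print(d1(x_flat), d1(x_spike))             # 0.0 and 2.0

rng = np.random.default_rng(0)
X = rng.dirichlet(np.ones(4), size=100_000)
vals = np.sum(np.diff(X, axis=1) ** 2, axis=1)
print(vals.min(), vals.max())              # empirically within [0, 2]
```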

Fig. 1 shows the effect of different \(\delta\) and \(\varvec{\alpha }\) values for the same penalty function \(\Delta = \sum _{j=1}^{K-1} (x_{j+1} - x_{j})^2\) after simulating data from the smoothed Dirichlet distribution with \(K=3\). The plots in each row of Fig. 1 share the same \(\varvec{\alpha }\), and the plots in each column share the same \(\delta\) value. We consider \((1, 1, 1)^{t}\), \((5, 5, 5)^{t}\), \((10, 10, 10)^{t}\), \((10, 5, 1)^{t}\), \((1, 5, 10)^{t}\) and \((1, 10, 1)^{t}\) as \(\varvec{\alpha }\) for the rows of Fig. 1. The color scale runs from yellow (lowest value) to red (highest value).

Fig. 1

The effect of \(\delta\) for different \(\varvec{\alpha }\); row 1 with \((1, 1, 1)^{t}\), row 2 with \((5, 5, 5)^{t}\), row 3 with \((10, 10, 10)^{t}\), row 4 with \((10, 5, 1)^{t}\), row 5 with \((1, 5, 10)^{t}\) and row 6 with \((1, 10, 1)^{t}\)

As \(\alpha _j\) increases, the distribution becomes more tightly concentrated around the center of the simplex for a given \(\delta\) value. Also, for a given \(\varvec{\alpha }\), the distribution becomes more tightly concentrated around the center of the simplex as \(\delta\) increases. If the \(\alpha _j\) values are different, we obtain an asymmetric (non-central) distribution with more mass toward the component with the largest \(\alpha _j\). Also, the highest penalty occurs at \((x_1 = 0, x_2 = 1, x_3 = 0)^{t}\) for this penalty function.

Fig. 2

The effect of \(\delta\) and the penalty function for same \(\varvec{\alpha } = (5, 5, 5)^{t}\); row 1 with \(\Delta _1\), row 2 with \(\Delta _2\), and row 3 with \(\Delta _3\)

Figure 2 shows the effect of different \(\delta\) values and penalty functions for \(\varvec{\alpha } = (5, 5, 5)^{t}\) after simulating data from the smoothed Dirichlet distribution with \(K=3\). The first row of Fig. 2 is for the penalty function \(\Delta _1= \sum _{j=1}^{K-1} (x_{j+1} - x_{j})^2\). The second and third rows of Fig. 2 are for the penalty functions \(\Delta _2 = \sum _{j=1}^{K-1} (\text {log } x_{j+1} - \text {log } x_j )^2\) and \(\Delta _3 = \sum _{j=2}^{K-1} (x_{j+1} - 2x_j + x_{j-1})^2\), respectively. It is clear that the distribution becomes more tightly concentrated around the center of the simplex for the penalty function \(\Delta _2 = \sum _{j=1}^{K-1} (\text {log } x_{j+1} - \text {log } x_j )^2\) than for the other penalty functions. Also, the highest penalty occurs at \((x_1 = 0, x_2 = 1, x_3 = 0)^{t}\) for all of these penalty functions.

2.5 The Upper and Lower Limits for \(E(X_i)\) and \(\text {Var} (X_i)\)

Next, we compute the upper and lower limits for \(E(X_i)\) and \(\text {Var} (X_i)\). These limits establish the possible range for \(E(X_i)\) and \(\text {Var} (X_i)\). First, we compute the upper and lower limits for \(E(X_i)\). We know that the value of \(\Delta ({\textbf{x}}) = \sum _{j=1}^{K-1} (x_{j+1} - x_{j})^2\) lies in [0, 2]. Then

$$\begin{aligned} 0 \le \Delta ({\textbf{x}}) \le 2\\ -2\delta \le -\delta \Delta ({\textbf{x}}) \le 0\\ \text {exp} (-2\delta ) \le \text {exp} (-\delta \Delta ({\textbf{x}}) ) \le 1. \end{aligned}$$

Now we can compute

$$\begin{aligned}&\int \ldots \int \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \prod _{j=1}^K x_{j}^{\alpha _j-1} \text {exp} (-2\delta ) dx_1 dx_2 \ldots dx_K \\&\qquad \le \int \ldots \int \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \prod _{j=1}^K x_{j}^{\alpha _j-1} \text {exp} (-\delta \Delta ({\textbf{x}}) ) dx_1 dx_2 \ldots dx_K\\&\quad \le \int \ldots \int \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \prod _{j=1}^K x_{j}^{\alpha _j-1} dx_1 dx_2 \ldots dx_K,\\&\quad \text {exp} (-2\delta ) \int \ldots \int \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \prod _{j=1}^K x_{j}^{\alpha _j-1} dx_1 dx_2 \ldots dx_K\\&\qquad \le \int \ldots \int \text {exp} (-\delta \Delta ({\textbf{x}}) ) \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \prod _{j=1}^K x_{j}^{\alpha _j-1} dx_1 dx_2 \ldots dx_K\\&\quad \le \int \ldots \int \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \prod _{j=1}^K x_{j}^{\alpha _j-1} dx_1 dx_2 \ldots dx_K. \end{aligned}$$

Then we obtain

$$\begin{aligned} \text {exp} (-2\delta ) \le E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))] \le 1. \end{aligned}$$
(9)

Similarly,

$$\begin{aligned}&\int \ldots \int x_i \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \prod _{j=1}^K x_{j}^{\alpha _j-1} \text {exp} (-2\delta ) dx_1 dx_2 \ldots dx_K \\&\qquad \le \int \ldots \int x_i \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \prod _{j=1}^K x_{j}^{\alpha _j-1} \text {exp} (-\delta \Delta ({\textbf{x}}) ) dx_1 dx_2 \ldots dx_K\\&\quad \le \int \ldots \int x_i \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \prod _{j=1}^K x_{j}^{\alpha _j-1} dx_1 dx_2 \ldots dx_K,\\&\quad \text {exp} (-2\delta ) \int \ldots \int x_i \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \prod _{j=1}^K x_{j}^{\alpha _j-1} dx_1 dx_2 \ldots dx_K \\&\quad \le \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \dfrac{\prod _{j=1}^K \Gamma (\alpha _{j} + \gamma _{ij})}{\Gamma (\sum _{j=1}^K \alpha _{j} + \gamma _{ij})}\int \ldots \int \text {exp} (-\delta \Delta ({\textbf{x}}) )\\&\qquad \dfrac{\Gamma (\sum _{j=1}^K \alpha _{j} + \gamma _{ij})}{\prod _{j=1}^K \Gamma (\alpha _{j} + \gamma _{ij})} \prod _{j=1}^K x_{j}^{\alpha _{j} + \gamma _{ij} -1} dx_1 dx_2 \ldots dx_K\\&\quad \le \int \ldots \int x_i \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \prod _{j=1}^K x_{j}^{\alpha _j-1} dx_1 dx_2 \ldots dx_K, \end{aligned}$$

Then

$$\begin{aligned} \text {exp} (-2\delta ) E_{\varvec{\alpha }} (X_i) \le \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \dfrac{\prod _{j \ne i} \Gamma (\alpha _{j}) \Gamma (\alpha _{i}+1)}{\Gamma (\sum _{j=1}^K \alpha _{j}+1)} E_{\varvec{\alpha }+\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))] \le E_{\varvec{\alpha }} (X_i), \end{aligned}$$
$$\begin{aligned} \text {exp} (-2\delta ) E_{\varvec{\alpha }} (X_i) \le \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j} E_{\varvec{\alpha }+\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))] \le E_{\varvec{\alpha }} (X_i). \end{aligned}$$
(10)

By using (9) and (10),

$$\begin{aligned} \text {exp} (-2\delta ) E_{\varvec{\alpha }} (X_i) \le \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j} \right) \dfrac{E_{\varvec{\alpha }+\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]} \le \dfrac{1}{\text {exp} (-2\delta )}E_{\varvec{\alpha }} (X_i). \end{aligned}$$
(11)

We know that \(E_{\varvec{\alpha }} (X_i) = \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j}\). Then,

$$\begin{aligned} \text {exp} (-2\delta ) \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j} \right) \le \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j} \right) \dfrac{E_{\varvec{\alpha }+\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]} \le \dfrac{1}{\text {exp} (-2\delta )}\left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j} \right) , \end{aligned}$$
$$\begin{aligned} \text {exp} (-2\delta ) \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j} \right) \le E (X_i) \le \text {exp} (2\delta )\left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j} \right) . \end{aligned}$$
(12)

The lower limit of \(E (X_i)\) is \(\text {exp} (-2\delta ) \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j} \right)\) and the upper limit of \(E (X_i)\) is \(\text {exp} (2\delta )\) \(\left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j} \right)\). Note that when \(\delta \rightarrow 0\), \(E (X_i) \rightarrow \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j}\), which is the expected value under the Dirichlet distribution. Next, we approximate these limits for small \(\delta\). We know that

$$\begin{aligned} \text {exp}(x) = \sum _{k=0}^{\infty } \dfrac{x^k}{k!}. \end{aligned}$$

This is the Maclaurin series of the exponential function. Then,

$$\begin{aligned} \text {exp}(-2\delta )&= 1 + \dfrac{(-2\delta )^1}{1!}+ \dfrac{(-2\delta )^2}{2!} + \ldots \\&= 1 - 2 \delta + 2 \delta ^2 - \dfrac{4}{3}\delta ^3+ \ldots \\&= 1 -2 \delta + O(\delta ^2). \end{aligned}$$

Similarly, \(\text {exp}(2 \delta ) = 1 + 2 \delta + O(\delta ^2)\). Then, to first order in \(\delta\),

$$\begin{aligned} (1 -2 \delta ) \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j} \right) \le E (X_i) \le (1 + 2 \delta ) \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j} \right) . \end{aligned}$$
(13)

Also,

$$\begin{aligned}&\int \ldots \int x_{i}^2 \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \prod _{j=1}^K x_{j}^{\alpha _j-1} \text {exp} (-2\delta ) dx_1 dx_2 \ldots dx_K \\&\quad \le \int \ldots \int x_{i}^2 \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \prod _{j=1}^K x_{j}^{\alpha _j-1} \text {exp} (-\delta \Delta ({\textbf{x}}) ) dx_1 dx_2 \ldots dx_K\\&\quad \le \int \ldots \int x_{i}^2 \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \prod _{j=1}^K x_{j}^{\alpha _j-1} dx_1 dx_2 \ldots dx_K,\\&\quad \text {exp} (-2\delta ) \int \ldots \int x_{i}^2 \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \prod _{j=1}^K x_{j}^{\alpha _j-1} dx_1 dx_2 \ldots dx_K \\&\quad \le \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \dfrac{\prod _{j=1}^K \Gamma (\alpha _{j} + 2\gamma _{ij})}{\Gamma (\sum _{j=1}^K \alpha _{j} + 2\gamma _{ij})}\int \ldots \int \text {exp} (-\delta \Delta ({\textbf{x}}) )\\&\quad \dfrac{\Gamma (\sum _{j=1}^K \alpha _{j} + 2\gamma _{ij})}{\prod _{j=1}^K \Gamma (\alpha _{j} + 2\gamma _{ij})} \prod _{j=1}^K x_{j}^{\alpha _{j} + 2\gamma _{ij} - 1} dx_1 dx_2 \ldots dx_K\\&\quad \le \int \ldots \int x_{i}^2 \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \prod _{j=1}^K x_{j}^{\alpha _j-1} dx_1 dx_2 \ldots dx_K, \end{aligned}$$
$$\begin{aligned} \text {exp} (-2\delta ) E_{\varvec{\alpha }} (X_{i}^2) \le \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \dfrac{\prod _{j \ne i} \Gamma (\alpha _{j}) \Gamma (\alpha _{i}+2)}{\Gamma (\sum _{j=1}^K \alpha _{j}+2)} E_{\varvec{\alpha }+2\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))] \le E_{\varvec{\alpha }} (X_{i}^2), \nonumber \\ \text {exp} (-2\delta ) E_{\varvec{\alpha }} (X_{i}^2) \le \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j}\right) \left( \dfrac{\alpha _i+1}{\sum _{j=1}^K \alpha _j+1}\right) E_{\varvec{\alpha }+2\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))] \le E_{\varvec{\alpha }} (X_{i}^2). \end{aligned}$$
(14)

By using (9) and (14),

$$\begin{aligned} \text {exp} (-2\delta )E_{\varvec{\alpha }} (X_{i}^2) \le \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j}\right) \left( \dfrac{\alpha _i+1}{\sum _{j=1}^K \alpha _j+1}\right) \dfrac{E_{\varvec{\alpha }+2\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]} \le \dfrac{1}{\text {exp} (-2\delta )}E_{\varvec{\alpha }} (X_{i}^2). \end{aligned}$$
(15)

Also,

$$\begin{aligned} E_{\varvec{\alpha }} (X_{i}^2)&= \int \ldots \int x_{i}^2 \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \prod _{j=1}^K x_{j}^{\alpha _j-1} dx_1 dx_2 \ldots dx_K,\nonumber \\&= \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j)} \dfrac{\prod _{j=1}^K \Gamma (\alpha _{j} + 2\gamma _{ij})}{\Gamma (\sum _{j=1}^K (\alpha _{j} + 2\gamma _{ij}))} \int \ldots \int \nonumber \\&\quad \dfrac{\Gamma (\sum _{j=1}^K (\alpha _{j} + 2\gamma _{ij}))}{\prod _{j=1}^K \Gamma (\alpha _{j} + 2\gamma _{ij})} \prod _{j=1}^K x_{j}^{\alpha _{j} + 2\gamma _{ij} -1} dx_1 dx_2 \ldots dx_K ,\nonumber \\&= \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j}\right) \left( \dfrac{\alpha _i+1}{\sum _{j=1}^K \alpha _j+1}\right) . \end{aligned}$$
(16)

Substituting (16) into (15),

$$\begin{aligned}&\text {exp} (-2\delta )\left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j}\right) \left( \dfrac{\alpha _i+1}{\sum _{j=1}^K \alpha _j+1}\right) \le \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j}\right) \left( \dfrac{\alpha _i+1}{\sum _{j=1}^K \alpha _j+1}\right) \dfrac{E_{\varvec{\alpha }+2\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}\nonumber \\&\quad \le \text {exp} (2\delta )\left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j}\right) \left( \dfrac{\alpha _i+1}{\sum _{j=1}^K \alpha _j+1}\right) ,\nonumber \\&\quad (1 -2 \delta ) \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j}\right) \left( \dfrac{\alpha _i+1}{\sum _{j=1}^K \alpha _j+1}\right) \le \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j}\right) \left( \dfrac{\alpha _i+1}{\sum _{j=1}^K \alpha _j+1}\right) \dfrac{E_{\varvec{\alpha }+2\varvec{\gamma }_{i}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}\nonumber \\&\quad \le (1 +2 \delta ) \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j}\right) \left( \dfrac{\alpha _i+1}{\sum _{j=1}^K \alpha _j+1}\right) . \end{aligned}$$
(17)

We know that, \(\text {Var} (X_i) = E(X_{i}^2) - (E(X_i))^2\). Then we compute the lower limit of \(\text {Var} (X_i)\) \((\text {Var}_{LL} (X_i))\) and the upper limit of \(\text {Var} (X_i) (\text {Var}_{UL} (X_i))\) separately.

$$\begin{aligned} \text {Var}_{LL} (X_i)&= (1 -2 \delta ) \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j}\right) \left( \dfrac{\alpha _i+1}{\sum _{j=1}^K \alpha _j+1}\right) - \left[ (1 +2 \delta ) \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j} \right) \right] ^2,\\&= \left( \dfrac{ \alpha _i}{\left( \sum _{j=1}^K \alpha _j \right) ^2 \left( \sum _{j=1}^K \alpha _j+1\right) } \right) \left[ (1 -2 \delta ) (\alpha _i+1) \left( \sum _{j=1}^K \alpha _j \right) \right. \\&\quad \left. - \alpha _i (1+2\delta )^2 \left( \sum _{j=1}^K \alpha _j+1\right) \right] ,\\&= \left( \dfrac{ \alpha _i}{\left( \sum _{j=1}^K \alpha _j \right) ^2 \left( \sum _{j=1}^K \alpha _j+1\right) } \right) \\&\qquad \times \left[ \left( \sum _{j=1}^K \alpha _j - \alpha _i \right) -2 \delta \left( 2 \delta \alpha _i + \sum _{j=1}^K \alpha _j +3 \alpha _i \sum _{j=1}^K \alpha _j + 2\delta \alpha _i \sum _{j=1}^K \alpha _j + 2 \alpha _i\right) \right] ,\\&= \left( \dfrac{ \alpha _i \left( \sum _{j=1}^K \alpha _j - \alpha _i \right) }{\left( \sum _{j=1}^K \alpha _j \right) ^2 \left( \sum _{j=1}^K \alpha _j+1\right) } \right) \\&\quad - \left( \dfrac{2 \delta \alpha _i}{\left( \sum _{j=1}^K \alpha _j \right) ^2 \left( \sum _{j=1}^K \alpha _j+1\right) } \right) \left( 2 \delta \alpha _i \left( \sum _{j=1}^K \alpha _j +1\right) \right. \\&\quad \left. + (1+\alpha _i) \sum _{j=1}^K \alpha _j + 2 \alpha _i \left( \sum _{j=1}^K \alpha _j +1 \right) \right) , \end{aligned}$$

and

$$\begin{aligned} \text {Var}_{UL} (X_i)&= (1 +2 \delta ) \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j}\right) \left( \dfrac{\alpha _i+1}{\sum _{j=1}^K \alpha _j+1}\right) - \left[ (1 -2 \delta ) \left( \dfrac{\alpha _i}{\sum _{j=1}^K \alpha _j} \right) \right] ^2,\\&= \left( \dfrac{ \alpha _i}{\left( \sum _{j=1}^K \alpha _j \right) ^2 \left( \sum _{j=1}^K \alpha _j+1\right) } \right) \left[ (1 +2 \delta ) (\alpha _i+1) \left( \sum _{j=1}^K \alpha _j \right) \right. \\&\quad \left. - \alpha _i (1-2\delta )^2 \left( \sum _{j=1}^K \alpha _j+1\right) \right] ,\\&= \left( \dfrac{ \alpha _i}{\left( \sum _{j=1}^K \alpha _j \right) ^2 \left( \sum _{j=1}^K \alpha _j+1\right) } \right) \\&\qquad \times \left[ \left( \sum _{j=1}^K \alpha _j - \alpha _i \right) -2 \delta \left( 2 \delta \alpha _i - \sum _{j=1}^K \alpha _j -3 \alpha _i \sum _{j=1}^K \alpha _j + 2\delta \alpha _i \sum _{j=1}^K \alpha _j - 2 \alpha _i\right) \right] ,\\&= \left( \dfrac{ \alpha _i \left( \sum _{j=1}^K \alpha _j - \alpha _i \right) }{\left( \sum _{j=1}^K \alpha _j \right) ^2 \left( \sum _{j=1}^K \alpha _j+1\right) } \right) \\&\quad - \left( \dfrac{2 \delta \alpha _i}{\left( \sum _{j=1}^K \alpha _j \right) ^2 \left( \sum _{j=1}^K \alpha _j+1\right) } \right) \left( 2 \delta \alpha _i \left( \sum _{j=1}^K \alpha _j +1\right) \right. \\&\quad \left. - (1+\alpha _i) \sum _{j=1}^K \alpha _j -2 \alpha _i \left( \sum _{j=1}^K \alpha _j +1 \right) \right) . \end{aligned}$$

Note that when \(\delta \rightarrow 0\), both \(\text {Var}_{LL} (X_i)\) and \(\text {Var}_{UL} (X_i) \rightarrow \dfrac{\alpha _{i}\left( \sum _{j=1}^K \alpha _{j} - \alpha _{i}\right) }{(\sum _{j=1}^K \alpha _{j})^2 (\sum _{j=1}^K \alpha _{j} +1)}\), which is the variance of the Dirichlet distribution. A numerical check of these limits is sketched below; we then discuss the marginal distribution of the smoothed Dirichlet distribution.
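As a sanity check, the limits in (12) can be compared with a Monte Carlo estimate of \(E(X_i)\) from the acceptance-rejection sampler of Sect. 2.3; an analogous check applies to \(\text {Var}_{LL} (X_i)\) and \(\text {Var}_{UL} (X_i)\). This is a sketch assuming the penalty \(\Delta _1\), with illustrative names.

```python
import numpy as np

def penalty(y):
    return np.sum(np.diff(y) ** 2)            # Delta_1

def rsd(n, alpha, delta, seed=0):
    # acceptance-rejection sampler of Sect. 2.3
    rng = np.random.default_rng(seed)
    out = []
    while len(out) < n:
        y = rng.dirichlet(alpha)
        if rng.uniform() < np.exp(-delta * penalty(y)):
            out.append(y)
    return np.array(out)

alpha = np.array([5.0, 5.0, 5.0])
delta, i = 0.5, 0
m_dir = alpha[i] / alpha.sum()                # Dirichlet mean alpha_i / sum(alpha)
m_hat = rsd(100_000, alpha, delta)[:, i].mean()
# lower limit <= Monte Carlo mean <= upper limit, per (12)
print(np.exp(-2 * delta) * m_dir, m_hat, np.exp(2 * delta) * m_dir)
```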

2.6 Marginal Distributions

We know that the marginal distributions of the standard Dirichlet distribution are beta distributions. The smoothed Dirichlet (SD) probability density function is

$$\begin{aligned} f({\textbf{x}}|\varvec{\alpha }, \delta ) = \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\prod _{j=1}^K \Gamma (\alpha _j) E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{x}}))]} \prod _{j=1}^K x_{j}^{\alpha _j -1} \text {exp}(-\delta \Delta ({\textbf{x}})). \end{aligned}$$

Let \(A=\sum _{j=1}^K \alpha _j\). Then

$$\begin{aligned} f(x_1|\varvec{\alpha }, \delta )&= \dfrac{\Gamma (A)}{\Gamma (\alpha _1) \Gamma (A - \alpha _1) E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{x}}))]} x_{1}^{\alpha _1 -1} (1 - x_{1})^{A - \alpha _1 -1}\text {exp}(-\delta \Delta ({\textbf{x}})),\nonumber \\&= \dfrac{\Gamma (A)}{\Gamma (\alpha _1) \Gamma (A - \alpha _1)} x_{1}^{\alpha _1 -1} (1 - x_{1})^{A - \alpha _1 -1} \dfrac{\text {exp}(-\delta \Delta ({\textbf{x}}))}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{x}}))]}. \end{aligned}$$
(18)

The marginal probability density function of \(X_1\) is the product of the Beta\((\alpha _1, A - \alpha _1)\) density and the factor \(\dfrac{\text {exp}(-\delta \Delta ({\textbf{x}}))}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{x}}))]}\). We next obtain the marginal distribution of \(X_2\). The joint distribution of \(x_1\) and \(x_2\) is

$$\begin{aligned} f(x_1, x_2|\varvec{\alpha }, \delta )&= \dfrac{\Gamma (A)}{\Gamma (\alpha _1) \Gamma (\alpha _2) \Gamma (A - \alpha _1 - \alpha _2) E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{x}}))]}\\&\quad x_{1}^{\alpha _1 -1} x_{2}^{\alpha _2 -1} (1 - x_{1} - x_{2})^{A - \alpha _1 - \alpha _2 -1}\text {exp}(-\delta \Delta ({\textbf{x}})). \end{aligned}$$

Then

$$\begin{aligned} f(x_2|\varvec{\alpha }, \delta )&= \int _{0}^{1-x_2} \dfrac{\Gamma (A)}{\Gamma (\alpha _1) \Gamma (\alpha _2) \Gamma (A - \alpha _1 - \alpha _2) E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{x}}))]} \\&\quad x_{1}^{\alpha _1 -1} x_{2}^{\alpha _2 -1} (1 - x_{1} - x_{2})^{A - \alpha _1 - \alpha _2 -1}\text {exp}(-\delta \Delta ({\textbf{x}})) dx_1,\\&= \dfrac{\Gamma (A)}{\Gamma (\alpha _1) \Gamma (\alpha _2) \Gamma (A - \alpha _1 - \alpha _2)} \dfrac{x_{2}^{\alpha _2 -1}}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{x}}))]} \\&\quad \int _{0}^{1-x_2} x_{1}^{\alpha _1 -1} (1 - x_{1} - x_{2})^{A - \alpha _1 - \alpha _2 -1}\text {exp}(-\delta \Delta ({\textbf{x}})) dx_1. \end{aligned}$$

Substitute \(x_1 = (1 - x_2)u\), then

$$\begin{aligned} f(x_2|\varvec{\alpha }, \delta )&= \dfrac{\Gamma (A)}{\Gamma (\alpha _1) \Gamma (\alpha _2) \Gamma (A - \alpha _1 - \alpha _2)} \dfrac{x_{2}^{\alpha _2 -1} (1 - x_{2})^{A - \alpha _2 -1}}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta (x_2, u))]} \\&\quad \int _{0}^{1} u^{\alpha _1 -1} (1 - u)^{A - \alpha _1 - \alpha _2 -1}\text {exp}(-\delta \Delta (x_2,u)) du. \end{aligned}$$

3 Estimation of Parameters and Bayesian Inference

We now outline the estimation of the parameters of the smoothed Dirichlet distribution. We first derive estimators for \(\alpha _j, j=1, 2, \ldots , K\) and \(\delta\) using the method of moments (MOM).

3.1 Method of Moments (MOM)

Suppose we have a random sample with n random vectors \({\varvec{X}}_1, {\varvec{X}}_2, \ldots , {\varvec{X}}_n\) such that \({\varvec{X}}_{i} = (X_{i1}, X_{i2}, \ldots , X_{iK})^{t}, X_{ij} \ge 0\) and \(\sum _{j=1}^K X_{ij} = 1\). We know that the first and second population moments are

$$\begin{aligned} E(X_{j})&= \dfrac{\alpha _{j}}{\sum _{l=1}^K \alpha _{l}} \times \dfrac{E_{\varvec{\alpha +\gamma _{j}}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}, \end{aligned}$$

and

$$\begin{aligned} E(X_{j}^2)&=\dfrac{\alpha _{j}}{\sum _{l=1}^K \alpha _{l}} \times \dfrac{\alpha _{j}+1}{\sum _{l=1}^K \alpha _{l}+1} \times \dfrac{E_{\varvec{\alpha }+2\varvec{\gamma }_{j}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}. \end{aligned}$$

We define the first and second sample moments as

$$\begin{aligned} m_{1j} = \dfrac{1}{n} \sum _{i=1}^n x_{ij}, \end{aligned}$$

and

$$\begin{aligned} m_{2j} = \dfrac{1}{n} \sum _{i=1}^n x_{ij}^2 \end{aligned}$$

for \(j=1, 2, \ldots , K\). We have \(K-1\) first-order moment equations and \(K-1\) second-order moment equations with which to solve for the K unknown \(\alpha _j\)'s and \(\delta\). Then

$$\begin{aligned} \dfrac{\alpha _{j}}{\sum _{l=1}^K \alpha _{l}} \times \dfrac{E_{\varvec{\alpha +\gamma _{j}}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]} = \dfrac{1}{n} \sum _{i=1}^n x_{ij}, \end{aligned}$$
(19)

and

$$\begin{aligned} \dfrac{\alpha _{j}}{\sum _{l=1}^K \alpha _{l}} \times \dfrac{\alpha _{j}+1}{\sum _{l=1}^K \alpha _{l}+1} \times \dfrac{E_{\varvec{\alpha }+2\varvec{\gamma }_{j}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}&= \dfrac{1}{n} \sum _{i=1}^n x_{ij}^2\nonumber \\ \dfrac{1}{n} \sum _{i=1}^n x_{ij} \times \dfrac{\alpha _{j}+1}{\sum _{l=1}^K \alpha _{l}+1} \times \dfrac{E_{\varvec{\alpha }+2\varvec{\gamma }_{j}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }+\varvec{\gamma }_{j}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}&= \dfrac{1}{n} \sum _{i=1}^n x_{ij}^2\nonumber \\ \dfrac{\alpha _{j}+1}{\sum _{l=1}^K \alpha _{l}+1} \times \dfrac{E_{\varvec{\alpha }+\varvec{\gamma }_{j}}[X_j\text {exp}(-\delta \Delta ({\textbf{X}}))]}{E_{\varvec{\alpha }+\varvec{\gamma }_{j}}[\text {exp}(-\delta \Delta ({\textbf{X}}))]}&= \dfrac{\sum _{i=1}^n x_{ij}^2}{\sum _{i=1}^n x_{ij}} \end{aligned}$$
(20)

for \(j=1, 2, \ldots , K\). There is no closed-form solution for the \(\alpha _j\)'s and \(\delta\) when solving (19) and (20) simultaneously, so we solve them numerically to obtain the corresponding method of moments estimators, as sketched below.
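One way to solve (19) and (20) numerically is sketched below: the expectations are replaced by Monte Carlo estimates under a common random seed, positivity of the \(\alpha _j\)'s and \(\delta\) is enforced with a log transform, and the squared discrepancies between population and sample moments are minimized. This is an illustrative sketch assuming the penalty \(\Delta _1\), not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

def penalty(X):
    return np.sum(np.diff(X, axis=1) ** 2, axis=1)       # Delta_1, row-wise

def E_exp(beta, delta, M=50_000, seed=0):
    rng = np.random.default_rng(seed)                    # fixed seed: common random numbers
    X = rng.dirichlet(beta, size=M)
    return np.mean(np.exp(-delta * penalty(X)))

def mom_loss(theta, m1, m2):
    K = m1.size
    alpha, delta = np.exp(theta[:K]), np.exp(theta[K])   # log transform keeps parameters positive
    A = alpha.sum()
    E0 = E_exp(alpha, delta)
    loss = 0.0
    for j in range(K):
        g = np.zeros(K)
        g[j] = 1.0
        E1 = E_exp(alpha + g, delta)
        E2 = E_exp(alpha + 2 * g, delta)
        mu1 = alpha[j] / A * E1 / E0                     # population moment, equation (19)
        mu2 = mu1 * (alpha[j] + 1) / (A + 1) * E2 / E1   # population moment, equation (20)
        loss += (mu1 - m1[j]) ** 2 + (mu2 - m2[j]) ** 2
    return loss

def mom_fit(data):
    """data: an (n, K) array of compositions; returns (alpha_hat, delta_hat)."""
    m1, m2 = data.mean(axis=0), (data ** 2).mean(axis=0)  # sample moments m_1j, m_2j
    K = data.shape[1]
    res = minimize(mom_loss, np.zeros(K + 1), args=(m1, m2), method="Nelder-Mead")
    return np.exp(res.x[:K]), np.exp(res.x[K])
```

For example, mom_fit can be applied to output of the sampler in Sect. 2.3 to check parameter recovery.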

3.2 Bayesian Inference

In the Bayesian paradigm, if the posterior distribution is in the same probability distribution family as the prior distribution, then the prior is called a conjugate prior. Like the Dirichlet distribution, the smoothed Dirichlet distribution is a conjugate prior for the multinomial cell-probability vector. The distribution of the cell counts \(({\textbf{X}})\) is given by

$$\begin{aligned} {\textbf{X}} = (X_{1}, X_{2}, \ldots , X_{K})^{t} \sim \text {Multinomial}(n,\text {prob}= {\textbf{p}}), \end{aligned}$$

where \({\textbf{p}} = (p_{1}, p_{2}, \ldots , p_{K})^{t}\) denotes the vector of cell probabilities for the multinomial population. We suggest using the smoothed Dirichlet distribution as the prior distribution for \({\textbf{p}}\),

$$\begin{aligned} {\textbf{p}} = (p_{1}, p_{2}, \ldots , p_{K})^{t} \sim \text {SD} (\varvec{\alpha }, \delta , \Delta ). \end{aligned}$$

Then, the posterior distribution of \({\textbf{p}}\), given the observed counts \({\textbf{x}}\), is

$$\begin{aligned} {\textbf{p}} | {\textbf{x}}; \varvec{\alpha } \sim \text {SD}(\varvec{\alpha } + {\varvec{x}}, \delta , \Delta ). \end{aligned}$$

We now introduce the smoothed Dirichlet-multinomial distribution, a compound of a multinomial distribution and a smoothed Dirichlet distribution. The marginal distribution of \({\varvec{x}}\) is given by

$$\begin{aligned} f(\varvec{x|\alpha })&= \int f(\varvec{x|p}) f(\varvec{p|\alpha }) d{\varvec{p}}\\&= \int {n \atopwithdelims (){\varvec{x}}} \prod _{j=1}^K p_{j}^{x_j} \times C(\varvec{\alpha }, \delta ) \prod _{j=1}^K p_{j}^{\alpha _j -1} \text {exp}(-\delta \Delta ({\textbf{p}})) d{\varvec{p}}\\&= {n \atopwithdelims (){\varvec{x}}} \dfrac{C(\varvec{\alpha }, \delta )}{C(\varvec{\alpha +x}, \delta )} \int C(\varvec{\alpha +x}, \delta ) \prod _{j=1}^K p_{j}^{x_j+\alpha _j -1} \text {exp}(-\delta \Delta ({\textbf{p}})) d{\varvec{p}}\\&= {n \atopwithdelims (){\varvec{x}}} \dfrac{C(\varvec{\alpha }, \delta )}{C(\varvec{\alpha +x}, \delta )}\\&= {n \atopwithdelims (){\varvec{x}}} \dfrac{\Gamma (\sum _{j=1}^K \alpha _j)}{\Gamma (n+\sum _{j=1}^K \alpha _j)} \prod _{j=1}^K \left( \dfrac{\Gamma (x_j+\alpha _j)}{\Gamma (\alpha _j)} \right) \dfrac{E_{\varvec{\alpha +x}}[\text {exp}(-\delta \Delta ({\textbf{p}}))]}{E_{\varvec{\alpha }}[\text {exp}(-\delta \Delta ({\textbf{p}}))]}. \end{aligned}$$
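The conjugate update and the smoothed Dirichlet-multinomial marginal can both be computed with the same Monte Carlo device: each \(E_{\cdot }[\text {exp}(-\delta \Delta ({\textbf{p}}))]\) term is approximated with Dirichlet draws. The following is a sketch under the \(\Delta _1\) penalty, with illustrative names.

```python
import numpy as np
from scipy.special import gammaln

def penalty(X):
    return np.sum(np.diff(X, axis=1) ** 2, axis=1)     # Delta_1, row-wise

def log_E_exp(beta, delta, M=100_000, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.dirichlet(beta, size=M)
    w = -delta * penalty(X)
    return np.logaddexp.reduce(w) - np.log(M)          # log-mean-exp

def log_marginal(x, alpha, delta):
    """log f(x | alpha) for counts x under the smoothed Dirichlet-multinomial."""
    x, alpha = np.asarray(x, float), np.asarray(alpha, float)
    n = x.sum()
    log_pmf = (gammaln(n + 1) - np.sum(gammaln(x + 1))           # multinomial coefficient
               + gammaln(alpha.sum()) - gammaln(n + alpha.sum())
               + np.sum(gammaln(x + alpha) - gammaln(alpha)))
    return log_pmf + log_E_exp(alpha + x, delta) - log_E_exp(alpha, delta)

x = np.array([3, 5, 2])                    # observed cell counts
alpha = np.array([2.0, 2.0, 2.0])
post_alpha = alpha + x                     # conjugacy: p | x ~ SD(alpha + x, delta, Delta)
print(post_alpha, log_marginal(x, alpha, delta=5.0))
```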

4 Data Analysis

4.1 A Simulation Study

An in-depth simulation study would be useful to demonstrate and compare how the proposed and existing estimators perform over simulated datasets under different scenarios. Here, we performed a brief simulation study using 10,000 Monte Carlo simulations with known true cell probabilities. We report the mean squared error (MSE) and compare the estimators below. We varied the number of populations (\(N = 100\), 200, and 500) and the number of categories (\(K = 3\), 6, and 9) to explore the performance of the estimators. We generated data from multinomial distributions with sample sizes ranging from 15 to 75 and considered two scenarios for the true cell probabilities. We compare our proposed method with the Dirichlet process method proposed by [10] and the weighted likelihood method of [8]. Note that for the empirical Bayes method, we considered the standard Dirichlet distribution. In the proposed method, we used the penalty function \(\Delta = \sum _{j=1}^{K-1} (x_{j+1} - x_{j})^2\) for the smoothed Dirichlet distribution. Also, \(\delta\) changes with the scenario, N, and K, and ranged from 100 to 300.

4.1.1 Scenario 1

In this scenario, the true cell probabilities are strictly decreasing, but the differences between successive cell probabilities are the same. For example, when \(K=3\), the true cell probabilities are \({\textbf{p}} = \left( \dfrac{3}{6}, \dfrac{2}{6}, \dfrac{1}{6}\right) ^{t}\).

Table 1 MSE values for scenario 1

Table 1 provides the mean squared error values for each estimator under Scenario 1. When N is fixed and K increases, the MSE decreases. The MSE also decreases when K is fixed and N increases. The Bayesian shrinkage estimator based on a smoothed Dirichlet prior is the best estimator in terms of MSE.

4.1.2 Scenario 2

In this scenario, the true cell probabilities alternate between increasing and decreasing (a zig-zag pattern). For example, when \(K=3\), the true cell probabilities are \({\textbf{p}} = \left( \dfrac{1}{12}, \dfrac{10}{12}, \dfrac{1}{12}\right) ^{t}\).

Table 2 MSE values for scenario 2

Table 2 provides the mean squared error values for each estimator under Scenario 2. As in Scenario 1, when N is fixed and K increases, the MSE decreases; the MSE also decreases when K is fixed and N increases. The estimator based on the weighted likelihood approach is the best estimator in terms of MSE for this scenario. The Bayes estimator based on the smoothed Dirichlet prior does not perform well here, which is not surprising given that it is not designed for cases where successive cell probabilities differ sharply.

4.2 Real Data Analysis

Real-world situations often arise where outcomes in certain categories are not observed due to limited sample size and small, but non-zero, cell probabilities. In such cases, maximum likelihood estimation (MLE) may yield poor results by underestimating the actual cell probabilities. The proposed approach is valuable in such scenarios, particularly when dealing with ordinal categories that possess a natural ordering. It ensures that the borrowing of information among neighboring cell categories is conceptually meaningful, making it an effective methodology for analyzing data with unobserved outcomes in the presence of small probabilities.

Let us now consider the estimation of \({\textbf{p}}\) using a smoothed Dirichlet prior. For the data application, we consider 2018 Major League Baseball (MLB) batting data from the Baseball-Reference website (www.baseball-reference.com). We consider data for all regular season games taking place between March 29, 2018, and October 12, 2018. Our analysis includes \(m=556\) players with at least 25 plate appearances. [8] proposed an estimator based on the weighted likelihood approach to predict a good baseball batting metric for each batter, especially when a batter has only a few plate appearances. This estimator borrows information from other similar batters to make inferences about a target batter. Our proposed Bayesian estimator using the smoothed Dirichlet prior distribution borrows information across other batters but, more importantly, also across neighboring ordinal categories (batting outcomes) to improve the estimation of cell probabilities. [9] used this Bayesian estimator with a smoothed Dirichlet prior to estimate the distribution of positive COVID-19 cases across age groups for Canadian health regions.

In our analysis, we consider \(K=11\) possible batting outcomes: SO - strikeout, GO - ground out, AO - air out, SH - sacrifice hit, SF - sacrifice fly, HBP - hit by pitch, BB - base on balls/walk, S - single, D - double, T - triple, and HR - home run. The outcome of batting in baseball can be divided into discrete categories; this is the basis for constructing metrics that evaluate the batting performance of players, and it is also the basis of our analysis. Let \(x_{ij}\) be the number of plate appearances in which batting outcome j occurs for the \(i^{\text {th}}\) batter \((j=1, 2, \ldots , K)\), and denote the number of plate appearances for the \(i^{\text {th}}\) batter by \(n_i\). The joint distribution of the counts for these 11 discrete categories for batter i is given by

$$\begin{aligned} \mathbf {x_i} = (x_{i1}, x_{i2}, \ldots ,x_{i11})^{t} \sim \text {Multinomial}(n_i,\mathbf {p_i}), \end{aligned}$$

where \(\mathbf {p_i} = (p_{i1}, p_{i2}, \ldots ,p_{i11})^{t}\) represents the vector of outcome specific probabilities satisfying \(\displaystyle \sum _{j} p_{ij}=1\). Taking a Bayesian approach, assume that

$$\begin{aligned} \mathbf {p_i} = (p_{i1}, p_{i2}, \ldots , p_{i11})^{t} \sim \text {SD}(\varvec{\alpha } = (\alpha _1, \alpha _2, \ldots , \alpha _{11})^{t}, \delta , \Delta ). \end{aligned}$$

Then, the posterior distribution for the \(i{\text {th}}\) batter is given by

$$\begin{aligned} \mathbf {p_i} | \mathbf {x_i}; \varvec{\alpha } \sim \text {SD}(\varvec{\alpha } + \mathbf {x_i}, \delta , \Delta ). \end{aligned}$$

There are essentially two groups of outcomes: strikeout, ground out, and air out are outs/dismissals, while single, double, triple, and home run are hits. Information should therefore be borrowed across cell categories within each group only. For our analysis, we use \(\Delta = \sum _{j=1}^{K-1} (p_{i(j+1)} - p_{ij})^2\), modified slightly so that the borrowing of information across cell categories is done within each group only. Assuming the batting outcomes are arranged in the order given above (SO, GO, \(\ldots\), HR), the modified penalty function is

$$\begin{aligned} \Delta = (p_{i2}-p_{i1})^2+(p_{i3}-p_{i2})^2 + (p_{i9}-p_{i8})^2+(p_{i10}-p_{i9})^2+(p_{i11}-p_{i10})^2. \end{aligned}$$
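For reference, a sketch of this grouped penalty (assuming the 11 outcomes are ordered as above and using 0-based indexing; names are illustrative):

```python
import numpy as np

def grouped_penalty(p):
    """Borrow information only within the outs group (SO, GO, AO) and the hits group (S, D, T, HR)."""
    outs = np.sum(np.diff(p[0:3]) ** 2)    # (p_i2 - p_i1)^2 + (p_i3 - p_i2)^2
    hits = np.sum(np.diff(p[7:11]) ** 2)   # (p_i9 - p_i8)^2 + ... + (p_i11 - p_i10)^2
    return outs + hits

p = np.full(11, 1 / 11)
print(grouped_penalty(p))                  # 0.0 for a flat outcome profile
```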

Fig. 3 shows that the estimates of the cell probabilities for the top 10 batters using the smoothed Dirichlet prior with different \(\delta\) are very close to the MLEs. This behavior was to be expected, given that the top 10 batters have a large number of plate appearances: in their case, as \(\delta\) increases, we see only small fluctuations from the MLEs. On the other hand, Fig. 4 shows that the estimates of the cell probabilities for the last 10 batters are very close to the overall proportion \({\bar{p}}_j = \frac{\sum _{i=1}^{556} x_{ij}}{\sum _{i=1}^{556} \sum _{j=1}^{11} x_{ij}}\). For these last ten batters, who have only a small number of plate appearances, the estimates fluctuate substantially from the MLEs as \(\delta\) increases.

Fig. 3

Comparison of MLE, the posterior expectation using the SD prior with different \(\delta\), and the overall proportion of outcomes for the top 10 batters

Fig. 4

Comparison of MLE, the posterior expectation using the SD prior with different \(\delta\), and the overall proportion of outcomes for the last 10 batters with fewer plate appearances

5 Conclusions

The proposed smoothed Dirichlet distribution constitutes a superior alternative for borrowing information among neighboring cells. The proposed smoothed Dirichlet distribution forces the probabilities of neighboring cells to be closer to each other than under the standard Dirichlet distribution. Inference results were in close agreement with the behavior we expected for the data application. This data application also shows that the smoothed Dirichlet distribution is flexible enough to accommodate different ways of borrowing information from the cells. Future research may be aimed at determining a suitable \(\delta\) for the smoothed Dirichlet distribution.